From 1d440b7f288532f74c05aa64c33675b7dbdf2f34 Mon Sep 17 00:00:00 2001 From: Danilo Burbano Date: Fri, 5 Jul 2024 10:58:06 -0500 Subject: [PATCH] [SPARKNLP-1015] Restructuring Readme and Documentation --- README.md | 1227 +++------------------------------- docs/_data/navigation.yml | 6 + docs/en/advanced_settings.md | 142 ++++ docs/en/features.md | 120 ++++ docs/en/install.md | 435 +++++++++++- docs/en/pipelines.md | 1035 +++------------------------- 6 files changed, 906 insertions(+), 2059 deletions(-) create mode 100644 docs/en/advanced_settings.md create mode 100644 docs/en/features.md diff --git a/README.md b/README.md index cb7c32736e8638..fe8f9fe9fcc625 100644 --- a/README.md +++ b/README.md @@ -29,148 +29,17 @@ It also offers tasks such as **Tokenization**, **Word Segmentation**, **Part-of- Take a look at our official Spark NLP page: [https://sparknlp.org/](https://sparknlp.org/) for user documentation and examples -## Community support - -- [Slack](https://join.slack.com/t/spark-nlp/shared_invite/zt-198dipu77-L3UWNe_AJ8xqDk0ivmih5Q) For live discussion with the Spark NLP community and the team -- [GitHub](https://github.com/JohnSnowLabs/spark-nlp) Bug reports, feature requests, and contributions -- [Discussions](https://github.com/JohnSnowLabs/spark-nlp/discussions) Engage with other community members, share ideas, - and show off how you use Spark NLP! -- [Medium](https://medium.com/spark-nlp) Spark NLP articles -- [YouTube](https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos) Spark NLP video tutorials - -## Table of contents - -- [Features](#features) -- [Requirements](#requirements) -- [Quick Start](#quick-start) -- [Apache Spark Support](#apache-spark-support) -- [Scala & Python Support](#scala-and-python-support) -- [Databricks Support](#databricks-support) -- [EMR Support](#emr-support) -- [Using Spark NLP](#usage) - - [Packages Cheatsheet](#packages-cheatsheet) - - [Spark Packages](#spark-packages) - - [Scala](#scala) - - [Maven](#maven) - - [SBT](#sbt) - - [Python](#python) - - [Pip/Conda](#pipconda) - - [Compiled JARs](#compiled-jars) - - [Apache Zeppelin](#apache-zeppelin) - - [Jupyter Notebook](#jupyter-notebook-python) - - [Google Colab Notebook](#google-colab-notebook) - - [Kaggle Kernel](#kaggle-kernel) - - [Databricks Cluster](#databricks-cluster) - - [EMR Cluster](#emr-cluster) - - [GCP Dataproc](#gcp-dataproc) - - [Spark NLP Configuration](#spark-nlp-configuration) -- [Pipelines & Models](#pipelines-and-models) - - [Pipelines](#pipelines) - - [Models](#models) -- [Offline](#offline) -- [Examples](#examples) -- [FAQ](#faq) -- [Citation](#citation) -- [Contributing](#contributing) - ## Features - -- Tokenization -- Trainable Word Segmentation -- Stop Words Removal -- Token Normalizer -- Document Normalizer -- Document & Text Splitter -- Stemmer -- Lemmatizer -- NGrams -- Regex Matching -- Text Matching -- Chunking -- Date Matcher -- Sentence Detector -- Deep Sentence Detector (Deep learning) -- Dependency parsing (Labeled/unlabeled) -- SpanBertCorefModel (Coreference Resolution) -- Part-of-speech tagging -- Sentiment Detection (ML models) -- Spell Checker (ML and DL models) -- Word Embeddings (GloVe and Word2Vec) -- Doc2Vec (based on Word2Vec) -- BERT Embeddings (TF Hub & HuggingFace models) -- DistilBERT Embeddings (HuggingFace models) -- CamemBERT Embeddings (HuggingFace models) -- RoBERTa Embeddings (HuggingFace models) -- DeBERTa Embeddings (HuggingFace v2 & v3 models) -- XLM-RoBERTa Embeddings (HuggingFace models) -- Longformer Embeddings (HuggingFace 
models) -- ALBERT Embeddings (TF Hub & HuggingFace models) -- XLNet Embeddings -- ELMO Embeddings (TF Hub models) -- Universal Sentence Encoder (TF Hub models) -- BERT Sentence Embeddings (TF Hub & HuggingFace models) -- RoBerta Sentence Embeddings (HuggingFace models) -- XLM-RoBerta Sentence Embeddings (HuggingFace models) -- INSTRUCTOR Embeddings (HuggingFace models) -- E5 Embeddings (HuggingFace models) -- MPNet Embeddings (HuggingFace models) -- UAE Embeddings (HuggingFace models) -- OpenAI Embeddings -- Sentence & Chunk Embeddings -- Unsupervised keywords extraction -- Language Detection & Identification (up to 375 languages) -- Multi-class & Multi-labe Sentiment analysis (Deep learning) -- Multi-class Text Classification (Deep learning) -- BERT for Token & Sequence Classification & Question Answering -- DistilBERT for Token & Sequence Classification & Question Answering -- CamemBERT for Token & Sequence Classification & Question Answering -- ALBERT for Token & Sequence Classification & Question Answering -- RoBERTa for Token & Sequence Classification & Question Answering -- DeBERTa for Token & Sequence Classification & Question Answering -- XLM-RoBERTa for Token & Sequence Classification & Question Answering -- Longformer for Token & Sequence Classification & Question Answering -- MPnet for Token & Sequence Classification & Question Answering -- XLNet for Token & Sequence Classification -- Zero-Shot NER Model -- Zero-Shot Text Classification by Transformers (ZSL) -- Neural Machine Translation (MarianMT) -- Many-to-Many multilingual translation model (Facebook M2M100) -- Table Question Answering (TAPAS) -- Text-To-Text Transfer Transformer (Google T5) -- Generative Pre-trained Transformer 2 (OpenAI GPT2) -- Seq2Seq for NLG, Translation, and Comprehension (Facebook BART) -- Chat and Conversational LLMs (Facebook Llama-2) -- Vision Transformer (Google ViT) -- Swin Image Classification (Microsoft Swin Transformer) -- ConvNext Image Classification (Facebook ConvNext) -- Vision Encoder Decoder for image-to-text like captioning -- Zero-Shot Image Classification by OpenAI's CLIP -- Automatic Speech Recognition (Wav2Vec2) -- Automatic Speech Recognition (HuBERT) -- Automatic Speech Recognition (OpenAI Whisper) -- Named entity recognition (Deep learning) -- Easy ONNX, OpenVINO, and TensorFlow integrations -- GPU Support -- Full integration with Spark ML functions -- +31000 pre-trained models in +200 languages! -- +6000 pre-trained pipelines in +200 languages! -- Multi-lingual NER models: Arabic, Bengali, Chinese, Danish, Dutch, English, Finnish, French, German, Hebrew, Italian, - Japanese, Korean, Norwegian, Persian, Polish, Portuguese, Russian, Spanish, Swedish, Urdu, and more. - -## Requirements - -To use Spark NLP you need the following requirements: - -- Java 8 and 11 -- Apache Spark 3.5.x, 3.4.x, 3.3.x, 3.2.x, 3.1.x, 3.0.x - -**GPU (optional):** - -Spark NLP 5.4.0 is built with ONNX 1.17.0 and TensorFlow 2.7.1 deep learning engines. 
The minimum following NVIDIA® software are only required for GPU support:
-
-- NVIDIA® GPU drivers version 450.80.02 or higher
-- CUDA® Toolkit 11.2
-- cuDNN SDK 8.1.0
+- [Text Preprocessing](https://sparknlp.org/docs/en/features#text-preprocessing)
+- [Parsing and Analysis](https://sparknlp.org/docs/en/features#parsing-and-analysis)
+- [Sentiment and Classification](https://sparknlp.org/docs/en/features#sentiment-and-classification)
+- [Embeddings](https://sparknlp.org/docs/en/features#embeddings)
+- [Classification and Question Answering](https://sparknlp.org/docs/en/features#classification-and-question-answering-models)
+- [Machine Translation and Generation](https://sparknlp.org/docs/en/features#machine-translation-and-generation)
+- [Image and Speech](https://sparknlp.org/docs/en/features#image-and-speech)
+- [Integration and Interoperability (ONNX, OpenVINO)](https://sparknlp.org/docs/en/features#integration-and-interoperability)
+- [Pre-trained Models (36,000+ in 200+ languages)](https://sparknlp.org/docs/en/features#pre-trained-models)
+- [Multi-lingual Support](https://sparknlp.org/docs/en/features#multi-lingual-support)

## Quick Start

@@ -225,7 +94,27 @@ Output: ['Mona Lisa', 'Leonardo', 'Louvre', 'Paris']

For more examples, you can visit our dedicated
[examples](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples) to showcase all Spark NLP use cases!

-## Apache Spark Support
+### Packages Cheatsheet
+
+This cheatsheet maps each Apache Spark / PySpark major version to the corresponding Spark NLP Maven package:
+
+| Apache Spark            | Spark NLP on CPU   | Spark NLP on GPU           | Spark NLP on AArch64 (linux)   | Spark NLP on Apple Silicon           |
+|-------------------------|--------------------|----------------------------|--------------------------------|--------------------------------------|
+| 3.0/3.1/3.2/3.3/3.4/3.5 | `spark-nlp`        | `spark-nlp-gpu`            | `spark-nlp-aarch64`            | `spark-nlp-silicon`                  |
+| Start Function          | `sparknlp.start()` | `sparknlp.start(gpu=True)` | `sparknlp.start(aarch64=True)` | `sparknlp.start(apple_silicon=True)` |
+
+NOTE: `M1/M2` and `AArch64` are under `experimental` support. We had to build most of the dependencies ourselves to make
+them compatible, and community access to these architectures is limited, so they may not work in some environments.
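+
+As a quick illustration, the start functions above map one-to-one to the packages in the cheatsheet (a minimal sketch; it assumes the `spark-nlp` PyPI package is already installed):
+
+```python
+import sparknlp
+
+# Starts a SparkSession with the CPU package (spark-nlp).
+# Pass gpu=True, aarch64=True, or apple_silicon=True instead when the
+# matching Maven package from the cheatsheet above is the one installed.
+spark = sparknlp.start()
+
+print(sparknlp.version())  # Spark NLP version
+print(spark.version)       # Apache Spark version
+```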
+ +## Pipelines and Models +For a quick example of using pipelines and models take a look at our official [documentation](https://sparknlp.org/docs/en/install#pipelines-and-models) + +#### Please check out our Models Hub for the full list of [pre-trained models](https://sparknlp.org/models) with examples, demo, benchmark, and more + +## Platform and Ecosystem Support + +### Apache Spark Support Spark NLP *5.4.0* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x @@ -236,15 +125,10 @@ Spark NLP *5.4.0* has been built on top of Apache Spark 3.4 while fully supports | 5.2.x | YES | YES | YES | YES | YES | YES | NO | NO | | 5.1.x | Partially | YES | YES | YES | YES | YES | NO | NO | | 5.0.x | YES | YES | YES | YES | YES | YES | NO | NO | -| 4.4.x | YES | YES | YES | YES | YES | YES | NO | NO | -| 4.3.x | NO | NO | YES | YES | YES | YES | NO | NO | -| 4.2.x | NO | NO | YES | YES | YES | YES | NO | NO | -| 4.1.x | NO | NO | YES | YES | YES | YES | NO | NO | -| 4.0.x | NO | NO | YES | YES | YES | YES | NO | NO | Find out more about `Spark NLP` versions from our [release notes](https://github.com/JohnSnowLabs/spark-nlp/releases). -## Scala and Python Support +### Scala and Python Support | Spark NLP | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 | Python 3.10| Scala 2.11 | Scala 2.12 | |-----------|------------|------------|------------|------------|------------|------------|------------| @@ -252,737 +136,87 @@ Find out more about `Spark NLP` versions from our [release notes](https://github | 5.2.x | NO | YES | YES | YES | YES | NO | YES | | 5.1.x | NO | YES | YES | YES | YES | NO | YES | | 5.0.x | NO | YES | YES | YES | YES | NO | YES | -| 4.4.x | NO | YES | YES | YES | YES | NO | YES | -| 4.3.x | YES | YES | YES | YES | YES | NO | YES | -| 4.2.x | YES | YES | YES | YES | YES | NO | YES | -| 4.1.x | YES | YES | YES | YES | NO | NO | YES | -| 4.0.x | YES | YES | YES | YES | NO | NO | YES | -## Databricks Support +Find out more about 4.x `SparkNLP` versions in our official [documentation](https://sparknlp.org/docs/en/install#apache-spark-support) + +### Databricks Support Spark NLP 5.4.0 has been tested and is compatible with the following runtimes: -**CPU:** - -- 9.1 -- 9.1 ML -- 10.1 -- 10.1 ML -- 10.2 -- 10.2 ML -- 10.3 -- 10.3 ML -- 10.4 -- 10.4 ML -- 10.5 -- 10.5 ML -- 11.0 -- 11.0 ML -- 11.1 -- 11.1 ML -- 11.2 -- 11.2 ML -- 11.3 -- 11.3 ML -- 12.0 -- 12.0 ML -- 12.1 -- 12.1 ML -- 12.2 -- 12.2 ML -- 13.0 -- 13.0 ML -- 13.1 -- 13.1 ML -- 13.2 -- 13.2 ML -- 13.3 -- 13.3 ML -- 14.0 -- 14.0 ML -- 14.1 -- 14.1 ML -- 14.2 -- 14.2 ML -- 14.3 -- 14.3 ML - -**GPU:** - -- 9.1 ML & GPU -- 10.1 ML & GPU -- 10.2 ML & GPU -- 10.3 ML & GPU -- 10.4 ML & GPU -- 10.5 ML & GPU -- 11.0 ML & GPU -- 11.1 ML & GPU -- 11.2 ML & GPU -- 11.3 ML & GPU -- 12.0 ML & GPU -- 12.1 ML & GPU -- 12.2 ML & GPU -- 13.0 ML & GPU -- 13.1 ML & GPU -- 13.2 ML & GPU -- 13.3 ML & GPU -- 14.0 ML & GPU -- 14.1 ML & GPU -- 14.2 ML & GPU -- 14.3 ML & GPU - -## EMR Support +| **CPU** | **GPU** | +|--------------------|--------------------| +| 14.0 / 14.0 ML | 14.0 ML & GPU | +| 14.1 / 14.1 ML | 14.1 ML & GPU | +| 14.2 / 14.2 ML | 14.2 ML & GPU | +| 14.3 / 14.3 ML | 14.3 ML & GPU | + +We are compatible with older runtimes. 
For a full list check databricks support in our official [documentation](https://sparknlp.org/docs/en/install#databricks-support) + +### EMR Support Spark NLP 5.4.0 has been tested and is compatible with the following EMR releases: -- emr-6.2.0 -- emr-6.3.0 -- emr-6.3.1 -- emr-6.4.0 -- emr-6.5.0 -- emr-6.6.0 -- emr-6.7.0 -- emr-6.8.0 -- emr-6.9.0 -- emr-6.10.0 -- emr-6.11.0 -- emr-6.12.0 -- emr-6.13.0 -- emr-6.14.0 -- emr-6.15.0 -- emr-7.0.0 +| **EMR Release** | +|--------------------| +| emr-6.13.0 | +| emr-6.14.0 | +| emr-6.15.0 | +| emr-7.0.0 | + +We are compatible with older EMR releases. For a full list check EMR support in our official [documentation](https://sparknlp.org/docs/en/install#emr-support) Full list of [Amazon EMR 6.x releases](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-6x.html) Full list of [Amazon EMR 7.x releases](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-7x.html) NOTE: The EMR 6.1.0 and 6.1.1 are not supported. -## Usage - -## Packages Cheatsheet - -This is a cheatsheet for corresponding Spark NLP Maven package to Apache Spark / PySpark major version: - -| Apache Spark | Spark NLP on CPU | Spark NLP on GPU | Spark NLP on AArch64 (linux) | Spark NLP on Apple Silicon | -|-------------------------|--------------------|----------------------------|--------------------------------|--------------------------------------| -| 3.0/3.1/3.2/3.3/3.4/3.5 | `spark-nlp` | `spark-nlp-gpu` | `spark-nlp-aarch64` | `spark-nlp-silicon` | -| Start Function | `sparknlp.start()` | `sparknlp.start(gpu=True)` | `sparknlp.start(aarch64=True)` | `sparknlp.start(apple_silicon=True)` | - -NOTE: `M1/M2` and `AArch64` are under `experimental` support. Access and support to these architectures are limited by the -community and we had to build most of the dependencies by ourselves to make them compatible. We support these two -architectures, however, they may not work in some environments. - -## Spark Packages +## Installation ### Command line (requires internet connection) +To install spark-nlp packages through command line follow [these instructions](https://sparknlp.org/docs/en/install#command-line) from our official documentation -Spark NLP supports all major releases of Apache Spark 3.0.x, Apache Spark 3.1.x, Apache Spark 3.2.x, Apache Spark 3.3.x, Apache Spark 3.4.x, and Apache Spark 3.5.x - -#### Apache Spark 3.x (3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x - Scala 2.12) - -```sh -# CPU - -spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 - -pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 - -spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 -``` - -The `spark-nlp` has been published to -the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp). - -```sh -# GPU - -spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0 - -pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0 - -spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0 - -``` - -The `spark-nlp-gpu` has been published to -the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-gpu). 
- -```sh -# AArch64 - -spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0 - -pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0 - -spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0 - -``` - -The `spark-nlp-aarch64` has been published to -the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-aarch64). - -```sh -# M1/M2 (Apple Silicon) - -spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0 - -pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0 - -spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0 - -``` - -The `spark-nlp-silicon` has been published to -the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-silicon). - -**NOTE**: In case you are using large pretrained models like UniversalSentenceEncoder, you need to have the following -set in your SparkSession: - -```sh -spark-shell \ - --driver-memory 16g \ - --conf spark.kryoserializer.buffer.max=2000M \ - --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 -``` - -## Scala +### Scala Spark NLP supports Scala 2.12.15 if you are using Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x versions. Our packages are -deployed to Maven central. To add any of our packages as a dependency in your application you can follow these -coordinates: - -### Maven - -**spark-nlp** on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x: - -```xml - - - com.johnsnowlabs.nlp - spark-nlp_2.12 - 5.4.0 - -``` - -**spark-nlp-gpu:** - -```xml - - - com.johnsnowlabs.nlp - spark-nlp-gpu_2.12 - 5.4.0 - -``` - -**spark-nlp-aarch64:** - -```xml - - - com.johnsnowlabs.nlp - spark-nlp-aarch64_2.12 - 5.4.0 - -``` - -**spark-nlp-silicon:** - -```xml - - - com.johnsnowlabs.nlp - spark-nlp-silicon_2.12 - 5.4.0 - -``` - -### SBT - -**spark-nlp** on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x: - -```sbtshell -// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp -libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp" % "5.4.0" -``` - -**spark-nlp-gpu:** - -```sbtshell -// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-gpu -libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu" % "5.4.0" -``` - -**spark-nlp-aarch64:** - -```sbtshell -// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-aarch64 -libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-aarch64" % "5.4.0" -``` - -**spark-nlp-silicon:** - -```sbtshell -// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-silicon -libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-silicon" % "5.4.0" -``` - -Maven -Central: [https://mvnrepository.com/artifact/com.johnsnowlabs.nlp](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp) +deployed to Maven central. To add any of our packages as a dependency in your application you can follow [these instructions](https://sparknlp.org/docs/en/install#scala-and-java) +from our official documentation. If you are interested, there is a simple SBT project for Spark NLP to guide you on how to use it in your projects [Spark NLP SBT Starter](https://github.com/maziyarpanahi/spark-nlp-starter) -## Python - -Spark NLP supports Python 3.6.x and above depending on your major PySpark version. - -### Python without explicit Pyspark installation - -### Pip/Conda - -If you installed pyspark through pip/conda, you can install `spark-nlp` through the same channel. 
- -Pip: - -```bash -pip install spark-nlp==5.4.0 -``` - -Conda: - -```bash -conda install -c johnsnowlabs spark-nlp -``` - -PyPI [spark-nlp package](https://pypi.org/project/spark-nlp/) / -Anaconda [spark-nlp package](https://anaconda.org/JohnSnowLabs/spark-nlp) - -Then you'll have to create a SparkSession either from Spark NLP: - -```python -import sparknlp - -spark = sparknlp.start() -``` - -or manually: - -```python -spark = SparkSession.builder - .appName("Spark NLP") - .master("local[*]") - .config("spark.driver.memory", "16G") - .config("spark.driver.maxResultSize", "0") - .config("spark.kryoserializer.buffer.max", "2000M") - .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0") - .getOrCreate() -``` - -If using local jars, you can use `spark.jars` instead for comma-delimited jar files. For cluster setups, of course, -you'll have to put the jars in a reachable location for all driver and executor nodes. - -**Quick example:** - -```python -import sparknlp -from sparknlp.pretrained import PretrainedPipeline - -# create or get Spark Session - -spark = sparknlp.start() - -sparknlp.version() -spark.version - -# download, load and annotate a text by pre-trained pipeline - -pipeline = PretrainedPipeline('recognize_entities_dl', 'en') -result = pipeline.annotate('The Mona Lisa is a 16th century oil painting created by Leonardo') -``` - -## Compiled JARs - -### Build from source - -#### spark-nlp - -- FAT-JAR for CPU on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x - -```bash -sbt assembly -``` - -- FAT-JAR for GPU on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x - -```bash -sbt -Dis_gpu=true assembly -``` - -- FAT-JAR for M! on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x - -```bash -sbt -Dis_silicon=true assembly -``` - -### Using the jar manually - -If for some reason you need to use the JAR, you can either download the Fat JARs provided here or download it -from [Maven Central](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp). - -To add JARs to spark programs use the `--jars` option: - -```sh -spark-shell --jars spark-nlp.jar -``` - -The preferred way to use the library when running spark programs is using the `--packages` option as specified in -the `spark-packages` section. - -## Apache Zeppelin - -Use either one of the following options - -- Add the following Maven Coordinates to the interpreter's library list - -```bash -com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 -``` - -- Add a path to pre-built jar from [here](#compiled-jars) in the interpreter's library list making sure the jar is - available to driver path - -### Python in Zeppelin - -Apart from the previous step, install the python module through pip - -```bash -pip install spark-nlp==5.4.0 -``` - -Or you can install `spark-nlp` from inside Zeppelin by using Conda: - -```bash -python.conda install -c johnsnowlabs spark-nlp -``` - -Configure Zeppelin properly, use cells with %spark.pyspark or any interpreter name you chose. - -Finally, in Zeppelin interpreter settings, make sure you set properly zeppelin.python to the python you want to use and -install the pip library with (e.g. `python3`). - -An alternative option would be to set `SPARK_SUBMIT_OPTIONS` (zeppelin-env.sh) and make sure `--packages` is there as -shown earlier since it includes both scala and python side installation. 
- -## Jupyter Notebook (Python) - -**Recommended:** - -The easiest way to get this done on Linux and macOS is to simply install `spark-nlp` and `pyspark` PyPI packages and -launch the Jupyter from the same Python environment: - -```sh -$ conda create -n sparknlp python=3.8 -y -$ conda activate sparknlp -# spark-nlp by default is based on pyspark 3.x -$ pip install spark-nlp==5.4.0 pyspark==3.3.1 jupyter -$ jupyter notebook -``` - -Then you can use `python3` kernel to run your code with creating SparkSession via `spark = sparknlp.start()`. - -**Optional:** - -If you are in different operating systems and require to make Jupyter Notebook run by using pyspark, you can follow -these steps: - -```bash -export SPARK_HOME=/path/to/your/spark/folder -export PYSPARK_PYTHON=python3 -export PYSPARK_DRIVER_PYTHON=jupyter -export PYSPARK_DRIVER_PYTHON_OPTS=notebook - -pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 -``` - -Alternatively, you can mix in using `--jars` option for pyspark + `pip install spark-nlp` - -If not using pyspark at all, you'll have to run the instructions -pointed [here](#python-without-explicit-pyspark-installation) - -## Google Colab Notebook - -Google Colab is perhaps the easiest way to get started with spark-nlp. It requires no installation or setup other than -having a Google account. - -Run the following code in Google Colab notebook and start using spark-nlp right away. - -```sh -# This is only to setup PySpark and Spark NLP on Colab -!wget https://setup.johnsnowlabs.com/colab.sh -O - | bash -``` - -This script comes with the two options to define `pyspark` and `spark-nlp` versions via options: - -```sh -# -p is for pyspark -# -s is for spark-nlp -# -g will enable upgrading libcudnn8 to 8.1.0 on Google Colab for GPU usage -# by default they are set to the latest -!wget https://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.4.0 -``` - -[Spark NLP quick start on Google Colab](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/quick_start_google_colab.ipynb) -is a live demo on Google Colab that performs named entity recognitions and sentiment analysis by using Spark NLP -pretrained pipelines. - -## Kaggle Kernel - -Run the following code in Kaggle Kernel and start using spark-nlp right away. - -```sh -# Let's setup Kaggle for Spark NLP and PySpark -!wget https://setup.johnsnowlabs.com/kaggle.sh -O - | bash -``` - -This script comes with the two options to define `pyspark` and `spark-nlp` versions via options: - -```sh -# -p is for pyspark -# -s is for spark-nlp -# -g will enable upgrading libcudnn8 to 8.1.0 on Kaggle for GPU usage -# by default they are set to the latest -!wget https://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.4.0 -``` - -[Spark NLP quick start on Kaggle Kernel](https://www.kaggle.com/mozzie/spark-nlp-named-entity-recognition) is a live -demo on Kaggle Kernel that performs named entity recognitions by using Spark NLP pretrained pipeline. - -## Databricks Cluster - -1. Create a cluster if you don't have one already - -2. On a new cluster or existing one you need to add the following to the `Advanced Options -> Spark` tab: - - ```bash - spark.kryoserializer.buffer.max 2000M - spark.serializer org.apache.spark.serializer.KryoSerializer - ``` - -3. In `Libraries` tab inside your cluster you need to follow these steps: - - 3.1. Install New -> PyPI -> `spark-nlp==5.4.0` -> Install - - 3.2. 
Install New -> Maven -> Coordinates -> `com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0` -> Install - -4. Now you can attach your notebook to the cluster and use Spark NLP! - -NOTE: Databricks' runtimes support different Apache Spark major releases. Please make sure you choose the correct Spark -NLP Maven package name (Maven Coordinate) for your runtime from -our [Packages Cheatsheet](https://github.com/JohnSnowLabs/spark-nlp#packages-cheatsheet) - -## EMR Cluster - -To launch EMR clusters with Apache Spark/PySpark and Spark NLP correctly you need to have bootstrap and software -configuration. - -A sample of your bootstrap script - -```.sh -#!/bin/bash -set -x -e - -echo -e 'export PYSPARK_PYTHON=/usr/bin/python3 -export HADOOP_CONF_DIR=/etc/hadoop/conf -export SPARK_JARS_DIR=/usr/lib/spark/jars -export SPARK_HOME=/usr/lib/spark' >> $HOME/.bashrc && source $HOME/.bashrc - -sudo python3 -m pip install awscli boto spark-nlp - -set +x -exit 0 - -``` - -A sample of your software configuration in JSON on S3 (must be public access): - -```.json -[{ - "Classification": "spark-env", - "Configurations": [{ - "Classification": "export", - "Properties": { - "PYSPARK_PYTHON": "/usr/bin/python3" - } - }] -}, -{ - "Classification": "spark-defaults", - "Properties": { - "spark.yarn.stagingDir": "hdfs:///tmp", - "spark.yarn.preserve.staging.files": "true", - "spark.kryoserializer.buffer.max": "2000M", - "spark.serializer": "org.apache.spark.serializer.KryoSerializer", - "spark.driver.maxResultSize": "0", - "spark.jars.packages": "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0" - } -}] -``` - -A sample of AWS CLI to launch EMR cluster: - -```.sh -aws emr create-cluster \ ---name "Spark NLP 5.4.0" \ ---release-label emr-6.2.0 \ ---applications Name=Hadoop Name=Spark Name=Hive \ ---instance-type m4.4xlarge \ ---instance-count 3 \ ---use-default-roles \ ---log-uri "s3:///" \ ---bootstrap-actions Path=s3:///emr-bootstrap.sh,Name=custome \ ---configurations "https:///sparknlp-config.json" \ ---ec2-attributes KeyName=,EmrManagedMasterSecurityGroup=,EmrManagedSlaveSecurityGroup= \ ---profile -``` - -## GCP Dataproc - -1. Create a cluster if you don't have one already as follows. - -At gcloud shell: - -```bash -gcloud services enable dataproc.googleapis.com \ - compute.googleapis.com \ - storage-component.googleapis.com \ - bigquery.googleapis.com \ - bigquerystorage.googleapis.com -``` - -```bash -REGION= -``` - -```bash -BUCKET_NAME= -gsutil mb -c standard -l ${REGION} gs://${BUCKET_NAME} -``` - -```bash -REGION= -ZONE= -CLUSTER_NAME= -BUCKET_NAME= -``` - -You can set image-version, master-machine-type, worker-machine-type, -master-boot-disk-size, worker-boot-disk-size, num-workers as your needs. -If you use the previous image-version from 2.0, you should also add ANACONDA to optional-components. -And, you should enable gateway. -Don't forget to set the maven coordinates for the jar in properties. 
- -```bash -gcloud dataproc clusters create ${CLUSTER_NAME} \ - --region=${REGION} \ - --zone=${ZONE} \ - --image-version=2.0 \ - --master-machine-type=n1-standard-4 \ - --worker-machine-type=n1-standard-2 \ - --master-boot-disk-size=128GB \ - --worker-boot-disk-size=128GB \ - --num-workers=2 \ - --bucket=${BUCKET_NAME} \ - --optional-components=JUPYTER \ - --enable-component-gateway \ - --metadata 'PIP_PACKAGES=spark-nlp spark-nlp-display google-cloud-bigquery google-cloud-storage' \ - --initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/python/pip-install.sh \ - --properties spark:spark.serializer=org.apache.spark.serializer.KryoSerializer,spark:spark.driver.maxResultSize=0,spark:spark.kryoserializer.buffer.max=2000M,spark:spark.jars.packages=com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 -``` - -2. On an existing one, you need to install spark-nlp and spark-nlp-display packages from PyPI. +### Python -3. Now, you can attach your notebook to the cluster and use the Spark NLP! +Spark NLP supports Python 3.7.x and above depending on your major PySpark version. +Check all available installations for Python in our official [documentation](https://sparknlp.org/docs/en/install#python) -## Spark NLP Configuration -You can change the following Spark NLP configurations via Spark Configuration: +### Compiled JARs +To compile the jars from source follow [these instructions](https://sparknlp.org/docs/en/compiled#jars) from our official documenation -| Property Name | Default | Meaning | -|---------------------------------------------------------|----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `spark.jsl.settings.pretrained.cache_folder` | `~/cache_pretrained` | The location to download and extract pretrained `Models` and `Pipelines`. By default, it will be in User's Home directory under `cache_pretrained` directory | -| `spark.jsl.settings.storage.cluster_tmp_dir` | `hadoop.tmp.dir` | The location to use on a cluster for temporarily files such as unpacking indexes for WordEmbeddings. By default, this locations is the location of `hadoop.tmp.dir` set via Hadoop configuration for Apache Spark. NOTE: `S3` is not supported and it must be local, HDFS, or DBFS | -| `spark.jsl.settings.annotator.log_folder` | `~/annotator_logs` | The location to save logs from annotators during training such as `NerDLApproach`, `ClassifierDLApproach`, `SentimentDLApproach`, `MultiClassifierDLApproach`, etc. 
By default, it will be in User's Home directory under `annotator_logs` directory | -| `spark.jsl.settings.aws.credentials.access_key_id` | `None` | Your AWS access key to use your S3 bucket to store log files of training models or access tensorflow graphs used in `NerDLApproach` | -| `spark.jsl.settings.aws.credentials.secret_access_key` | `None` | Your AWS secret access key to use your S3 bucket to store log files of training models or access tensorflow graphs used in `NerDLApproach` | -| `spark.jsl.settings.aws.credentials.session_token` | `None` | Your AWS MFA session token to use your S3 bucket to store log files of training models or access tensorflow graphs used in `NerDLApproach` | -| `spark.jsl.settings.aws.s3_bucket` | `None` | Your AWS S3 bucket to store log files of training models or access tensorflow graphs used in `NerDLApproach` | -| `spark.jsl.settings.aws.region` | `None` | Your AWS region to use your S3 bucket to store log files of training models or access tensorflow graphs used in `NerDLApproach` | -| `spark.jsl.settings.onnx.gpuDeviceId` | `0` | Constructs CUDA execution provider options for the specified non-negative device id. | -| `spark.jsl.settings.onnx.intraOpNumThreads` | `6` | Sets the size of the CPU thread pool used for executing a single graph, if executing on a CPU. | -| `spark.jsl.settings.onnx.optimizationLevel` | `ALL_OPT` | Sets the optimization level of this options object, overriding the old setting. | -| `spark.jsl.settings.onnx.executionMode` | `SEQUENTIAL` | Sets the execution mode of this options object, overriding the old setting. | +## Platform-Specific Instructions +For detailed instructions on how to use Spark NLP on supported platforms, please refer to our official documentation: -### How to set Spark NLP Configuration +| Platform | Supported Language(s) | +|-------------------------|-----------------------| +| [Apache Zeppelin](https://sparknlp.org/docs/en/install#apache-zeppelin) | Scala, Python | +| [Jupyter Notebook](https://sparknlp.org/docs/en/install#jupter-notebook) | Python | +| [Google Colab Notebook](https://sparknlp.org/docs/en/install#google-colab-notebook) | Python | +| [Kaggle Kernel](https://sparknlp.org/docs/en/install#kaggle-kernel) | Python | +| [Databricks Cluster](https://sparknlp.org/docs/en/install#databricks-cluster) | Scala, Python | +| [EMR Cluster](https://sparknlp.org/docs/en/install#emr-cluster) | Scala, Python | +| [GCP Dataproc Cluster](https://sparknlp.org/docs/en/install#gcp-dataproc) | Scala, Python | -**SparkSession:** - -You can use `.config()` during SparkSession creation to set Spark NLP configurations. 
- -```python -from pyspark.sql import SparkSession - -spark = SparkSession.builder - .master("local[*]") - .config("spark.driver.memory", "16G") - .config("spark.driver.maxResultSize", "0") - .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") - .config("spark.kryoserializer.buffer.max", "2000m") - .config("spark.jsl.settings.pretrained.cache_folder", "sample_data/pretrained") - .config("spark.jsl.settings.storage.cluster_tmp_dir", "sample_data/storage") - .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0") - .getOrCreate() -``` - -**spark-shell:** - -```sh -spark-shell \ - --driver-memory 16g \ - --conf spark.driver.maxResultSize=0 \ - --conf spark.serializer=org.apache.spark.serializer.KryoSerializer - --conf spark.kryoserializer.buffer.max=2000M \ - --conf spark.jsl.settings.pretrained.cache_folder="sample_data/pretrained" \ - --conf spark.jsl.settings.storage.cluster_tmp_dir="sample_data/storage" \ - --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 -``` -**pyspark:** +### Offline -```sh -pyspark \ - --driver-memory 16g \ - --conf spark.driver.maxResultSize=0 \ - --conf spark.serializer=org.apache.spark.serializer.KryoSerializer - --conf spark.kryoserializer.buffer.max=2000M \ - --conf spark.jsl.settings.pretrained.cache_folder="sample_data/pretrained" \ - --conf spark.jsl.settings.storage.cluster_tmp_dir="sample_data/storage" \ - --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 -``` - -**Databricks:** - -On a new cluster or existing one you need to add the following to the `Advanced Options -> Spark` tab: +Spark NLP library and all the pre-trained models/pipelines can be used entirely offline with no access to the Internet. +Please check [these instructions](https://sparknlp.org/docs/en/install#s3-integration) from our official documentation +to use Spark NLP offline -```bash -spark.kryoserializer.buffer.max 2000M -spark.serializer org.apache.spark.serializer.KryoSerializer -spark.jsl.settings.pretrained.cache_folder dbfs:/PATH_TO_CACHE -spark.jsl.settings.storage.cluster_tmp_dir dbfs:/PATH_TO_STORAGE -spark.jsl.settings.annotator.log_folder dbfs:/PATH_TO_LOGS -``` +## Advanced Settings -NOTE: If this is an existing cluster, after adding new configs or changing existing properties you need to restart it. +You can change Spark NLP configurations via Spark properties configuration. +Please check [these instructions](https://sparknlp.org/docs/en/install#sparknlp-properties) from our official documentation. ### S3 Integration @@ -991,302 +225,24 @@ In Spark NLP we can define S3 locations to: - Export log files of training models - Store tensorflow graphs used in `NerDLApproach` -**Logging:** - -To configure S3 path for logging while training models. We need to set up AWS credentials as well as an S3 path - -```bash -spark.conf.set("spark.jsl.settings.annotator.log_folder", "s3://my/s3/path/logs") -spark.conf.set("spark.jsl.settings.aws.credentials.access_key_id", "MY_KEY_ID") -spark.conf.set("spark.jsl.settings.aws.credentials.secret_access_key", "MY_SECRET_ACCESS_KEY") -spark.conf.set("spark.jsl.settings.aws.s3_bucket", "my.bucket") -spark.conf.set("spark.jsl.settings.aws.region", "my-region") -``` - -Now you can check the log on your S3 path defined in *spark.jsl.settings.annotator.log_folder* property. -Make sure to use the prefix *s3://*, otherwise it will use the default configuration. - -**Tensorflow Graphs:** - -To reference S3 location for downloading graphs. 
We need to set up AWS credentials - -```bash -spark.conf.set("spark.jsl.settings.aws.credentials.access_key_id", "MY_KEY_ID") -spark.conf.set("spark.jsl.settings.aws.credentials.secret_access_key", "MY_SECRET_ACCESS_KEY") -spark.conf.set("spark.jsl.settings.aws.region", "my-region") -``` - -**MFA Configuration:** - -In case your AWS account is configured with MFA. You will need first to get temporal credentials and add session token -to the configuration as shown in the examples below -For logging: - -```bash -spark.conf.set("spark.jsl.settings.aws.credentials.session_token", "MY_TOKEN") -``` - -An example of a bash script that gets temporal AWS credentials can be -found [here](https://github.com/JohnSnowLabs/spark-nlp/blob/master/scripts/aws_tmp_credentials.sh) -This script requires three arguments: - -```bash -./aws_tmp_credentials.sh iam_user duration serial_number -``` - -## Pipelines and Models - -### Pipelines - -**Quick example:** - -```scala -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val testData = spark.createDataFrame(Seq( - (1, "Google has announced the release of a beta version of the popular TensorFlow machine learning library"), - (2, "Donald John Trump (born June 14, 1946) is the 45th and current president of the United States") -)).toDF("id", "text") - -val pipeline = PretrainedPipeline("explain_document_dl", lang = "en") - -val annotation = pipeline.transform(testData) - -annotation.show() -/* -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP -2.5.0 -testData: org.apache.spark.sql.DataFrame = [id: int, text: string] -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(explain_document_dl,en,public/models) -annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 
10 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| token| sentence| checked| lemma| stem| pos| embeddings| ner| entities| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 1|Google has announ...|[[document, 0, 10...|[[token, 0, 5, Go...|[[document, 0, 10...|[[token, 0, 5, Go...|[[token, 0, 5, Go...|[[token, 0, 5, go...|[[pos, 0, 5, NNP,...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 5, Go...| -| 2|The Paris metro w...|[[document, 0, 11...|[[token, 0, 2, Th...|[[document, 0, 11...|[[token, 0, 2, Th...|[[token, 0, 2, Th...|[[token, 0, 2, th...|[[pos, 0, 2, DT, ...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 4, 8, Pa...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ - -annotation.select("entities.result").show(false) - -/* -+----------------------------------+ -|result | -+----------------------------------+ -|[Google, TensorFlow] | -|[Donald John Trump, United States]| -+----------------------------------+ -*/ -``` - -#### Showing Available Pipelines - -There are functions in Spark NLP that will list all the available Pipelines -of a particular language for you: - -```scala -import com.johnsnowlabs.nlp.pretrained.ResourceDownloader - -ResourceDownloader.showPublicPipelines(lang = "en") -/* -+--------------------------------------------+------+---------+ -| Pipeline | lang | version | -+--------------------------------------------+------+---------+ -| dependency_parse | en | 2.0.2 | -| analyze_sentiment_ml | en | 2.0.2 | -| check_spelling | en | 2.1.0 | -| match_datetime | en | 2.1.0 | - ... -| explain_document_ml | en | 3.1.3 | -+--------------------------------------------+------+---------+ -*/ -``` - -Or if we want to check for a particular version: - -```scala -import com.johnsnowlabs.nlp.pretrained.ResourceDownloader - -ResourceDownloader.showPublicPipelines(lang = "en", version = "3.1.0") -/* -+---------------------------------------+------+---------+ -| Pipeline | lang | version | -+---------------------------------------+------+---------+ -| dependency_parse | en | 2.0.2 | - ... -| clean_slang | en | 3.0.0 | -| clean_pattern | en | 3.0.0 | -| check_spelling | en | 3.0.0 | -| dependency_parse | en | 3.0.0 | -+---------------------------------------+------+---------+ -*/ -``` - -#### Please check out our Models Hub for the full list of [pre-trained pipelines](https://sparknlp.org/models) with examples, demos, benchmarks, and more - -### Models +Please check [these instructions](https://sparknlp.org/docs/en/install#s3-integration) from our official documentation. 
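+
+For instance, exporting training logs to S3 comes down to a handful of `spark.conf.set` properties (a sketch mirroring the linked instructions; the key, bucket, and region values are placeholders for your own AWS resources):
+
+```python
+# A sketch of the S3 properties documented in Advanced Settings; the
+# credential, bucket, and region values below are placeholders.
+spark.conf.set("spark.jsl.settings.annotator.log_folder", "s3://my/s3/path/logs")
+spark.conf.set("spark.jsl.settings.aws.credentials.access_key_id", "MY_KEY_ID")
+spark.conf.set("spark.jsl.settings.aws.credentials.secret_access_key", "MY_SECRET_ACCESS_KEY")
+spark.conf.set("spark.jsl.settings.aws.s3_bucket", "my.bucket")
+spark.conf.set("spark.jsl.settings.aws.region", "my-region")
+```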
-**Some selected languages: -** `Afrikaans, Arabic, Armenian, Basque, Bengali, Breton, Bulgarian, Catalan, Czech, Dutch, English, Esperanto, Finnish, French, Galician, German, Greek, Hausa, Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Latin, Latvian, Marathi, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Southern Sotho, Spanish, Swahili, Swedish, Tswana, Turkish, Ukrainian, Zulu` +## Documentation -**Quick online example:** - -```python -# load NER model trained by deep learning approach and GloVe word embeddings -ner_dl = NerDLModel.pretrained('ner_dl') -# load NER model trained by deep learning approach and BERT word embeddings -ner_bert = NerDLModel.pretrained('ner_dl_bert') -``` - -```scala -// load French POS tagger model trained by Universal Dependencies -val french_pos = PerceptronModel.pretrained("pos_ud_gsd", lang = "fr") -// load Italian LemmatizerModel -val italian_lemma = LemmatizerModel.pretrained("lemma_dxc", lang = "it") -```` - -**Quick offline example:** - -- Loading `PerceptronModel` annotator model inside Spark NLP Pipeline - -```scala -val french_pos = PerceptronModel.load("/tmp/pos_ud_gsd_fr_2.0.2_2.4_1556531457346/") - .setInputCols("document", "token") - .setOutputCol("pos") -``` - -#### Showing Available Models - -There are functions in Spark NLP that will list all the available Models -of a particular Annotator and language for you: - -```scala -import com.johnsnowlabs.nlp.pretrained.ResourceDownloader - -ResourceDownloader.showPublicModels(annotator = "NerDLModel", lang = "en") -/* -+---------------------------------------------+------+---------+ -| Model | lang | version | -+---------------------------------------------+------+---------+ -| onto_100 | en | 2.1.0 | -| onto_300 | en | 2.1.0 | -| ner_dl_bert | en | 2.2.0 | -| onto_100 | en | 2.4.0 | -| ner_conll_elmo | en | 3.2.2 | -+---------------------------------------------+------+---------+ -*/ -``` - -Or if we want to check for a particular version: - -```scala -import com.johnsnowlabs.nlp.pretrained.ResourceDownloader - -ResourceDownloader.showPublicModels(annotator = "NerDLModel", lang = "en", version = "3.1.0") -/* -+----------------------------+------+---------+ -| Model | lang | version | -+----------------------------+------+---------+ -| onto_100 | en | 2.1.0 | -| ner_aspect_based_sentiment | en | 2.6.2 | -| ner_weibo_glove_840B_300d | en | 2.6.2 | -| nerdl_atis_840b_300d | en | 2.7.1 | -| nerdl_snips_100d | en | 2.7.3 | -+----------------------------+------+---------+ -*/ -``` - -And to see a list of available annotators, you can use: - -```scala -import com.johnsnowlabs.nlp.pretrained.ResourceDownloader - -ResourceDownloader.showAvailableAnnotators() -/* -AlbertEmbeddings -AlbertForTokenClassification -AssertionDLModel -... -XlmRoBertaSentenceEmbeddings -XlnetEmbeddings -*/ -``` - -#### Please check out our Models Hub for the full list of [pre-trained models](https://sparknlp.org/models) with examples, demo, benchmark, and more - -## Offline - -Spark NLP library and all the pre-trained models/pipelines can be used entirely offline with no access to the Internet. 
-If you are behind a proxy or a firewall with no access to the Maven repository (to download packages) or/and no access -to S3 (to automatically download models and pipelines), you can simply follow the instructions to have Spark NLP without -any limitations offline: - -- Instead of using the Maven package, you need to load our Fat JAR -- Instead of using PretrainedPipeline for pretrained pipelines or the `.pretrained()` function to download pretrained - models, you will need to manually download your pipeline/model from [Models Hub](https://sparknlp.org/models), - extract it, and load it. - -Example of `SparkSession` with Fat JAR to have Spark NLP offline: - -```python -spark = SparkSession.builder - .appName("Spark NLP") - .master("local[*]") - .config("spark.driver.memory", "16G") - .config("spark.driver.maxResultSize", "0") - .config("spark.kryoserializer.buffer.max", "2000M") - .config("spark.jars", "/tmp/spark-nlp-assembly-5.4.0.jar") - .getOrCreate() -``` - -- You can download provided Fat JARs from each [release notes](https://github.com/JohnSnowLabs/spark-nlp/releases), - please pay attention to pick the one that suits your environment depending on the device (CPU/GPU) and Apache Spark - version (3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x) -- If you are local, you can load the Fat JAR from your local FileSystem, however, if you are in a cluster setup you need - to put the Fat JAR on a distributed FileSystem such as HDFS, DBFS, S3, etc. ( - i.e., `hdfs:///tmp/spark-nlp-assembly-5.4.0.jar`) - -Example of using pretrained Models and Pipelines in offline: - -```python -# instead of using pretrained() for online: -# french_pos = PerceptronModel.pretrained("pos_ud_gsd", lang="fr") -# you download this model, extract it, and use .load -french_pos = PerceptronModel.load("/tmp/pos_ud_gsd_fr_2.0.2_2.4_1556531457346/") - .setInputCols("document", "token") - .setOutputCol("pos") - -# example for pipelines -# instead of using PretrainedPipeline -# pipeline = PretrainedPipeline('explain_document_dl', lang='en') -# you download this pipeline, extract it, and use PipelineModel -PipelineModel.load("/tmp/explain_document_dl_en_2.0.2_2.4_1556530585689/") -``` - -- Since you are downloading and loading models/pipelines manually, this means Spark NLP is not downloading the most - recent and compatible models/pipelines for you. Choosing the right model/pipeline is on you -- If you are local, you can load the model/pipeline from your local FileSystem, however, if you are in a cluster setup - you need to put the model/pipeline on a distributed FileSystem such as HDFS, DBFS, S3, etc. ( - i.e., `hdfs:///tmp/explain_document_dl_en_2.0.2_2.4_1556530585689/`) - -## Examples +### Examples Need more **examples**? Check out our dedicated [Spark NLP Examples](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples) repository to showcase all Spark NLP use cases! Also, don't forget to check [Spark NLP in Action](https://sparknlp.org/demo) built by Streamlit. 
-### All examples: [spark-nlp/examples](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples) +#### All examples: [spark-nlp/examples](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples) -## FAQ +### FAQ [Check our Articles and Videos page here](https://sparknlp.org/learn) -## Citation +### Citation We have published a [paper](https://www.sciencedirect.com/science/article/pii/S2665963821000063) that you can cite for the Spark NLP library: @@ -1307,6 +263,15 @@ the Spark NLP library: } ``` +## Community support + +- [Slack](https://join.slack.com/t/spark-nlp/shared_invite/zt-198dipu77-L3UWNe_AJ8xqDk0ivmih5Q) For live discussion with the Spark NLP community and the team +- [GitHub](https://github.com/JohnSnowLabs/spark-nlp) Bug reports, feature requests, and contributions +- [Discussions](https://github.com/JohnSnowLabs/spark-nlp/discussions) Engage with other community members, share ideas, + and show off how you use Spark NLP! +- [Medium](https://medium.com/spark-nlp) Spark NLP articles +- [YouTube](https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos) Spark NLP video tutorials + ## Contributing We appreciate any sort of contributions: diff --git a/docs/_data/navigation.yml b/docs/_data/navigation.yml index 21b4f372614dd6..c6e75a2a846237 100755 --- a/docs/_data/navigation.yml +++ b/docs/_data/navigation.yml @@ -36,6 +36,12 @@ sparknlp: url: /docs/en/quickstart - title: Install Spark NLP url: /docs/en/install + - title: Advanced Settings + url: /docs/en/advanced_settings + - title: Features + url: /docs/en/features + - title: Pipelines and Models + url: /docs/en/pipelines - title: General Concepts url: /docs/en/concepts - title: Annotators diff --git a/docs/en/advanced_settings.md b/docs/en/advanced_settings.md new file mode 100644 index 00000000000000..84c8dc5751187e --- /dev/null +++ b/docs/en/advanced_settings.md @@ -0,0 +1,142 @@ +--- +layout: docs +header: true +seotitle: Spark NLP - Advanced Settings +title: Spark NLP - Advanced Settings +permalink: /docs/en/advanced_settings +key: docs-install +modify_date: "2024-07-04" +show_nav: true +sidebar: + nav: sparknlp +--- + +
+
+## SparkNLP Properties
+
+You can change the following Spark NLP configurations via Spark Configuration:
+
+| Property Name | Default | Meaning |
+|---------------------------------------------------------|----------------------|-----------|
+| `spark.jsl.settings.pretrained.cache_folder` | `~/cache_pretrained` | The location to download and extract pretrained `Models` and `Pipelines`. By default, it will be in the user's home directory under the `cache_pretrained` directory |
+| `spark.jsl.settings.storage.cluster_tmp_dir` | `hadoop.tmp.dir` | The location to use on a cluster for temporary files such as unpacking indexes for WordEmbeddings. By default, this is the location of `hadoop.tmp.dir` set via Hadoop configuration for Apache Spark. NOTE: `S3` is not supported; it must be local, HDFS, or DBFS |
+| `spark.jsl.settings.annotator.log_folder` | `~/annotator_logs` | The location to save logs from annotators during training such as `NerDLApproach`, `ClassifierDLApproach`, `SentimentDLApproach`, `MultiClassifierDLApproach`, etc. By default, it will be in the user's home directory under the `annotator_logs` directory |
+| `spark.jsl.settings.aws.credentials.access_key_id` | `None` | Your AWS access key to use your S3 bucket to store log files of training models or access TensorFlow graphs used in `NerDLApproach` |
+| `spark.jsl.settings.aws.credentials.secret_access_key` | `None` | Your AWS secret access key to use your S3 bucket to store log files of training models or access TensorFlow graphs used in `NerDLApproach` |
+| `spark.jsl.settings.aws.credentials.session_token` | `None` | Your AWS MFA session token to use your S3 bucket to store log files of training models or access TensorFlow graphs used in `NerDLApproach` |
+| `spark.jsl.settings.aws.s3_bucket` | `None` | Your AWS S3 bucket to store log files of training models or access TensorFlow graphs used in `NerDLApproach` |
+| `spark.jsl.settings.aws.region` | `None` | Your AWS region to use your S3 bucket to store log files of training models or access TensorFlow graphs used in `NerDLApproach` |
+| `spark.jsl.settings.onnx.gpuDeviceId` | `0` | Constructs CUDA execution provider options for the specified non-negative device id. |
+| `spark.jsl.settings.onnx.intraOpNumThreads` | `6` | Sets the size of the CPU thread pool used for executing a single graph, if executing on a CPU. |
+| `spark.jsl.settings.onnx.optimizationLevel` | `ALL_OPT` | Sets the optimization level of this options object, overriding the old setting. |
+| `spark.jsl.settings.onnx.executionMode` | `SEQUENTIAL` | Sets the execution mode of this options object, overriding the old setting. |
+
+### How to set Spark NLP Configuration
+
+**SparkSession:**
+
+You can use `.config()` during SparkSession creation to set Spark NLP configurations.
+
+```python
+from pyspark.sql import SparkSession
+
+spark = SparkSession.builder \
+    .master("local[*]") \
+    .config("spark.driver.memory", "16G") \
+    .config("spark.driver.maxResultSize", "0") \
+    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
+    .config("spark.kryoserializer.buffer.max", "2000m") \
+    .config("spark.jsl.settings.pretrained.cache_folder", "sample_data/pretrained") \
+    .config("spark.jsl.settings.storage.cluster_tmp_dir", "sample_data/storage") \
+    .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0") \
+    .getOrCreate()
+```
+
+**spark-shell:**
+
+```sh
+spark-shell \
+  --driver-memory 16g \
+  --conf spark.driver.maxResultSize=0 \
+  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
+  --conf spark.kryoserializer.buffer.max=2000M \
+  --conf spark.jsl.settings.pretrained.cache_folder="sample_data/pretrained" \
+  --conf spark.jsl.settings.storage.cluster_tmp_dir="sample_data/storage" \
+  --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0
+```
+
+**pyspark:**
+
+```sh
+pyspark \
+  --driver-memory 16g \
+  --conf spark.driver.maxResultSize=0 \
+  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
+  --conf spark.kryoserializer.buffer.max=2000M \
+  --conf spark.jsl.settings.pretrained.cache_folder="sample_data/pretrained" \
+  --conf spark.jsl.settings.storage.cluster_tmp_dir="sample_data/storage" \
+  --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0
+```
+
+**Databricks:**
+
+On a new cluster or an existing one, you need to add the following to the `Advanced Options -> Spark` tab:
+
+```bash
+spark.kryoserializer.buffer.max 2000M
+spark.serializer org.apache.spark.serializer.KryoSerializer
+spark.jsl.settings.pretrained.cache_folder dbfs:/PATH_TO_CACHE
+spark.jsl.settings.storage.cluster_tmp_dir dbfs:/PATH_TO_STORAGE
+spark.jsl.settings.annotator.log_folder dbfs:/PATH_TO_LOGS
+```
+
+NOTE: If this is an existing cluster, you need to restart it after adding new configs or changing existing properties.
+
+
+### S3 Integration
+
+**Logging:**
+
+To configure an S3 path for logging while training models, we need to set up AWS credentials as well as an S3 path:
+
+```bash
+spark.conf.set("spark.jsl.settings.annotator.log_folder", "s3://my/s3/path/logs")
+spark.conf.set("spark.jsl.settings.aws.credentials.access_key_id", "MY_KEY_ID")
+spark.conf.set("spark.jsl.settings.aws.credentials.secret_access_key", "MY_SECRET_ACCESS_KEY")
+spark.conf.set("spark.jsl.settings.aws.s3_bucket", "my.bucket")
+spark.conf.set("spark.jsl.settings.aws.region", "my-region")
+```
+
+Now you can check the logs on the S3 path defined in the *spark.jsl.settings.annotator.log_folder* property.
+Make sure to use the prefix *s3://*, otherwise it will use the default configuration.
+
+**TensorFlow Graphs:**
+
+To reference an S3 location for downloading graphs, we need to set up AWS credentials:
+
+```bash
+spark.conf.set("spark.jsl.settings.aws.credentials.access_key_id", "MY_KEY_ID")
+spark.conf.set("spark.jsl.settings.aws.credentials.secret_access_key", "MY_SECRET_ACCESS_KEY")
+spark.conf.set("spark.jsl.settings.aws.region", "my-region")
+```
+
+**MFA Configuration:**
+
+In case your AWS account is configured with MFA, you will first need to get temporary credentials and add the session
+token to the configuration, as shown in the example below for logging:
+
+```bash
+spark.conf.set("spark.jsl.settings.aws.credentials.session_token", "MY_TOKEN")
+```
+
+An example of a bash script that gets temporary AWS credentials can be
+found [here](https://github.com/JohnSnowLabs/spark-nlp/blob/master/scripts/aws_tmp_credentials.sh).
+This script requires three arguments:
+
+```bash
+./aws_tmp_credentials.sh iam_user duration serial_number
+```
+
\ No newline at end of file diff --git a/docs/en/features.md b/docs/en/features.md new file mode 100644 index 00000000000000..1a9a5b80470828 --- /dev/null +++ b/docs/en/features.md @@ -0,0 +1,120 @@ +--- +layout: docs +header: true +seotitle: Spark NLP - Features +title: Spark NLP - Features +permalink: /docs/en/features +key: docs-install +modify_date: "2024-07-03" +show_nav: true +sidebar: + nav: sparknlp +--- + + +
+ +## Text Preprocessing +- Tokenization +- Trainable Word Segmentation +- Stop Words Removal +- Token Normalizer +- Document Normalizer +- Document & Text Splitter +- Stemmer +- Lemmatizer +- NGrams +- Regex Matching +- Text Matching +- Spell Checker (ML and DL models) + +## Parsing and Analysis +- Chunking +- Date Matcher +- Sentence Detector +- Deep Sentence Detector (Deep learning) +- Dependency parsing (Labeled/unlabeled) +- SpanBertCorefModel (Coreference Resolution) +- Part-of-speech tagging +- Named entity recognition (Deep learning) +- Unsupervised keywords extraction +- Language Detection & Identification (up to 375 languages) + +## Sentiment and Classification +- Sentiment Detection (ML models) +- Multi-class & Multi-label Sentiment analysis (Deep learning) +- Multi-class Text Classification (Deep learning) +- Zero-Shot NER Model +- Zero-Shot Text Classification by Transformers (ZSL) + +## Embeddings +- Word Embeddings (GloVe and Word2Vec) +- Doc2Vec (based on Word2Vec) +- BERT Embeddings (TF Hub & HuggingFace models) +- DistilBERT Embeddings (HuggingFace models) +- CamemBERT Embeddings (HuggingFace models) +- RoBERTa Embeddings (HuggingFace models) +- DeBERTa Embeddings (HuggingFace v2 & v3 models) +- XLM-RoBERTa Embeddings (HuggingFace models) +- Longformer Embeddings (HuggingFace models) +- ALBERT Embeddings (TF Hub & HuggingFace models) +- XLNet Embeddings +- ELMO Embeddings (TF Hub models) +- Universal Sentence Encoder (TF Hub models) +- BERT Sentence Embeddings (TF Hub & HuggingFace models) +- RoBerta Sentence Embeddings (HuggingFace models) +- XLM-RoBerta Sentence Embeddings (HuggingFace models) +- INSTRUCTOR Embeddings (HuggingFace models) +- E5 Embeddings (HuggingFace models) +- MPNet Embeddings (HuggingFace models) +- UAE Embeddings (HuggingFace models) +- OpenAI Embeddings +- Sentence & Chunk Embeddings + +## Classification and Question Answering Models +- BERT for Token & Sequence Classification & Question Answering +- DistilBERT for Token & Sequence Classification & Question Answering +- CamemBERT for Token & Sequence Classification & Question Answering +- ALBERT for Token & Sequence Classification & Question Answering +- RoBERTa for Token & Sequence Classification & Question Answering +- DeBERTa for Token & Sequence Classification & Question Answering +- XLM-RoBERTa for Token & Sequence Classification & Question Answering +- Longformer for Token & Sequence Classification & Question Answering +- MPnet for Token & Sequence Classification & Question Answering +- XLNet for Token & Sequence Classification + +## Machine Translation and Generation +- Neural Machine Translation (MarianMT) +- Many-to-Many multilingual translation model (Facebook M2M100) +- Table Question Answering (TAPAS) +- Text-To-Text Transfer Transformer (Google T5) +- Generative Pre-trained Transformer 2 (OpenAI GPT2) +- Seq2Seq for NLG, Translation, and Comprehension (Facebook BART) +- Chat and Conversational LLMs (Facebook Llama-2) + +## Image and Speech +- Vision Transformer (Google ViT) +- Swin Image Classification (Microsoft Swin Transformer) +- ConvNext Image Classification (Facebook ConvNext) +- Vision Encoder Decoder for image-to-text like captioning +- Zero-Shot Image Classification by OpenAI's CLIP +- Automatic Speech Recognition (Wav2Vec2) +- Automatic Speech Recognition (HuBERT) +- Automatic Speech Recognition (OpenAI Whisper) + +## Integration and Interoperability +- Easy 
[ONNX](https://github.com/JohnSnowLabs/spark-nlp/tree/feature/SPARKNLP-1015-Modernizing-GitHub-repo/examples/python/transformers/onnx), [OpenVINO](https://github.com/JohnSnowLabs/spark-nlp/tree/feature/SPARKNLP-1015-Modernizing-GitHub-repo/examples/python/transformers/openvino), and [TensorFlow](https://github.com/JohnSnowLabs/spark-nlp/tree/feature/SPARKNLP-1015-Modernizing-GitHub-repo/examples/python/transformers) integrations
+- Full integration with Spark ML functions
+- GPU Support
+
+## Pre-trained Models
+- 31,000+ pre-trained models in 200+ languages!
+- 6,000+ pre-trained pipelines in 200+ languages!
+
+#### Please check out our Models Hub for the full list of [pre-trained models](https://sparknlp.org/models) with examples, demos, benchmarks, and more
+
+## Multi-lingual Support
+- Multi-lingual NER models: Arabic, Bengali, Chinese, Danish, Dutch, English, Finnish, French, German, Hebrew, Italian,
+  Japanese, Korean, Norwegian, Persian, Polish, Portuguese, Russian, Spanish, Swedish, Urdu, and more.
+
\ No newline at end of file
diff --git a/docs/en/install.md b/docs/en/install.md
index 4bc861a2c0d496..3d32683830df96 100644
--- a/docs/en/install.md
+++ b/docs/en/install.md
@@ -5,7 +5,7 @@ seotitle: Spark NLP - Installation
 title: Spark NLP - Installation
 permalink: /docs/en/install
 key: docs-install
-modify_date: "2023-05-10"
+modify_date: "2024-07-04"
 show_nav: true
 sidebar:
     nav: sparknlp
@@ -35,6 +35,14 @@ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0
 spark-shell --jars spark-nlp-assembly-5.4.0.jar
 ```
+**GPU (optional):**
+
+Spark NLP 5.4.0 is built with ONNX 1.17.0 and TensorFlow 2.7.1 deep learning engines. At a minimum, the following NVIDIA® software is required for GPU support:
+
+- NVIDIA® GPU drivers version 450.80.02 or higher
+- CUDA® Toolkit 11.2
+- cuDNN SDK 8.1.0
+
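+With these in place, you can request a GPU-enabled session directly from Python. A minimal sketch, assuming the `spark-nlp` PyPI package is installed and that your version exposes the `gpu` flag on `sparknlp.start()`:
+
+```python
+import sparknlp
+
+# starts a SparkSession that loads the spark-nlp-gpu package instead of the CPU one
+spark = sparknlp.start(gpu=True)
+```
+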
## Python

@@ -95,15 +103,73 @@ spark = SparkSession.builder \
     .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0") \
     .getOrCreate()
 ```
+If you are using local JARs, you can use `spark.jars` instead, with a comma-delimited list of JAR files. For cluster setups, of course,
+you'll have to put the JARs in a location reachable by all driver and executor nodes.
+
+### Python without explicit PySpark installation
+
+#### Pip/Conda
+
+If you installed pyspark through pip/conda, you can install `spark-nlp` through the same channel.
+
+Pip:
+
+```bash
+pip install spark-nlp==5.4.0
+```
+
+Conda:
+
+```bash
+conda install -c johnsnowlabs spark-nlp
+```
+
+PyPI [spark-nlp package](https://pypi.org/project/spark-nlp/) /
+Anaconda [spark-nlp package](https://anaconda.org/JohnSnowLabs/spark-nlp)
+
+Then you'll have to create a SparkSession, which you can do directly from Spark NLP:
+
+```python
+import sparknlp
+
+spark = sparknlp.start()
+```
+
+**Quick example:**
+
+```python
+import sparknlp
+from sparknlp.pretrained import PretrainedPipeline
+
+# create or get the Spark Session
+spark = sparknlp.start()
+
+sparknlp.version()
+spark.version
+
+# download, load, and annotate a text with a pre-trained pipeline
+pipeline = PretrainedPipeline('recognize_entities_dl', 'en')
+result = pipeline.annotate('The Mona Lisa is a 16th century oil painting created by Leonardo')
+```
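+
+`annotate` returns a plain Python dict keyed by the pipeline's output columns. A quick way to inspect the named entities (a sketch; the exact keys depend on the pipeline's stages):
+
+```python
+# 'entities' is produced by the NER converter stage of this pipeline
+print(result['entities'])  # e.g. ['Mona Lisa', 'Leonardo']
+print(result.keys())       # all annotation columns produced by the pipeline
+```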
## Scala and Java

+To use Spark NLP you need the following requirements:
+
+- Java 8 or 11
+- Apache Spark 3.5.x, 3.4.x, 3.3.x, 3.2.x, 3.1.x, or 3.0.x
+
#### Maven

**spark-nlp** on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x

+The `spark-nlp` package has been published to
+the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp).
+
```xml
+ +## Command line + +Spark NLP supports all major releases of Apache Spark 3.0.x, Apache Spark 3.1.x, Apache Spark 3.2.x, Apache Spark 3.3.x, Apache Spark 3.4.x, and Apache Spark 3.5.x +This steps require internet connection. + +#### Apache Spark 3.x (3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x - Scala 2.12) + +```sh +# CPU + +spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 + +pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 + +spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 +``` + +The `spark-nlp` has been published to +the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp). + +```sh +# GPU + +spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0 + +pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0 + +spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.4.0 + +``` + +The `spark-nlp-gpu` has been published to +the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-gpu). + +```sh +# AArch64 + +spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0 + +pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0 + +spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.4.0 + +``` + +The `spark-nlp-aarch64` has been published to +the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-aarch64). + +```sh +# M1/M2 (Apple Silicon) + +spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0 + +pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0 + +spark-submit --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.4.0 + +``` + +The `spark-nlp-silicon` has been published to +the [Maven Repository](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-silicon). + +**NOTE**: In case you are using large pretrained models like UniversalSentenceEncoder, you need to have the following +set in your SparkSession: + +```sh +spark-shell \ + --driver-memory 16g \ + --conf spark.kryoserializer.buffer.max=2000M \ + --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0 +``` + +## Installation for M1 & M2 Chips + ### Scala and Java for M1 Adding Spark NLP to your Scala or Java project is easy: @@ -370,6 +511,258 @@ Run the following code in Kaggle Kernel and start using spark-nlp right away.
+
+## Apache Zeppelin
+
+Use either of the following options:
+
+- Add the following Maven Coordinates to the interpreter's library list
+
+```bash
+com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0
+```
+
+- Add a path to a pre-built jar from [here](#compiled-jars) to the interpreter's library list, making sure the jar is
+  available on the driver path
+
+## Python in Zeppelin
+
+In addition to the previous step, install the Python module through pip:
+
+```bash
+pip install spark-nlp==5.4.0
+```
+
+Or you can install `spark-nlp` from inside Zeppelin by using Conda:
+
+```bash
+python.conda install -c johnsnowlabs spark-nlp
+```
+
+Once Zeppelin is configured, use cells with `%spark.pyspark` or whichever interpreter name you chose.
+
+Finally, in the Zeppelin interpreter settings, make sure `zeppelin.python` is set to the Python binary you want to use
+and that the pip library was installed for it (e.g. `python3`).
+
+An alternative option is to set `SPARK_SUBMIT_OPTIONS` (zeppelin-env.sh) and make sure `--packages` is there, as
+shown earlier, since it covers both the Scala and Python sides of the installation.
+
+## Jupyter Notebook
+
+**Recommended:**
+
+The easiest way to get this done on Linux and macOS is to simply install the `spark-nlp` and `pyspark` PyPI packages and
+launch Jupyter from the same Python environment:
+
+```sh
+$ conda create -n sparknlp python=3.8 -y
+$ conda activate sparknlp
+# spark-nlp by default is based on pyspark 3.x
+$ pip install spark-nlp==5.4.0 pyspark==3.3.1 jupyter
+$ jupyter notebook
+```
+
+Then you can use the `python3` kernel to run your code, creating the SparkSession via `spark = sparknlp.start()`.
+
+**Optional:**
+
+If you are on a different operating system and need to launch Jupyter Notebook through pyspark, you can follow
+these steps:
+
+```bash
+export SPARK_HOME=/path/to/your/spark/folder
+export PYSPARK_PYTHON=python3
+export PYSPARK_DRIVER_PYTHON=jupyter
+export PYSPARK_DRIVER_PYTHON_OPTS=notebook
+
+pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0
+```
+
+Alternatively, you can mix in the `--jars` option for pyspark + `pip install spark-nlp`.
+
+If you are not using pyspark at all, you'll have to follow the instructions
+pointed to [here](#python-without-explicit-pyspark-installation).
+
+## Databricks Cluster
+
+1. Create a cluster if you don't have one already
+
+2. On a new cluster or an existing one, add the following to the `Advanced Options -> Spark` tab:
+
+   ```bash
+   spark.kryoserializer.buffer.max 2000M
+   spark.serializer org.apache.spark.serializer.KryoSerializer
+   ```
+
+3. In the `Libraries` tab inside your cluster, follow these steps:
+
+   3.1. Install New -> PyPI -> `spark-nlp==5.4.0` -> Install
+
+   3.2. Install New -> Maven -> Coordinates -> `com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0` -> Install
+
+4. Now you can attach your notebook to the cluster and use Spark NLP!
+
+NOTE: Databricks runtimes support different Apache Spark major releases. Please make sure you choose the correct Spark
+NLP Maven package name (Maven Coordinate) for your runtime from
+our [Packages Cheatsheet](https://github.com/JohnSnowLabs/spark-nlp#packages-cheatsheet)
+
+## EMR Cluster
+
+To launch EMR clusters with Apache Spark/PySpark and Spark NLP correctly, you need bootstrap and software
+configuration scripts.
+
+A sample bootstrap script:
+
+```sh
+#!/bin/bash
+set -x -e
+
+echo -e 'export PYSPARK_PYTHON=/usr/bin/python3
+export HADOOP_CONF_DIR=/etc/hadoop/conf
+export SPARK_JARS_DIR=/usr/lib/spark/jars
+export SPARK_HOME=/usr/lib/spark' >> $HOME/.bashrc && source $HOME/.bashrc
+
+sudo python3 -m pip install awscli boto spark-nlp
+
+set +x
+exit 0
+
+```
+
+A sample software configuration in JSON on S3 (must be publicly accessible):
+
+```json
+[{
+  "Classification": "spark-env",
+  "Configurations": [{
+    "Classification": "export",
+    "Properties": {
+      "PYSPARK_PYTHON": "/usr/bin/python3"
+    }
+  }]
+},
+{
+  "Classification": "spark-defaults",
+    "Properties": {
+      "spark.yarn.stagingDir": "hdfs:///tmp",
+      "spark.yarn.preserve.staging.files": "true",
+      "spark.kryoserializer.buffer.max": "2000M",
+      "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
+      "spark.driver.maxResultSize": "0",
+      "spark.jars.packages": "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0"
+    }
+}]
+```
+
+A sample AWS CLI command to launch an EMR cluster:
+
+```sh
+aws emr create-cluster \
+--name "Spark NLP 5.4.0" \
+--release-label emr-6.2.0 \
+--applications Name=Hadoop Name=Spark Name=Hive \
+--instance-type m4.4xlarge \
+--instance-count 3 \
+--use-default-roles \
+--log-uri "s3:///" \
+--bootstrap-actions Path=s3:///emr-bootstrap.sh,Name=custom \
+--configurations "https:///sparknlp-config.json" \
+--ec2-attributes KeyName=,EmrManagedMasterSecurityGroup=,EmrManagedSlaveSecurityGroup= \
+--profile 
+```
+
+## GCP Dataproc
+
+1. Create a cluster if you don't have one already, as follows.
+
+In the gcloud shell:
+
+```bash
+gcloud services enable dataproc.googleapis.com \
+  compute.googleapis.com \
+  storage-component.googleapis.com \
+  bigquery.googleapis.com \
+  bigquerystorage.googleapis.com
+```
+
+```bash
+REGION=
+```
+
+```bash
+BUCKET_NAME=
+gsutil mb -c standard -l ${REGION} gs://${BUCKET_NAME}
+```
+
+```bash
+REGION=
+ZONE=
+CLUSTER_NAME=
+BUCKET_NAME=
+```
+
+You can set image-version, master-machine-type, worker-machine-type,
+master-boot-disk-size, worker-boot-disk-size, and num-workers according to your needs.
+If you use an image-version earlier than 2.0, you should also add ANACONDA to the optional components.
+You should also enable the component gateway.
+Don't forget to set the Maven coordinates for the jar in the properties.
+
+```bash
+gcloud dataproc clusters create ${CLUSTER_NAME} \
+  --region=${REGION} \
+  --zone=${ZONE} \
+  --image-version=2.0 \
+  --master-machine-type=n1-standard-4 \
+  --worker-machine-type=n1-standard-2 \
+  --master-boot-disk-size=128GB \
+  --worker-boot-disk-size=128GB \
+  --num-workers=2 \
+  --bucket=${BUCKET_NAME} \
+  --optional-components=JUPYTER \
+  --enable-component-gateway \
+  --metadata 'PIP_PACKAGES=spark-nlp spark-nlp-display google-cloud-bigquery google-cloud-storage' \
+  --initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/python/pip-install.sh \
+  --properties spark:spark.serializer=org.apache.spark.serializer.KryoSerializer,spark:spark.driver.maxResultSize=0,spark:spark.kryoserializer.buffer.max=2000M,spark:spark.jars.packages=com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0
+```
+
+2. On an existing cluster, you need to install the spark-nlp and spark-nlp-display packages from PyPI.
+
+3. Now, you can attach your notebook to the cluster and use Spark NLP!
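+
+Once the cluster is up, you can submit work to it from the gcloud shell. A minimal sketch (the job file name is a placeholder; the property simply mirrors the cluster settings above):
+
+```bash
+# submit a PySpark job that pulls the Spark NLP package at runtime
+gcloud dataproc jobs submit pyspark my_sparknlp_job.py \
+  --cluster=${CLUSTER_NAME} \
+  --region=${REGION} \
+  --properties=spark.jars.packages=com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0
+```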
+
+
+## Apache Spark Support
+
+Spark NLP *5.4.0* has been built on top of Apache Spark 3.4, while fully supporting Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
+
+| Spark NLP | Apache Spark 3.5.x | Apache Spark 3.4.x | Apache Spark 3.3.x | Apache Spark 3.2.x | Apache Spark 3.1.x | Apache Spark 3.0.x | Apache Spark 2.4.x | Apache Spark 2.3.x |
+|-----------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|
+| 5.4.x     | YES                | YES                | YES                | YES                | YES                | YES                | NO                 | NO                 |
+| 5.3.x     | YES                | YES                | YES                | YES                | YES                | YES                | NO                 | NO                 |
+| 5.2.x     | YES                | YES                | YES                | YES                | YES                | YES                | NO                 | NO                 |
+| 5.1.x     | Partially          | YES                | YES                | YES                | YES                | YES                | NO                 | NO                 |
+| 5.0.x     | YES                | YES                | YES                | YES                | YES                | YES                | NO                 | NO                 |
+| 4.4.x     | YES                | YES                | YES                | YES                | YES                | YES                | NO                 | NO                 |
+| 4.3.x     | NO                 | NO                 | YES                | YES                | YES                | YES                | NO                 | NO                 |
+| 4.2.x     | NO                 | NO                 | YES                | YES                | YES                | YES                | NO                 | NO                 |
+| 4.1.x     | NO                 | NO                 | YES                | YES                | YES                | YES                | NO                 | NO                 |
+| 4.0.x     | NO                 | NO                 | YES                | YES                | YES                | YES                | NO                 | NO                 |
+
+Find out more about `Spark NLP` versions from our [release notes](https://github.com/JohnSnowLabs/spark-nlp/releases).
+
+## Scala and Python Support
+
+| Spark NLP | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 | Python 3.10 | Scala 2.11 | Scala 2.12 |
+|-----------|------------|------------|------------|------------|-------------|------------|------------|
+| 5.3.x     | NO         | YES        | YES        | YES        | YES         | NO         | YES        |
+| 5.2.x     | NO         | YES        | YES        | YES        | YES         | NO         | YES        |
+| 5.1.x     | NO         | YES        | YES        | YES        | YES         | NO         | YES        |
+| 5.0.x     | NO         | YES        | YES        | YES        | YES         | NO         | YES        |
+| 4.4.x     | NO         | YES        | YES        | YES        | YES         | NO         | YES        |
+| 4.3.x     | YES        | YES        | YES        | YES        | YES         | NO         | YES        |
+| 4.2.x     | YES        | YES        | YES        | YES        | YES         | NO         | YES        |
+| 4.1.x     | YES        | YES        | YES        | YES        | NO          | NO         | YES        |
+| 4.0.x     | YES        | YES        | YES        | YES        | NO          | NO         | YES        |
+
+
## Databricks Support

Spark NLP 5.4.0 has been tested and is compatible with the following runtimes:

@@ -867,4 +1260,44 @@ PipelineModel.load("/tmp/explain_document_dl_en_2.0.2_2.4_1556530585689/")

- Since you are downloading and loading models/pipelines manually, this means Spark NLP is not downloading the most recent and compatible models/pipelines for you. Choosing the right model/pipeline is on you
- If you are local, you can load the model/pipeline from your local FileSystem, however, if you are in a cluster setup you need to put the model/pipeline on a distributed FileSystem such as HDFS, DBFS, S3, etc. (i.e., `hdfs:///tmp/explain_document_dl_en_2.0.2_2.4_1556530585689/`)
+
+## Compiled JARs
+
+### Build from source
+
+#### spark-nlp
+
+- FAT-JAR for CPU on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
+
+```bash
+sbt assembly
+```
+
+- FAT-JAR for GPU on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
+
+```bash
+sbt -Dis_gpu=true assembly
+```
+
+- FAT-JAR for M1 on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
+
+```bash
+sbt -Dis_silicon=true assembly
+```
+
+### Using the jar manually
+
+If for some reason you need to use the JAR, you can either download the fat JARs provided here or download them
+from [Maven Central](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp).
+
+To add JARs to Spark programs, use the `--jars` option:
+
+```sh
+spark-shell --jars spark-nlp.jar
+```
+
+The preferred way to use the library when running Spark programs is using the `--packages` option as specified in
+the `spark-packages` section.
+
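+A quick sanity check after launching with `--jars` (a sketch; for the Python API you still need the `spark-nlp` PyPI package installed separately, since the JAR only provides the JVM side):
+
+```python
+# inside pyspark started with: pyspark --jars spark-nlp-assembly-5.4.0.jar
+import sparknlp
+
+print(sparknlp.version())  # should print 5.4.0 if the JAR and the PyPI package match
+```
+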
diff --git a/docs/en/pipelines.md b/docs/en/pipelines.md index 43728d43863270..0204f8c62b88f9 100644 --- a/docs/en/pipelines.md +++ b/docs/en/pipelines.md @@ -5,7 +5,7 @@ seotitle: Spark NLP - Pipelines title: Spark NLP - Pipelines permalink: /docs/en/pipelines key: docs-pipelines -modify_date: "2021-11-20" +modify_date: "2024-07-04" show_nav: true sidebar: nav: sparknlp @@ -13,96 +13,24 @@ sidebar:
-Pretrained Pipelines have moved to Models Hub. -Please follow this link for the updated list of all models and pipelines: -[Models Hub](https://sparknlp.org/models) -{:.success} - -
- -## English - -**NOTE:** -`noncontrib` pipelines are compatible with `Windows` operating systems. - -{:.table-model-big} -| Pipelines | Name | -| -------------------- | ---------------------- | -| [Explain Document ML](#explaindocumentml) | `explain_document_ml` -| [Explain Document DL](#explaindocumentdl) | `explain_document_dl` -| [Explain Document DL Win]() | `explain_document_dl_noncontrib` -| Explain Document DL Fast | `explain_document_dl_fast` -| Explain Document DL Fast Win | `explain_document_dl_fast_noncontrib` | -| [Recognize Entities DL](#recognizeentitiesdl) | `recognize_entities_dl` | -| Recognize Entities DL Win | `recognize_entities_dl_noncontrib` | -| [OntoNotes Entities Small](#ontorecognizeentitiessm) | `onto_recognize_entities_sm` | -| [OntoNotes Entities Large](#ontorecognizeentitieslg) | `onto_recognize_entities_lg` | -| [Match Datetime](#matchdatetime) | `match_datetime` | -| [Match Pattern](#matchpattern) | `match_pattern` | -| [Match Chunk](#matchchunks) | `match_chunks` | -| Match Phrases | `match_phrases`| -| Clean Stop | `clean_stop`| -| Clean Pattern | `clean_pattern`| -| Clean Slang | `clean_slang`| -| Check Spelling | `check_spelling`| -| Analyze Sentiment | `analyze_sentiment` | -| Analyze Sentiment DL | `analyze_sentimentdl_use_imdb` | -| Analyze Sentiment DL | `analyze_sentimentdl_use_twitter` | -| Dependency Parse | `dependency_parse` | - -
- -### explain_document_ml - -{% highlight scala %} -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val testData = spark.createDataFrame(Seq( -(1, "Google has announced the release of a beta version of the popular TensorFlow machine learning library"), -(2, "The Paris metro will soon enter the 21st century, ditching single-use paper tickets for rechargeable electronic cards.") -)).toDF("id", "text") +## Pipelines and Models -val pipeline = PretrainedPipeline("explain_document_ml", lang="en") +### Pipelines -val annotation = pipeline.transform(testData) - -annotation.show() - -/* -2.0.8 -testData: org.apache.spark.sql.DataFrame = [id: int, text: string] -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(explain_document_ml,en,public/models) -annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 7 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| sentence| token| checked| lemmas| stems| pos| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 1|Google has announ...|[[document, 0, 10...|[[document, 0, 10...|[[token, 0, 5, Go...|[[token, 0, 5, Go...|[[token, 0, 5, Go...|[[token, 0, 5, go...|[[pos, 0, 5, NNP,...| -| 2|The Paris metro w...|[[document, 0, 11...|[[document, 0, 11...|[[token, 0, 2, Th...|[[token, 0, 2, Th...|[[token, 0, 2, Th...|[[token, 0, 2, th...|[[pos, 0, 2, DT, ...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ - -{% endhighlight %} - -
- -### explain_document_dl - -{% highlight scala %} +**Quick example:** +```scala import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline import com.johnsnowlabs.nlp.SparkNLP SparkNLP.version() val testData = spark.createDataFrame(Seq( -(1, "Google has announced the release of a beta version of the popular TensorFlow machine learning library"), -(2, "Donald John Trump (born June 14, 1946) is the 45th and current president of the United States") + (1, "Google has announced the release of a beta version of the popular TensorFlow machine learning library"), + (2, "Donald John Trump (born June 14, 1946) is the 45th and current president of the United States") )).toDF("id", "text") -val pipeline = PretrainedPipeline("explain_document_dl", lang="en") +val pipeline = PretrainedPipeline("explain_document_dl", lang = "en") val annotation = pipeline.transform(testData) @@ -110,7 +38,7 @@ annotation.show() /* import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline import com.johnsnowlabs.nlp.SparkNLP -2.0.8 +2.5.0 testData: org.apache.spark.sql.DataFrame = [id: int, text: string] pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(explain_document_dl,en,public/models) annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 10 more fields] @@ -132,888 +60,141 @@ annotation.select("entities.result").show(false) |[Donald John Trump, United States]| +----------------------------------+ */ +``` -{% endhighlight %} +#### Showing Available Pipelines -
+There are functions in Spark NLP that will list all the available Pipelines +of a particular language for you: -### recognize_entities_dl - -{% highlight scala %} - -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val testData = spark.createDataFrame(Seq( -(1, "Google has announced the release of a beta version of the popular TensorFlow machine learning library"), -(2, "Donald John Trump (born June 14, 1946) is the 45th and current president of the United States") -)).toDF("id", "text") - -val pipeline = PretrainedPipeline("recognize_entities_dl", lang="en") - -val annotation = pipeline.transform(testData) - -annotation.show() - -/* -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP -2.0.8 -testData: org.apache.spark.sql.DataFrame = [id: int, text: string] -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(entity_recognizer_dl,en,public/models) -annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 6 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| sentence| token| embeddings| ner| ner_converter| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 1|Google has announ...|[[document, 0, 10...|[[document, 0, 10...|[[token, 0, 5, Go...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 5, Go...| -| 2|Donald John Trump...|[[document, 0, 92...|[[document, 0, 92...|[[token, 0, 5, Do...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 16, D...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ - -annotation.select("entities.result").show(false) +```scala +import com.johnsnowlabs.nlp.pretrained.ResourceDownloader +ResourceDownloader.showPublicPipelines(lang = "en") /* -+----------------------------------+ -|result | -+----------------------------------+ -|[Google, TensorFlow] | -|[Donald John Trump, United States]| -+----------------------------------+ ++--------------------------------------------+------+---------+ +| Pipeline | lang | version | ++--------------------------------------------+------+---------+ +| dependency_parse | en | 2.0.2 | +| analyze_sentiment_ml | en | 2.0.2 | +| check_spelling | en | 2.1.0 | +| match_datetime | en | 2.1.0 | + ... +| explain_document_ml | en | 3.1.3 | ++--------------------------------------------+------+---------+ */ +``` -{% endhighlight %} - -
+Or if we want to check for a particular version: -### onto_recognize_entities_sm - -Trained by **NerDLApproach** annotator with **Char CNNs - BiLSTM - CRF** and **GloVe Embeddings** on the **OntoNotes** corpus and supports the identification of 18 entities. - -{% highlight scala %} - -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val testData = spark.createDataFrame(Seq( -(1, "Johnson first entered politics when elected in 2001 as a member of Parliament. He then served eight years as the mayor of London, from 2008 to 2016, before rejoining Parliament. "), -(2, "A little less than a decade later, dozens of self-driving startups have cropped up while automakers around the world clamor, wallet in hand, to secure their place in the fast-moving world of fully automated transportation.") -)).toDF("id", "text") - -val pipeline = PretrainedPipeline("onto_recognize_entities_sm", lang="en") - -val annotation = pipeline.transform(testData) - -annotation.show() - -/* -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP -2.1.0 -testData: org.apache.spark.sql.DataFrame = [id: int, text: string] -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(onto_recognize_entities_sm,en,public/models) -annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 6 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| token| embeddings| ner| entities| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 1|Johnson first ent...|[[document, 0, 17...|[[token, 0, 6, Jo...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 6, Jo...| -| 2|A little less tha...|[[document, 0, 22...|[[token, 0, 0, A,...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 32, A...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ - -annotation.select("entities.result").show(false) +```scala +import com.johnsnowlabs.nlp.pretrained.ResourceDownloader +ResourceDownloader.showPublicPipelines(lang = "en", version = "3.1.0") /* -+---------------------------------------------------------------------------------+ -|result | -+---------------------------------------------------------------------------------+ -|[Johnson, first, 2001, Parliament, eight years, London, 2008 to 2016, Parliament]| -|[A little less than a decade later, dozens] | -+---------------------------------------------------------------------------------+ ++---------------------------------------+------+---------+ +| Pipeline | lang | version | ++---------------------------------------+------+---------+ +| dependency_parse | en | 2.0.2 | + ... +| clean_slang | en | 3.0.0 | +| clean_pattern | en | 3.0.0 | +| check_spelling | en | 3.0.0 | +| dependency_parse | en | 3.0.0 | ++---------------------------------------+------+---------+ */ +``` -{% endhighlight %} +#### Please check out our Models Hub for the full list of [pre-trained pipelines](https://sparknlp.org/models) with examples, demos, benchmarks, and more -
+### Models -### onto_recognize_entities_lg +**Some selected languages: +** `Afrikaans, Arabic, Armenian, Basque, Bengali, Breton, Bulgarian, Catalan, Czech, Dutch, English, Esperanto, Finnish, French, Galician, German, Greek, Hausa, Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Latin, Latvian, Marathi, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Southern Sotho, Spanish, Swahili, Swedish, Tswana, Turkish, Ukrainian, Zulu` -Trained by **NerDLApproach** annotator with **Char CNNs - BiLSTM - CRF** and **GloVe Embeddings** on the **OntoNotes** corpus and supports the identification of 18 entities. +**Quick online example:** -{% highlight scala %} +```python +# load NER model trained by deep learning approach and GloVe word embeddings +ner_dl = NerDLModel.pretrained('ner_dl') +# load NER model trained by deep learning approach and BERT word embeddings +ner_bert = NerDLModel.pretrained('ner_dl_bert') +``` -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP +```scala +// load French POS tagger model trained by Universal Dependencies +val french_pos = PerceptronModel.pretrained("pos_ud_gsd", lang = "fr") +// load Italian LemmatizerModel +val italian_lemma = LemmatizerModel.pretrained("lemma_dxc", lang = "it") +```` -SparkNLP.version() +**Quick offline example:** -val testData = spark.createDataFrame(Seq( -(1, "Johnson first entered politics when elected in 2001 as a member of Parliament. He then served eight years as the mayor of London, from 2008 to 2016, before rejoining Parliament. "), -(2, "A little less than a decade later, dozens of self-driving startups have cropped up while automakers around the world clamor, wallet in hand, to secure their place in the fast-moving world of fully automated transportation.") -)).toDF("id", "text") - -val pipeline = PretrainedPipeline("onto_recognize_entities_lg", lang="en") +- Loading `PerceptronModel` annotator model inside Spark NLP Pipeline -val annotation = pipeline.transform(testData) +```scala +val french_pos = PerceptronModel.load("/tmp/pos_ud_gsd_fr_2.0.2_2.4_1556531457346/") + .setInputCols("document", "token") + .setOutputCol("pos") +``` -annotation.show() +#### Showing Available Models -/* -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP -2.1.0 -testData: org.apache.spark.sql.DataFrame = [id: int, text: string] -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(onto_recognize_entities_lg,en,public/models) -annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 
6 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| token| embeddings| ner| entities| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 1|Johnson first ent...|[[document, 0, 17...|[[token, 0, 6, Jo...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 6, Jo...| -| 2|A little less tha...|[[document, 0, 22...|[[token, 0, 0, A,...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 32, A...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ +There are functions in Spark NLP that will list all the available Models +of a particular Annotator and language for you: -annotation.select("entities.result").show(false) +```scala +import com.johnsnowlabs.nlp.pretrained.ResourceDownloader +ResourceDownloader.showPublicModels(annotator = "NerDLModel", lang = "en") /* -+-------------------------------------------------------------------------------+ -|result | -+-------------------------------------------------------------------------------+ -|[Johnson, first, 2001, Parliament, eight years, London, 2008, 2016, Parliament]| -|[A little less than a decade later, dozens] | -+-------------------------------------------------------------------------------+ ++---------------------------------------------+------+---------+ +| Model | lang | version | ++---------------------------------------------+------+---------+ +| onto_100 | en | 2.1.0 | +| onto_300 | en | 2.1.0 | +| ner_dl_bert | en | 2.2.0 | +| onto_100 | en | 2.4.0 | +| ner_conll_elmo | en | 3.2.2 | ++---------------------------------------------+------+---------+ */ +``` -{% endhighlight %} - -
- -### match_datetime - -#### DateMatcher yyyy/MM/dd - -{% highlight scala %} - -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val testData = spark.createDataFrame(Seq( -(1, "I would like to come over and see you in 01/02/2019."), -(2, "Donald John Trump (born June 14, 1946) is the 45th and current president of the United States") -)).toDF("id", "text") +Or if we want to check for a particular version: -val pipeline = PretrainedPipeline("match_datetime", lang="en") - -val annotation = pipeline.transform(testData) - -annotation.show() +```scala +import com.johnsnowlabs.nlp.pretrained.ResourceDownloader +ResourceDownloader.showPublicModels(annotator = "NerDLModel", lang = "en", version = "3.1.0") /* -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP -2.0.8 -testData: org.apache.spark.sql.DataFrame = [id: int, text: string] -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(match_datetime,en,public/models) -annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 4 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| sentence| token| date| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 1|I would like to c...|[[document, 0, 51...|[[document, 0, 51...|[[token, 0, 0, I,...|[[date, 41, 50, 2...| -| 2|Donald John Trump...|[[document, 0, 92...|[[document, 0, 92...|[[token, 0, 5, Do...|[[date, 24, 36, 1...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+ ++----------------------------+------+---------+ +| Model | lang | version | ++----------------------------+------+---------+ +| onto_100 | en | 2.1.0 | +| ner_aspect_based_sentiment | en | 2.6.2 | +| ner_weibo_glove_840B_300d | en | 2.6.2 | +| nerdl_atis_840b_300d | en | 2.7.1 | +| nerdl_snips_100d | en | 2.7.3 | ++----------------------------+------+---------+ */ +``` -annotation.select("date.result").show(false) +And to see a list of available annotators, you can use: -/* -+------------+ -|result | -+------------+ -|[2019/01/02]| -|[1946/06/14]| -+------------+ -*/ - -{% endhighlight %} - -
- -### match_pattern - -RegexMatcher (match phone numbers) - -{% highlight scala %} - -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val testData = spark.createDataFrame(Seq( -(1, "You should call Mr. Jon Doe at +33 1 79 01 22 89") -)).toDF("id", "text") - -val pipeline = PretrainedPipeline("match_pattern", lang="en") - -val annotation = pipeline.transform(testData) - -annotation.show() - -/* -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP -2.0.8 -testData: org.apache.spark.sql.DataFrame = [id: int, text: string] -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(match_pattern,en,public/models) -annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 4 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| sentence| token| regex| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 1|You should call M...|[[document, 0, 47...|[[document, 0, 47...|[[token, 0, 2, Yo...|[[chunk, 31, 47, ...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ - -annotation.select("regex.result").show(false) - -/* -+-------------------+ -|result | -+-------------------+ -|[+33 1 79 01 22 89]| -+-------------------+ -*/ - -{% endhighlight %} - -
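+
+Once you have found a model name with `showPublicModels`, loading it is a single call. A minimal Python sketch (`ner_dl_bert` is one of the names listed above):
+
+```python
+from sparknlp.annotator import NerDLModel
+
+# downloads the model on first use and caches it under the pretrained cache folder
+ner = NerDLModel.pretrained("ner_dl_bert", lang="en")
+```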
- -### match_chunks - -The pipeline uses regex `
?/*+` - -{% highlight scala %} - -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val testData = spark.createDataFrame(Seq( -(1, "The book has many chapters"), -(2, "the little yellow dog barked at the cat") -)).toDF("id", "text") - -val pipeline = PretrainedPipeline("match_chunks", lang="en") - -val annotation = pipeline.transform(testData) - -annotation.show() - -/* -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP -2.0.8 -testData: org.apache.spark.sql.DataFrame = [id: int, text: string] -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(match_chunks,en,public/models) -annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 5 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| sentence| token| pos| chunk| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 1|The book has many...|[[document, 0, 25...|[[document, 0, 25...|[[token, 0, 2, Th...|[[pos, 0, 2, DT, ...|[[chunk, 0, 7, Th...| -| 2|the little yellow...|[[document, 0, 38...|[[document, 0, 38...|[[token, 0, 2, th...|[[pos, 0, 2, DT, ...|[[chunk, 0, 20, t...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ - -annotation.select("chunk.result").show(false) - -/* -+--------------------------------+ -|result | -+--------------------------------+ -|[The book] | -|[the little yellow dog, the cat]| -+--------------------------------+ -*/ - -{% endhighlight %} - -
- -## French - -{:.table-model-big} -| Pipelines | Name | -| ----------------------- | --------------------- | -| [Explain Document Large](#french-explain_document_lg) | `explain_document_lg` | -| [Explain Document Medium](#french-explain_document_md) | `explain_document_md` | -| [Entity Recognizer Large](#french-entity_recognizer_lg) | `entity_recognizer_lg` | -| [Entity Recognizer Medium](#french-entity_recognizer_md) | `entity_recognizer_md` | - -{:.table-model-big} -|Feature | Description| -|---|----| -|**NER**|Trained by **NerDLApproach** annotator with **Char CNNs - BiLSTM - CRF** and **GloVe Embeddings** on the **WikiNER** corpus and supports the identification of `PER`, `LOC`, `ORG` and `MISC` entities -|**Lemma**|Trained by **Lemmatizer** annotator on **lemmatization-lists** by `Michal Měchura` -|**POS**| Trained by **PerceptronApproach** annotator on the [Universal Dependencies](https://universaldependencies.org/treebanks/fr_gsd/index.html) -|**Size**| Model size indicator, **md** and **lg**. The large pipeline uses **glove_840B_300** and the medium uses **glove_6B_300** WordEmbeddings - -
- -### French explain_document_lg - -{% highlight scala %} - -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val pipeline = PretrainedPipeline("explain_document_lg", lang="fr") - -val testData = spark.createDataFrame(Seq( -(1, "Contrairement à Quentin Tarantino, le cinéma français ne repart pas les mains vides de la compétition cannoise."), -(2, "Emmanuel Jean-Michel Frédéric Macron est le fils de Jean-Michel Macron, né en 1950, médecin, professeur de neurologie au CHU d'Amiens4 et responsable d'enseignement à la faculté de médecine de cette même ville5, et de Françoise Noguès, médecin conseil à la Sécurité sociale") -)).toDF("id", "text") - -val annotation = pipeline.transform(testData) - -annotation.show() +```scala +import com.johnsnowlabs.nlp.pretrained.ResourceDownloader +ResourceDownloader.showAvailableAnnotators() /* -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP -2.0.8 -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(explain_document_lg,fr,public/models) -testData: org.apache.spark.sql.DataFrame = [id: bigint, text: string] -annotation: org.apache.spark.sql.DataFrame = [id: bigint, text: string ... 8 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| token| sentence| lemma| pos| embeddings| ner| entities| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 0|Contrairement à Q...|[[document, 0, 11...|[[token, 0, 12, C...|[[document, 0, 11...|[[token, 0, 12, C...|[[pos, 0, 12, ADV...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 16, 32, ...| -| 1|Emmanuel Jean-Mic...|[[document, 0, 27...|[[token, 0, 7, Em...|[[document, 0, 27...|[[token, 0, 7, Em...|[[pos, 0, 7, PROP...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 35, E...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ +AlbertEmbeddings +AlbertForTokenClassification +AssertionDLModel +... +XlmRoBertaSentenceEmbeddings +XlnetEmbeddings */ +``` -annotation.select("entities.result").show(false) - -/*+-------------------------------------------------------------------------------------------------------------+ -|result | -+-------------------------------------------------------------------------------------------------------------+ -|[Quentin Tarantino] | -|[Emmanuel Jean-Michel Frédéric Macron, Jean-Michel Macron, CHU d'Amiens4, Françoise Noguès, Sécurité sociale]| -+-------------------------------------------------------------------------------------------------------------+ -*/ - -{% endhighlight %} - -
- -### French explain_document_md - -{% highlight scala %} - -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val pipeline = PretrainedPipeline("explain_document_md", lang="fr") - -val testData = spark.createDataFrame(Seq( -(1, "Contrairement à Quentin Tarantino, le cinéma français ne repart pas les mains vides de la compétition cannoise."), -(2, "Emmanuel Jean-Michel Frédéric Macron est le fils de Jean-Michel Macron, né en 1950, médecin, professeur de neurologie au CHU d'Amiens4 et responsable d'enseignement à la faculté de médecine de cette même ville5, et de Françoise Noguès, médecin conseil à la Sécurité sociale") -)).toDF("id", "text") - -val annotation = pipeline.transform(testData) - -annotation.show() - -/* -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP -2.0.8 -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(explain_document_md,fr,public/models) -testData: org.apache.spark.sql.DataFrame = [id: bigint, text: string] -annotation: org.apache.spark.sql.DataFrame = [id: bigint, text: string ... 8 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| token| sentence| lemma| pos| embeddings| ner| entities| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 0|Contrairement à Q...|[[document, 0, 11...|[[token, 0, 12, C...|[[document, 0, 11...|[[token, 0, 12, C...|[[pos, 0, 12, ADV...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 16, 32, ...| -| 1|Emmanuel Jean-Mic...|[[document, 0, 27...|[[token, 0, 7, Em...|[[document, 0, 27...|[[token, 0, 7, Em...|[[pos, 0, 7, PROP...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 35, E...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ - -annotation.select("entities.result").show(false) - -/* -|result | -+----------------------------------------------------------------------------------------------------------------+ -|[Quentin Tarantino] | -|[Emmanuel Jean-Michel Frédéric Macron, Jean-Michel Macron, au CHU d'Amiens4, Françoise Noguès, Sécurité sociale]| -+----------------------------------------------------------------------------------------------------------------+ -*/ - -{% endhighlight %} - -
- -### French entity_recognizer_lg - -{% highlight scala %} - -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val pipeline = PretrainedPipeline("entity_recognizer_lg", lang="fr") - -val testData = spark.createDataFrame(Seq( -(1, "Contrairement à Quentin Tarantino, le cinéma français ne repart pas les mains vides de la compétition cannoise."), -(2, "Emmanuel Jean-Michel Frédéric Macron est le fils de Jean-Michel Macron, né en 1950, médecin, professeur de neurologie au CHU d'Amiens4 et responsable d'enseignement à la faculté de médecine de cette même ville5, et de Françoise Noguès, médecin conseil à la Sécurité sociale") -)).toDF("id", "text") - -val annotation = pipeline.transform(testData) - -annotation.show() - -/* -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| token| sentence| embeddings| ner| entities| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 0|Contrairement à Q...|[[document, 0, 11...|[[token, 0, 12, C...|[[document, 0, 11...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 16, 32, ...| -| 1|Emmanuel Jean-Mic...|[[document, 0, 27...|[[token, 0, 7, Em...|[[document, 0, 27...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 35, E...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ - -annotation.select("entities.result").show(false) - -/* -+-------------------------------------------------------------------------------------------------------------+ -|result | -+-------------------------------------------------------------------------------------------------------------+ -|[Quentin Tarantino] | -|[Emmanuel Jean-Michel Frédéric Macron, Jean-Michel Macron, CHU d'Amiens4, Françoise Noguès, Sécurité sociale]| -+-------------------------------------------------------------------------------------------------------------+ -*/ - -{% endhighlight %} - -
- -### French entity_recognizer_md - -{% highlight scala %} - -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val pipeline = PretrainedPipeline("entity_recognizer_md", lang="fr") - -val testData = spark.createDataFrame(Seq( -(1, "Contrairement à Quentin Tarantino, le cinéma français ne repart pas les mains vides de la compétition cannoise."), -(2, "Emmanuel Jean-Michel Frédéric Macron est le fils de Jean-Michel Macron, né en 1950, médecin, professeur de neurologie au CHU d'Amiens4 et responsable d'enseignement à la faculté de médecine de cette même ville5, et de Françoise Noguès, médecin conseil à la Sécurité sociale") -)).toDF("id", "text") - -val annotation = pipeline.transform(testData) - -annotation.show() - -/* -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| token| sentence| embeddings| ner| entities| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 0|Contrairement à Q...|[[document, 0, 11...|[[token, 0, 12, C...|[[document, 0, 11...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 16, 32, ...| -| 1|Emmanuel Jean-Mic...|[[document, 0, 27...|[[token, 0, 7, Em...|[[document, 0, 27...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 35, E...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ - -annotation.select("entities.result").show(false) - -/*+-------------------------------------------------------------------------------------------------------------+ -|result | -+----------------------------------------------------------------------------------------------------------------+ -|[Quentin Tarantino] | -|[Emmanuel Jean-Michel Frédéric Macron, Jean-Michel Macron, au CHU d'Amiens4, Françoise Noguès, Sécurité sociale]| -+----------------------------------------------------------------------------------------------------------------+ -*/ - -{% endhighlight %} - -
- -## Italian - -{:.table-model-big} -| Pipelines | Name | -| ----------------------- | --------------------- | -| [Explain Document Large](#italian-explain_document_lg) | `explain_document_lg` | -| [Explain Document Medium](#italian-explain_document_md) | `explain_document_md` | -| [Entity Recognizer Large](#italian-entity_recognizer_lg) | `entity_recognizer_lg` | -| [Entity Recognizer Medium](#italian-entity_recognizer_md) | `entity_recognizer_md` | - -{:.table-model-big} -|Feature | Description| -|---|----| -|**NER**|Trained by **NerDLApproach** annotator with **Char CNNs - BiLSTM - CRF** and **GloVe Embeddings** on the **WikiNER** corpus and supports the identification of `PER`, `LOC`, `ORG` and `MISC` entities -|**Lemma**|Trained by **Lemmatizer** annotator on **DXC Technology** dataset -|**POS**| Trained by **PerceptronApproach** annotator on the [Universal Dependencies](https://universaldependencies.org/treebanks/it_isdt/index.html) -|**Size**| Model size indicator, **md** and **lg**. The large pipeline uses **glove_840B_300** and the medium uses **glove_6B_300** WordEmbeddings - -
- -### Italian explain_document_lg - -{% highlight scala %} - -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP - -SparkNLP.version() - -val pipeline = PretrainedPipeline("explain_document_lg", lang="it") - -val testData = spark.createDataFrame(Seq( -(1, "La FIFA ha deciso: tre giornate a Zidane, due a Materazzi"), -(2, "Reims, 13 giugno 2019 – Domani può essere la giornata decisiva per il passaggio agli ottavi di finale dei Mondiali femminili.") -)).toDF("id", "text") - -val annotation = pipeline.transform(testData) - -annotation.show() - -/* -import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline -import com.johnsnowlabs.nlp.SparkNLP -2.0.8 -pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(explain_document_lg,it,public/models) -testData: org.apache.spark.sql.DataFrame = [id: int, text: string] -annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 8 more fields] -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| id| text| document| token| sentence| lemma| pos| embeddings| ner| entities| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -| 1|La FIFA ha deciso...|[[document, 0, 56...|[[token, 0, 1, La...|[[document, 0, 56...|[[token, 0, 1, La...|[[pos, 0, 1, DET,...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 3, 6, FI...| -| 2|Reims, 13 giugno ...|[[document, 0, 12...|[[token, 0, 4, Re...|[[document, 0, 12...|[[token, 0, 4, Re...|[[pos, 0, 4, PROP...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 4, Re...| -+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+ -*/ - -annotation.select("entities.result").show(false) - -/* -+-----------------------------------+ -|result | -+-----------------------------------+ -|[FIFA, Zidane, Materazzi] | -|[Reims, Domani, Mondiali femminili]| -+-----------------------------------+ -*/ - -{% endhighlight %} - -

### Italian explain_document_md

{% highlight scala %}

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP

SparkNLP.version()

val pipeline = PretrainedPipeline("explain_document_md", lang="it")

val testData = spark.createDataFrame(Seq(
(1, "La FIFA ha deciso: tre giornate a Zidane, due a Materazzi"),
(2, "Reims, 13 giugno 2019 – Domani può essere la giornata decisiva per il passaggio agli ottavi di finale dei Mondiali femminili.")
)).toDF("id", "text")

val annotation = pipeline.transform(testData)

annotation.show()

/*
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP
2.0.8
pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(explain_document_md,it,public/models)
testData: org.apache.spark.sql.DataFrame = [id: int, text: string]
annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 8 more fields]
+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
| id|                text|            document|               token|            sentence|               lemma|                 pos|          embeddings|                 ner|            entities|
+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
|  1|La FIFA ha deciso...|[[document, 0, 56...|[[token, 0, 1, La...|[[document, 0, 56...|[[token, 0, 1, La...|[[pos, 0, 1, DET,...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 9, La...|
|  2|Reims, 13 giugno ...|[[document, 0, 12...|[[token, 0, 4, Re...|[[document, 0, 12...|[[token, 0, 4, Re...|[[pos, 0, 4, PROP...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 4, Re...|
+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
*/

annotation.select("entities.result").show(false)

/*
+----------------------------+
|result                      |
+----------------------------+
|[La FIFA, Zidane, Materazzi]|
|[Reims, Domani, Mondiali]   |
+----------------------------+
*/

{% endhighlight %}


### Italian entity_recognizer_lg

{% highlight scala %}

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP

SparkNLP.version()

val pipeline = PretrainedPipeline("entity_recognizer_lg", lang="it")

val testData = spark.createDataFrame(Seq(
(1, "La FIFA ha deciso: tre giornate a Zidane, due a Materazzi"),
(2, "Reims, 13 giugno 2019 – Domani può essere la giornata decisiva per il passaggio agli ottavi di finale dei Mondiali femminili.")
)).toDF("id", "text")

val annotation = pipeline.transform(testData)

annotation.show()

/*
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP
2.0.8
pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(entity_recognizer_lg,it,public/models)
testData: org.apache.spark.sql.DataFrame = [id: int, text: string]
annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 6 more fields]
+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
| id|                text|            document|               token|            sentence|          embeddings|                 ner|            entities|
+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
|  1|La FIFA ha deciso...|[[document, 0, 56...|[[token, 0, 1, La...|[[document, 0, 56...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 3, 6, FI...|
|  2|Reims, 13 giugno ...|[[document, 0, 12...|[[token, 0, 4, Re...|[[document, 0, 12...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 4, Re...|
+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
*/

annotation.select("entities.result").show(false)

/*
+-----------------------------------+
|result                             |
+-----------------------------------+
|[FIFA, Zidane, Materazzi]          |
|[Reims, Domani, Mondiali femminili]|
+-----------------------------------+
*/

{% endhighlight %}


### Italian entity_recognizer_md

{% highlight scala %}

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP

SparkNLP.version()

val pipeline = PretrainedPipeline("entity_recognizer_md", lang="it")

val testData = spark.createDataFrame(Seq(
(1, "La FIFA ha deciso: tre giornate a Zidane, due a Materazzi"),
(2, "Reims, 13 giugno 2019 – Domani può essere la giornata decisiva per il passaggio agli ottavi di finale dei Mondiali femminili.")
)).toDF("id", "text")

val annotation = pipeline.transform(testData)

annotation.show()

/*
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP
2.0.8
pipeline: com.johnsnowlabs.nlp.pretrained.PretrainedPipeline = PretrainedPipeline(entity_recognizer_md,it,public/models)
testData: org.apache.spark.sql.DataFrame = [id: int, text: string]
annotation: org.apache.spark.sql.DataFrame = [id: int, text: string ... 6 more fields]
+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
| id|                text|            document|               token|            sentence|          embeddings|                 ner|            entities|
+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
|  1|La FIFA ha deciso...|[[document, 0, 56...|[[token, 0, 1, La...|[[document, 0, 56...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 9, La...|
|  2|Reims, 13 giugno ...|[[document, 0, 12...|[[token, 0, 4, Re...|[[document, 0, 12...|[[word_embeddings...|[[named_entity, 0...|[[chunk, 0, 4, Re...|
+---+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
*/

annotation.select("entities.result").show(false)

/*
+----------------------------+
|result                      |
+----------------------------+
|[La FIFA, Zidane, Materazzi]|
|[Reims, Domani, Mondiali]   |
+----------------------------+
*/

{% endhighlight %}


## Spanish

{:.table-model-big}
| Pipeline | Name | Build | lang | Description | Offline |
|:-------------------------|:-----------------------|:-------|:-------|:----------|:----------|
| Explain Document Small | `explain_document_sm` | 2.4.0 | `es` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_sm_es_2.4.0_2.4_1581977077084.zip) |
| Explain Document Medium | `explain_document_md` | 2.4.0 | `es` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_md_es_2.4.0_2.4_1581976836224.zip) |
| Explain Document Large | `explain_document_lg` | 2.4.0 | `es` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_lg_es_2.4.0_2.4_1581975536033.zip) |
| Entity Recognizer Small | `entity_recognizer_sm` | 2.4.0 | `es` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_sm_es_2.4.0_2.4_1581978479912.zip) |
| Entity Recognizer Medium | `entity_recognizer_md` | 2.4.0 | `es` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_md_es_2.4.0_2.4_1581978260094.zip) |
| Entity Recognizer Large | `entity_recognizer_lg` | 2.4.0 | `es` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_lg_es_2.4.0_2.4_1581977172660.zip) |

{:.table-model-big}
| Feature | Description |
|:----------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Lemma** | Trained by the **Lemmatizer** annotator on **lemmatization-lists** by `Michal Měchura` |
| **POS** | Trained by the **PerceptronApproach** annotator on the [Universal Dependencies](https://universaldependencies.org/treebanks/es_gsd/index.html) treebank |
| **NER** | Trained by the **NerDLApproach** annotator with **Char CNNs - BiLSTM - CRF** and **GloVe Embeddings** on the **WikiNER** corpus; supports the identification of `PER`, `LOC`, `ORG`, and `MISC` entities |
| **Size** | Model size indicator, **sm**, **md**, and **lg**. The small pipelines use **glove_100d**, the medium pipelines use **glove_6B_300**, and the large pipelines use **glove_840B_300** WordEmbeddings |

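These pipelines are used exactly like the French and Italian ones above; only the pipeline name and the `lang` argument change. Below is a minimal sketch (the chosen pipeline and the sample sentence are illustrative, not output from the original documentation); the same pattern applies to the Russian, Dutch, Norwegian, Polish, and Portuguese pipelines listed in the following sections.

{% highlight scala %}

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP

SparkNLP.version()

// Any pipeline name from the table above works here, e.g. the medium explain pipeline
val pipeline = PretrainedPipeline("explain_document_md", lang="es")

val testData = spark.createDataFrame(Seq(
(1, "Gabriel García Márquez nació en Aracataca, Colombia.")
)).toDF("id", "text")

// Adds the document, token, sentence, lemma, pos, embeddings, ner, and entities columns
val annotation = pipeline.transform(testData)

annotation.select("entities.result").show(false)

{% endhighlight %}
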
- -## Russian - -{:.table-model-big} -| Pipeline | Name | Build | lang | Description | Offline | -|:-------------------------|:-----------------------|:-------|:-------|:----------|:----------| -| Explain Document Small | `explain_document_sm` | 2.4.4 | `ru` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_sm_ru_2.4.4_2.4_1584017142719.zip) | -| Explain Document Medium | `explain_document_md` | 2.4.4 | `ru` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_md_ru_2.4.4_2.4_1584016917220.zip) | -| Explain Document Large | `explain_document_lg` | 2.4.4 | `ru` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_lg_ru_2.4.4_2.4_1584015824836.zip) | -| Entity Recognizer Small | `entity_recognizer_sm` | 2.4.4 | `ru` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_sm_ru_2.4.4_2.4_1584018543619.zip) | -| Entity Recognizer Medium | `entity_recognizer_md` | 2.4.4 | `ru` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_md_ru_2.4.4_2.4_1584018332357.zip) | -| Entity Recognizer Large | `entity_recognizer_lg` | 2.4.4 | `ru` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_lg_ru_2.4.4_2.4_1584017227871.zip) | - -{:.table-model-big} -| Feature | Description | -|:----------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| **Lemma** | Trained by **Lemmatizer** annotator on the [Universal Dependencies](https://universaldependencies.org/treebanks/ru_gsd/index.html)| -| **POS** | Trained by **PerceptronApproach** annotator on the [Universal Dependencies](https://universaldependencies.org/treebanks/ru_gsd/index.html) | -| **NER** | Trained by **NerDLApproach** annotator with **Char CNNs - BiLSTM - CRF** and **GloVe Embeddings** on the **WikiNER** corpus and supports the identification of `PER`, `LOC`, `ORG` and `MISC` entities | - -
- -## Dutch - -{:.table-model-big} -| Pipeline | Name | Build | lang | Description | Offline | -|:-------------------------|:-----------------------|:-------|:-------|:----------|:----------| -| Explain Document Small | `explain_document_sm` | 2.5.0 | `nl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_sm_nl_2.5.0_2.4_1588546621618.zip) | -| Explain Document Medium | `explain_document_md` | 2.5.0 | `nl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_md_nl_2.5.0_2.4_1588546605329.zip) | -| Explain Document Large | `explain_document_lg` | 2.5.0 | `nl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_lg_nl_2.5.0_2.4_1588612556770.zip) | -| Entity Recognizer Small | `entity_recognizer_sm` | 2.5.0 | `nl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_sm_nl_2.5.0_2.4_1588546655907.zip) | -| Entity Recognizer Medium | `entity_recognizer_md` | 2.5.0 | `nl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_md_nl_2.5.0_2.4_1588546645304.zip) | -| Entity Recognizer Large | `entity_recognizer_lg` | 2.5.0 | `nl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_lg_nl_2.5.0_2.4_1588612569958.zip) | - -
- -## Norwegian - -{:.table-model-big} -| Pipeline | Name | Build | lang | Description | Offline | -|:-------------------------|:-----------------------|:-------|:-------|:----------|:----------| -| Explain Document Small | `explain_document_sm` | 2.5.0 | `no` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_sm_no_2.5.0_2.4_1588784132955.zip) | -| Explain Document Medium | `explain_document_md` | 2.5.0 | `no` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_md_no_2.5.0_2.4_1588783879809.zip) | -| Explain Document Large | `explain_document_lg` | 2.5.0 | `no` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_lg_no_2.5.0_2.4_1588782610672.zip) | -| Entity Recognizer Small | `entity_recognizer_sm` | 2.5.0 | `no` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_sm_no_2.5.0_2.4_1588794567766.zip) | -| Entity Recognizer Medium | `entity_recognizer_md` | 2.5.0 | `no` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_md_no_2.5.0_2.4_1588794357614.zip) | -| Entity Recognizer Large | `entity_recognizer_lg` | 2.5.0 | `no` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_lg_no_2.5.0_2.4_1588793261642.zip) | - -
- -## Polish - -{:.table-model-big} -| Pipeline | Name | Build | lang | Description | Offline | -|:-------------------------|:-----------------------|:-------|:-------|:----------|:----------| -| Explain Document Small | `explain_document_sm` | 2.5.0 | `pl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_sm_pl_2.5.0_2.4_1588531081173.zip) | -| Explain Document Medium | `explain_document_md` | 2.5.0 | `pl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_md_pl_2.5.0_2.4_1588530841737.zip) | -| Explain Document Large | `explain_document_lg` | 2.5.0 | `pl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_lg_pl_2.5.0_2.4_1588529695577.zip) | -| Entity Recognizer Small | `entity_recognizer_sm` | 2.5.0 | `pl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_sm_pl_2.5.0_2.4_1588532616080.zip) | -| Entity Recognizer Medium | `entity_recognizer_md` | 2.5.0 | `pl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_md_pl_2.5.0_2.4_1588532376753.zip) | -| Entity Recognizer Large | `entity_recognizer_lg` | 2.5.0 | `pl` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_lg_pl_2.5.0_2.4_1588531171903.zip) | - -
- -## Portuguese - -{:.table-model-big} -| Pipeline | Name | Build | lang | Description | Offline | -|:-------------------------|:-----------------------|:-------|:-------|:----------|:----------| -| Explain Document Small | `explain_document_sm` | 2.5.0 | `pt` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_sm_pt_2.5.0_2.4_1588501423743.zip) | -| Explain Document Medium | `explain_document_md` | 2.5.0 | `pt` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_md_pt_2.5.0_2.4_1588501189804.zip) | -| Explain Document Large | `explain_document_lg` | 2.5.0 | `pt` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/explain_document_lg_pt_2.5.0_2.4_1588500056427.zip) | -| Entity Recognizer Small | `entity_recognizer_sm` | 2.5.0 | `pt` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_sm_pt_2.5.0_2.4_1588502815900.zip) | -| Entity Recognizer Medium | `entity_recognizer_md` | 2.5.0 | `pt` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_md_pt_2.5.0_2.4_1588502606198.zip) | -| Entity Recognizer Large | `entity_recognizer_lg` | 2.5.0 | `pt` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/entity_recognizer_lg_pt_2.5.0_2.4_1588501526324.zip) | - -

## Multi-language

{:.table-model-big}
| Pipeline | Name | Build | lang | Description | Offline |
|:-------------------------|:-----------------------|:-------|:-------|:----------|:----------|
| LanguageDetectorDL | `detect_language_7` | 2.5.2 | `xx` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/detect_language_7_xx_2.5.0_2.4_1591875676774.zip) |
| LanguageDetectorDL | `detect_language_20` | 2.5.2 | `xx` | | [Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/detect_language_20_xx_2.5.0_2.4_1591875683182.zip) |

* The model with 7 languages: Czech, German, English, Spanish, French, Italian, and Slovak
* The model with 20 languages: Bulgarian, Czech, German, Greek, English, Spanish, Finnish, French, Croatian, Hungarian, Italian, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Swedish, Turkish, and Ukrainian

Loading and running these detectors follows the same pattern as the other pretrained pipelines, as shown in the sketch below.

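A minimal sketch (the sample sentences are illustrative, and we assume the detector writes its prediction to a `language` output column, as in the published LanguageDetectorDL examples):

{% highlight scala %}

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP

SparkNLP.version()

// Load the 20-language detector from the table above; language detectors use lang="xx"
val pipeline = PretrainedPipeline("detect_language_20", lang="xx")

val testData = spark.createDataFrame(Seq(
(1, "Spark NLP is an open-source text processing library."),
(2, "Spark NLP est une bibliothèque de traitement de texte open source.")
)).toDF("id", "text")

val annotation = pipeline.transform(testData)

// The detected language is emitted in the pipeline's "language" output column
annotation.select("language.result").show(false)

{% endhighlight %}
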

## How to use

### Online

To use Spark NLP pretrained pipelines, you can call `PretrainedPipeline` with the pipeline's name and its language (the default is `en`):

{% highlight python %}

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline('explain_document_dl', lang='en')

{% endhighlight %}

The same in Scala:

{% highlight scala %}

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = PretrainedPipeline("explain_document_dl", lang="en")

{% endhighlight %}

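Once loaded, a `PretrainedPipeline` can also annotate a plain string directly, without building a DataFrame first, via its `annotate` method. A minimal sketch in Scala (the sample sentence is illustrative):

{% highlight scala %}

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = PretrainedPipeline("explain_document_dl", lang="en")

// annotate() runs the whole pipeline over the string and returns a
// Map from each output column name to its results
val result = pipeline.annotate("Harry Potter is a great movie.")

// e.g. the recognized named-entity chunks
println(result("entities"))

{% endhighlight %}
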

### Offline

If you have any trouble using online pipelines or models in your environment (maybe it's air-gapped), you can directly download them for `offline` use.

After downloading offline models/pipelines and extracting them, here is how you can use them inside your code (the path could be a shared storage like HDFS in a cluster; the sample sentence in the snippet is illustrative):

{% highlight scala %}

import org.apache.spark.ml.PipelineModel

val advancedPipeline = PipelineModel.load("/tmp/explain_document_dl_en_2.0.2_2.4_1556530585689/")

// To use the loaded Pipeline for prediction on a DataFrame with a "text" column
val predictionDF = spark.createDataFrame(Seq((1, "Harry Potter is a great movie."))).toDF("id", "text")
advancedPipeline.transform(predictionDF)

{% endhighlight %}

#### Please check out our Models Hub for the full list of [pre-trained models](https://sparknlp.org/models) with examples, demos, benchmarks, and more