Fix documentation for sparknlp.start() (#14206)
DevinTDHa authored Mar 20, 2024
1 parent 286746d commit 1034cc5
Showing 2 changed files with 23 additions and 13 deletions.
23 changes: 16 additions & 7 deletions docs/en/install.md
@@ -72,18 +72,27 @@ jupyter notebook

 </div><div class="h3-box" markdown="1">

-#### Start Spark NLP Session from python
+#### Start Spark NLP Session from Python

-If you need to manually start SparkSession because you have other configurations and `sparknlp.start()` is not including them, you can manually start the SparkSession:
+Spark session for Spark NLP can be created (or retrieved) by using `sparknlp.start()`:
+
+```python
+import sparknlp
+spark = sparknlp.start()
+```
+
+If you need to manually start SparkSession because you have other configurations and `sparknlp.start()` is not including them,
+you can manually start the SparkSession with:

 ```python
 spark = SparkSession.builder \
-    .appName("Spark NLP")\
-    .master("local[*]")\
-    .config("spark.driver.memory","16G")\
+    .appName("Spark NLP") \
+    .master("local[*]") \
+    .config("spark.driver.memory", "16G") \
+    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
+    .config("spark.kryoserializer.buffer.max", "2000M") \
     .config("spark.driver.maxResultSize", "0") \
-    .config("spark.kryoserializer.buffer.max", "2000M")\
-    .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.1")\
+    .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.1") \
     .getOrCreate()
 ```
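
For readers who want to try the manual configuration from this hunk outside the docs, the following is a minimal sketch and not part of the commit. It collects the same settings in a dictionary and applies them in a loop. The names `SPARK_NLP_CONF` and `build_session` are hypothetical, introduced only for illustration. The snippet also adds the `from pyspark.sql import SparkSession` import that the documented example leaves implicit, and it assumes `pyspark` is installed when the helper is called.

```python
# Settings copied from the manual SparkSession example in install.md.
# SPARK_NLP_CONF and build_session are hypothetical names for this sketch.
SPARK_NLP_CONF = {
    "spark.driver.memory": "16G",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryoserializer.buffer.max": "2000M",
    "spark.driver.maxResultSize": "0",
    "spark.jars.packages": "com.johnsnowlabs.nlp:spark-nlp_2.12:5.3.1",
}


def build_session(conf=None, app_name="Spark NLP", master="local[*]"):
    """Create (or retrieve) a SparkSession configured with the given settings."""
    # Imported lazily so the settings can be inspected without pyspark installed.
    from pyspark.sql import SparkSession

    settings = SPARK_NLP_CONF if conf is None else conf
    builder = SparkSession.builder.appName(app_name).master(master)
    for key, value in settings.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling `spark = build_session()` would then play the role of the chained builder call above. Passing a modified copy of the dictionary is one way to layer extra options on top without retyping the whole chain.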

13 changes: 7 additions & 6 deletions python/docs/getting_started/index.rst
@@ -130,11 +130,12 @@ you can manually start the SparkSession with:
 .. code-block:: python
     :substitutions:
-    spark = SparkSession.builder \
-        .appName("Spark NLP")\
-        .master("local[4]")\
-        .config("spark.driver.memory","16G")\
+    SparkSession.builder \
+        .appName("Spark NLP") \
+        .master("local[*]") \
+        .config("spark.driver.memory", "16G") \
+        .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
+        .config("spark.kryoserializer.buffer.max", "2000M") \
         .config("spark.driver.maxResultSize", "0") \
-        .config("spark.kryoserializer.buffer.max", "2000M")\
-        .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:|release|")\
+        .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:|release|") \
         .getOrCreate()
