This app runs a fully managed Spark cluster of arbitrary size. It also launches a Jupyter instance that is configured to work with the Spark cluster.
To access the Spark cluster in your notebooks, use the SPARK_MASTER
environment variable, which is passed to the session builder as the master URL:
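    import os
    from pyspark.sql import SparkSession

    # Connect to the cluster whose master URL is stored in SPARK_MASTER.
    session = SparkSession.builder.master(os.environ['SPARK_MASTER']).getOrCreate()
    # The SparkContext is the entry point for the lower-level RDD API.
    sc = session.sparkContext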
You are good to go!
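
To confirm that the notebook can actually reach the cluster, a small smoke test may help. The following is a minimal sketch, not part of the app itself: the helper names `resolve_master` and `smoke_test`, the app name "smoke-test", and the `local[*]` fallback used when SPARK_MASTER is not set are all assumptions made for illustration.

```python
import os


def resolve_master(default="local[*]"):
    """Return the master URL from SPARK_MASTER.

    The local[*] fallback is an assumption for running outside the app;
    inside the app, SPARK_MASTER is expected to be set.
    """
    return os.environ.get("SPARK_MASTER", default)


def smoke_test(master):
    """Run a trivial job on the given master and return its result."""
    # Imported here so that resolve_master() works without pyspark installed.
    from pyspark.sql import SparkSession

    session = (
        SparkSession.builder
        .master(master)
        .appName("smoke-test")  # hypothetical app name
        .getOrCreate()
    )
    # Distribute the numbers 0..99 and sum them on the cluster.
    return session.sparkContext.parallelize(range(100)).sum()
```

Calling `smoke_test(resolve_master())` in a notebook cell should return 4950 if the job is scheduled and completes.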