Skip to content

Commit

Permalink
Address comments
Browse files Browse the repository at this point in the history
  • Loading branch information
HyukjinKwon committed Nov 7, 2019
1 parent 97fa953 commit 9e2d832
Show file tree
Hide file tree
Showing 3 changed files with 29 additions and 2 deletions.
2 changes: 1 addition & 1 deletion docs/job-scheduling.md
Original file line number Diff line number Diff line change
Expand Up @@ -303,5 +303,5 @@ However, currently it cannot inherit the local properties from the parent thread
each thread with its own local properties. To work around this, you should manually copy and set the
local properties from the parent thread to the child thread when you create another thread in PVM.

Note that `PYSPARK_PIN_THREAD` is currently experiemtnal and not recommended for use in production.
Note that `PYSPARK_PIN_THREAD` is currently experimental and not recommended for use in production.

27 changes: 27 additions & 0 deletions python/pyspark/context.py
Original file line number Diff line number Diff line change
Expand Up @@ -1009,6 +1009,15 @@ def setJobGroup(self, groupId, description, interruptOnCancel=False):
in Thread.interrupt() being called on the job's executor threads. This is useful to help
ensure that the tasks are actually stopped in a timely manner, but is off by default due
to HDFS-1208, where HDFS may respond to Thread.interrupt() by marking nodes as dead.
.. note:: Currently, setting a group ID (set to local properties) with a thread does
not properly work. Internally threads on PVM and JVM are not synced, and JVM thread
can be reused for multiple threads on PVM, which fails to isolate local properties
for each thread on PVM. To work around this, you can set `PYSPARK_PIN_THREAD` to
`'true'` (see SPARK-22340). However, note that it cannot inherit the local properties
from the parent thread although it isolates each thread on PVM and JVM with its own
local properties. To work around this, you should manually copy and set the local
properties from the parent thread to the child thread when you create another thread.
"""
warnings.warn(
"Currently, setting a group ID (set to local properties) with a thread does "
Expand All @@ -1031,6 +1040,15 @@ def setLocalProperty(self, key, value):
"""
Set a local property that affects jobs submitted from this thread, such as the
Spark fair scheduler pool.
.. note:: Currently, setting a local property with a thread does
not properly work. Internally threads on PVM and JVM are not synced, and JVM thread
can be reused for multiple threads on PVM, which fails to isolate local properties
for each thread on PVM. To work around this, you can set `PYSPARK_PIN_THREAD` to
`'true'` (see SPARK-22340). However, note that it cannot inherit the local properties
from the parent thread although it isolates each thread on PVM and JVM with its own
local properties. To work around this, you should manually copy and set the local
properties from the parent thread to the child thread when you create another thread.
"""
warnings.warn(
"Currently, setting a local property with a thread does not properly work. "
Expand Down Expand Up @@ -1058,6 +1076,15 @@ def getLocalProperty(self, key):
def setJobDescription(self, value):
"""
Set a human readable description of the current job.
.. note:: Currently, setting a job description (set to local properties) with a thread does
not properly work. Internally threads on PVM and JVM are not synced, and JVM thread
can be reused for multiple threads on PVM, which fails to isolate local properties
for each thread on PVM. To work around this, you can set `PYSPARK_PIN_THREAD` to
`'true'` (see SPARK-22340). However, note that it cannot inherit the local properties
from the parent thread although it isolates each thread on PVM and JVM with its own
local properties. To work around this, you should manually copy and set the local
properties from the parent thread to the child thread when you create another thread.
"""
warnings.warn(
"Currently, setting a job description (set to local properties) with a thread does "
Expand Down
2 changes: 1 addition & 1 deletion python/pyspark/java_gateway.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@
if sys.version >= '3':
xrange = range

from py4j.java_gateway import java_import, JavaObject, JavaGateway, GatewayParameters
from py4j.java_gateway import java_import, JavaGateway, JavaObject, GatewayParameters
from py4j.clientserver import ClientServer, JavaParameters, PythonParameters
from pyspark.find_spark_home import _find_spark_home
from pyspark.serializers import read_int, write_with_length, UTF8Deserializer
Expand Down

0 comments on commit 9e2d832

Please sign in to comment.