SPARK-2083 Add support for spark.local.maxFailures configuration property #1465
Conversation
Can one of the admins verify this patch?
```scala
val MAX_LOCAL_TASK_FAILURES = 1

master match {
  case "local" =>
    val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
    val localTaskFailures = sc.conf.getInt("spark.local.maxFailures", MAX_LOCAL_TASK_FAILURES)
```
I'd rename the variable to maxTaskFailures.
Will do.
+1, this is important to us for locally testing exception logic before running on a real cluster.
I think there's already a mechanism to set this by using the `local[N, maxRetries]` master URL:

```scala
// Regular expression for local[N, maxRetries], used in tests with failing tasks
val LOCAL_N_FAILURES_REGEX = """local\[([0-9]+)\s*,\s*([0-9]+)\]""".r
// ...
case LOCAL_N_FAILURES_REGEX(threads, maxFailures) =>
  val scheduler = new TaskSchedulerImpl(sc, maxFailures.toInt, isLocal = true)
  val backend = new LocalBackend(scheduler, threads.toInt)
  scheduler.initialize(backend)
  scheduler
```
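The master-URL parsing shown above can be exercised on its own. A minimal sketch, outside Spark, of the same regex matching (`LocalMasterParser` and `parse` are hypothetical names introduced here for illustration):

```scala
object LocalMasterParser {
  // Same pattern Spark uses to recognize "local[N, maxRetries]" master URLs
  val LOCAL_N_FAILURES_REGEX = """local\[([0-9]+)\s*,\s*([0-9]+)\]""".r

  // Returns (threads, maxFailures) when the master string matches, None otherwise.
  // Regex patterns in a Scala match require the whole string to match.
  def parse(master: String): Option[(Int, Int)] = master match {
    case LOCAL_N_FAILURES_REGEX(threads, maxFailures) =>
      Some((threads.toInt, maxFailures.toInt))
    case _ => None
  }
}

println(LocalMasterParser.parse("local[4, 3]")) // Some((4,3))
println(LocalMasterParser.parse("local"))       // None
```

So a test that wants retries in local mode can already pass a master like `local[4, 3]` to get 4 threads and up to 3 task failures.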
Can one of the admins verify this patch?
@JoshRosen You are right, the
Can one of the admins verify this patch?
@kbzod Why do we need a separate config for the local case? I think the correct solution is to use the same config, but set a different default value for local mode. Right now this doesn't work because we pass in a hard-coded value of 1, but we can change that to take in

Also, can you rebase to master when you have the chance?
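The suggestion above (one shared key, mode-dependent default) could be sketched as follows; the `Map` stands in for `SparkConf`, and `spark.task.maxFailures` is the existing cluster-side key (whose Spark default is 4), though the exact wiring here is an assumption:

```scala
// Reuse one config key with a default that depends on whether we run locally.
def maxTaskFailures(conf: Map[String, String], isLocal: Boolean): Int = {
  val default = if (isLocal) 1 else 4 // local mode keeps "fail fast" behaviour
  conf.get("spark.task.maxFailures").map(_.toInt).getOrElse(default)
}

println(maxTaskFailures(Map.empty, isLocal = true))  // 1
println(maxTaskFailures(Map.empty, isLocal = false)) // 4
println(maxTaskFailures(Map("spark.task.maxFailures" -> "3"), isLocal = true)) // 3
```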
add to whitelist
If this is being used for testing, I don't see a compelling reason to add a config over using the constructor.
I'm going to close this issue as wontfix. |
The logic in `SparkContext` for creating a new task scheduler now looks for a "spark.local.maxFailures" property to specify the number of task failures in a local job that will cause the job to fail. Its default is the prior fixed value of 1 (no retries). The patch includes documentation updates and new unit tests.