
SPARK-2083 Add support for spark.local.maxFailures configuration property #1465

Closed
wants to merge 2 commits

Conversation

bhavanki

The logic in SparkContext for creating a new task scheduler now looks for a "spark.local.maxFailures" property, which specifies the number of task failures in a local job that will cause the job to fail. Its default is the prior hard-coded value of 1 (no retries).

The patch includes documentation updates and new unit tests.
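For illustration, a user would set the new property through SparkConf like any other configuration key (a sketch assuming the property name proposed in this patch; spark.local.maxFailures is not part of released Spark):

    import org.apache.spark.{SparkConf, SparkContext}

    // With this patch, the job fails only once a local task has failed
    // 3 times; without the property the default stays at 1 (no retries).
    val conf = new SparkConf()
      .setMaster("local")
      .setAppName("local-maxFailures-demo")
      .set("spark.local.maxFailures", "3")
    val sc = new SparkContext(conf)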

@AmplabJenkins

Can one of the admins verify this patch?

val MAX_LOCAL_TASK_FAILURES = 1

master match {
  case "local" =>
    val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
    val localTaskFailures = sc.conf.getInt("spark.local.maxFailures", MAX_LOCAL_TASK_FAILURES)
Contributor

I'd rename the variable to maxTaskFailures.

Author

Will do.

@roji

roji commented Aug 10, 2014

+1, this is important to us for locally testing exception logic before running on a real cluster

@JoshRosen
Contributor

I think there's already a mechanism to set this by using local[N, maxFailures] to create your SparkContext:

// Regular expression for local[N, maxRetries], used in tests with failing tasks
val LOCAL_N_FAILURES_REGEX = """local\[([0-9]+)\s*,\s*([0-9]+)\]""".r

// ...

case LOCAL_N_FAILURES_REGEX(threads, maxFailures) =>
  val scheduler = new TaskSchedulerImpl(sc, maxFailures.toInt, isLocal = true)
  val backend = new LocalBackend(scheduler, threads.toInt)
  scheduler.initialize(backend)
  scheduler
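For example, building the context with a master string like local[2, 4] exercises that path (an illustrative sketch against the Spark 1.x API; the specific values are made up):

    import org.apache.spark.{SparkConf, SparkContext}

    // "local[2, 4]": run with 2 threads and fail the job once any task
    // has failed 4 times, per LOCAL_N_FAILURES_REGEX above.
    val conf = new SparkConf()
      .setMaster("local[2, 4]")
      .setAppName("local-retry-test")
    val sc = new SparkContext(conf)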

@AmplabJenkins

Can one of the admins verify this patch?

@bhavanki
Author

@JoshRosen You are right, the local[N, maxFailures] mechanism already works, but the filer of SPARK-2083 stated that "it is not documented and hard to manage" and suggested a patch like this one.

@SparkQA

SparkQA commented Sep 5, 2014

Can one of the admins verify this patch?

@andrewor14
Contributor

@kbzod Why do we need a separate config for the local case? I think the correct solution is to use the same config, but set a different default value for local mode. Right now this doesn't work because we pass in a hard-coded value of 1, but we can change that to take in spark.task.maxFailures instead (and then default to 1).

Also, can you rebase to master when you have the chance?
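A minimal sketch of that alternative for the local case (hypothetical, not code from this PR; it assumes the same TaskSchedulerImpl/LocalBackend wiring shown above and a local-mode default of 1):

    case "local" =>
      // Reuse the cluster-wide spark.task.maxFailures key, but default
      // to 1 (no retries) when running in local mode.
      val maxTaskFailures = sc.conf.getInt("spark.task.maxFailures", 1)
      val scheduler = new TaskSchedulerImpl(sc, maxTaskFailures, isLocal = true)
      val backend = new LocalBackend(scheduler, 1)
      scheduler.initialize(backend)
      scheduler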

@andrewor14
Contributor

add to whitelist

@pwendell
Contributor

If this is being used for testing, I don't see a compelling reason to add a config over using the constructor.

@pwendell
Contributor

I'm going to close this issue as wontfix.

@asfgit closed this in f73b56f on Nov 10, 2014