
SPARK-6338 [CORE] Use standard temp dir mechanisms in tests to avoid orphaned temp files #5029

Closed
wants to merge 4 commits into from


srowen
Member

@srowen srowen commented Mar 14, 2015

Use Utils.createTempDir() to replace other temp file mechanisms used in some tests, to further ensure the files are cleaned up and to simplify the test code.

@SparkQA

SparkQA commented Mar 14, 2015

Test build #28614 has started for PR 5029 at commit 1a12efa.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 14, 2015

Test build #28614 has finished for PR 5029 at commit 1a12efa.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28614/
Test FAILed.

@SparkQA

SparkQA commented Mar 15, 2015

Test build #28629 has started for PR 5029 at commit 5713d45.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 15, 2015

Test build #28629 has finished for PR 5029 at commit 5713d45.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28629/
Test PASSed.

…in some tests, to further ensure they are cleaned up, and simplify
@SparkQA

SparkQA commented Mar 16, 2015

Test build #28647 has started for PR 5029 at commit 57609e4.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 16, 2015

Test build #28647 has finished for PR 5029 at commit 57609e4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28647/
Test PASSed.

@sryza
Contributor

sryza commented Mar 16, 2015

What's the advantage of a parent directory created with createTempDir when we're already using File.createTempFile?

@srowen
Member Author

srowen commented Mar 16, 2015

createTempDir registers the directory for deletion at shutdown.
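
To make the point above concrete, here is a minimal sketch of a helper that creates a temp directory and registers it for recursive deletion when the JVM shuts down. It is not Spark's `Utils` code: the object name `TempDirSketch`, the default prefix, and the use of `sys.addShutdownHook` are assumptions chosen for illustration.

```scala
import java.io.File
import java.nio.file.Files

// Hypothetical helper, NOT Spark's Utils: it only illustrates the idea.
object TempDirSketch {
  // Recursively delete a file or directory.
  def deleteRecursively(f: File): Unit = {
    if (f.isDirectory) {
      // listFiles can return null on an I/O error, so wrap it in Option.
      Option(f.listFiles).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    }
    f.delete()
  }

  // Create a temp directory and register it for deletion at JVM shutdown.
  def createTempDir(prefix: String = "spark-test"): File = {
    val dir = Files.createTempDirectory(prefix).toFile
    sys.addShutdownHook(deleteRecursively(dir))
    dir
  }
}
```

A temp file created with `File.createTempFile(prefix, suffix, dir)` inside such a directory is then removed along with its parent at shutdown, without needing a separate `deleteOnExit()` or explicit `delete()` call. A hard crash that skips shutdown hooks would still leave the files behind.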

@@ -370,7 +369,7 @@ class UtilsSuite extends FunSuite with ResetSystemProperties {
assert(sparkConf.getBoolean("spark.test.fileNameLoadA", false) === true)
assert(sparkConf.getInt("spark.test.fileNameLoadB", 1) === 2)
} finally {
outFile.delete()
Utils.deleteRecursively(tmpDir)
Contributor

Is this necessary now that we're using createTempDir?

Contributor

This is redundant now.

@sryza
Contributor

sryza commented Mar 16, 2015

It seems like in many places we're explicitly deleting the files in addition to using createTempDir. Is that necessary? Otherwise, LGTM.

@@ -136,6 +135,7 @@ class InsertIntoHiveTableSuite extends QueryTest with BeforeAndAfter {
assert(listFolders(tmpDir,List()).sortBy(_.toString()) == expected.sortBy(_.toString))
sql("DROP TABLE table_with_partition")
sql("DROP TABLE tmp_table")
Utils.deleteRecursively(tmpDir)
Contributor

Redundant?

Member Author

I'm torn on this: removing the explicit delete leaves temp files lying around while the tests run, and they are never deleted if there's a hard crash. But the extra line of code is annoying everywhere and not done consistently. Hm. Maybe the test superclass could manage a temp dir and delete it between tests. If anyone's interested in that, I'll go there. Otherwise it's maybe best to leave it as-is and not add recursive deletes where there were none before.

(A few of these changes are to make the temp file go in a temp dir instead of the working dir, but most are indeed just about standardization and better guaranteeing cleanup.)

Contributor

I was going to suggest a "TempDirectory" trait or something to manage a temp dir per test. If scalatest runs a single JVM for all tests, then it might be worth it, but it would be a very noisy change.
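
For reference, one way such a trait could look is sketched below. Only the name `TempDirectory` comes from the suggestion above; the `withTempDir` method, the directory prefix, and the `ExampleSuite` object are hypothetical. To stay free of library dependencies, the sketch uses a loan-pattern method rather than ScalaTest's before/after hooks, so it illustrates the idea and is not a drop-in proposal.

```scala
import java.io.File
import java.nio.file.Files

// Hypothetical trait: manages one temp directory per test body.
trait TempDirectory {
  private def deleteRecursively(f: File): Unit = {
    if (f.isDirectory) {
      // listFiles can return null on an I/O error, so wrap it in Option.
      Option(f.listFiles).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    }
    f.delete()
  }

  // Loan pattern: create a fresh directory, pass it to the test body,
  // and delete it afterwards even if the body throws.
  def withTempDir[T](body: File => T): T = {
    val dir = Files.createTempDirectory("test").toFile
    try body(dir) finally deleteRecursively(dir)
  }
}

// Example "suite" mixing in the trait (plain object, no test framework).
object ExampleSuite extends TempDirectory {
  def run(): File = withTempDir { dir =>
    val out = new File(dir, "out.txt")
    Files.write(out.toPath, "hello".getBytes("UTF-8"))
    assert(out.exists())
    dir // returned only so a caller can check that it is gone afterwards
  }
}
```

With this shape, cleanup happens right after each test body rather than at JVM shutdown, so files do not accumulate during a long run. The cost is the one noted above: every test that needs a directory has to be rewritten to use it.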

@srowen
Member Author

srowen commented Mar 16, 2015

True, there is not much consistency in whether the directory is also deleted explicitly after the test. I think I was mostly keeping that as it was in each case.

@@ -293,6 +292,8 @@ class ReceiverSuite extends TestSuiteBase with Timeouts with Serializable {
assert(sortedAllLogFiles1.takeRight(1).forall(leftLogFiles1.contains))
assert(sortedAllLogFiles2.takeRight(3).forall(leftLogFiles2.contains))
}

Utils.deleteRecursively(tempDirectory)
Contributor

Redundant?

@vanzin
Contributor

vanzin commented Mar 16, 2015

LGTM (btw you're my hero).

@srowen
Member Author

srowen commented Mar 17, 2015

OK, I basically left the code as-is with respect to recursive deletion. Yeah, adding a new trait to every test might be too much for the moment. I think this zaps a few minor actual problems (stray temp files) and helps clean up more stuff in more cases, which leaves us with little practical problem.

@SparkQA

SparkQA commented Mar 17, 2015

Test build #28718 has started for PR 5029 at commit 9004081.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 17, 2015

Test build #28718 has finished for PR 5029 at commit 9004081.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28718/
Test PASSed.

@marmbrus
Contributor

SQL changes LGTM. I'll note that at the top of TestHive there is more code that calls getTempFilePath. I'm not actually sure whether that is using the correct code path, but it looks like it's trying to do its own cleanup.

@srowen
Member Author

srowen commented Mar 18, 2015

Yeah, let me see if I can refactor that bit of code in TestHive too.

@SparkQA

SparkQA commented Mar 19, 2015

Test build #28871 has started for PR 5029 at commit 4a212fa.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 19, 2015

Test build #28871 has finished for PR 5029 at commit 4a212fa.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28871/
Test FAILed.

@SparkQA

SparkQA commented Mar 19, 2015

Test build #28884 has started for PR 5029 at commit 27b740a.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 19, 2015

Test build #28884 has finished for PR 5029 at commit 27b740a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28884/
Test PASSed.

6 participants