[SPARK-23305][SQL][TEST] Test spark.sql.files.ignoreMissingFiles for all file-based data sources #20479
…t case for ORC
Test build #86940 has finished for PR 20479 at commit
new Path(basePath, "third").toString)

val fs = thirdPath.getFileSystem(spark.sparkContext.hadoopConfiguration)
fs.delete(thirdPath, true)
Shall we assert `true`?
Sure.
LGTM
Thank you for the review and approval!
LGTM
Test build #86955 has finished for PR 20479 at commit
@@ -655,4 +655,35 @@ class OrcQuerySuite extends OrcQueryTest with SharedSQLContext {
}
}
}

testQuietly("Enabling/disabling ignoreMissingFiles") {
Basically, this is copied from the Parquet suite. To avoid duplicating the code, could we create a common base test class for Parquet and ORC?
+1. Which suite is proper for that base test class?
@gatorsmile, can we do the refactoring later, since we still have many things to do for RC3?
Ah, I moved both of them into
Thank you for the review, @HyukjinKwon, @viirya, and @gatorsmile.
@@ -92,4 +96,39 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext {
}
}
}

// Only ORC/Parquet support this.
Seq("orc", "parquet").foreach { format =>
Ah, yeah. This sounds better.
Thanks!
@@ -92,4 +96,39 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext {
}
}
}

// Only ORC/Parquet support this.
Wait... don't other sources support this option?
Ur, let me check back.
For test-only PRs, we can still merge them to 2.3.
Test build #86976 has finished for PR 20479 at commit
@@ -92,4 +96,37 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext {
}
}
}

allFileBasedDataSources.foreach { format =>
Thank you so much, @HyukjinKwon !
Title changed from "spark.sql.files.ignoreMissingFiles test case for ORC" to "spark.sql.files.ignoreMissingFiles test case"
Title changed from "spark.sql.files.ignoreMissingFiles test case" to "spark.sql.files.ignoreMissingFiles for all data sources"
Title changed from "spark.sql.files.ignoreMissingFiles for all data sources" to "spark.sql.files.ignoreMissingFiles for all file-based data sources"
I updated the JIRA issue and PR descriptions to mention all file-based data sources.
Test build #86982 has finished for PR 20479 at commit
Retest this please.
Test build #87019 has finished for PR 20479 at commit
retest this please
Test build #87022 has finished for PR 20479 at commit
Thanks! Merged to master/2.3
…r all file-based data sources

## What changes were proposed in this pull request?

Like Parquet, all file-based data source handles `spark.sql.files.ignoreMissingFiles` correctly. We had better have a test coverage for feature parity and in order to prevent future accidental regression for all data sources.

## How was this patch tested?

Pass Jenkins with a newly added test case.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #20479 from dongjoon-hyun/SPARK-23305.

(cherry picked from commit 522e0b1)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Thank you, @HyukjinKwon and @gatorsmile.
What changes were proposed in this pull request?
Like Parquet, all file-based data sources handle `spark.sql.files.ignoreMissingFiles` correctly. We should have test coverage for feature parity and to prevent accidental regressions in any data source.
How was this patch tested?
Pass Jenkins with a newly added test case.
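For readers who want to try the behavior outside the test suite, here is a rough standalone sketch of the scenario the review discusses: write three directories, define a DataFrame over all of them, delete the third, then read with `spark.sql.files.ignoreMissingFiles` on and off. This is not the test added in this PR. The object name `IgnoreMissingFilesSketch`, the helper `readAfterDelete`, the temp-directory prefix, and the list of formats are assumptions made for illustration; the suite's own `allFileBasedDataSources` list is not shown in this thread. It assumes Spark SQL is on the classpath.

```scala
import java.nio.file.Files

import scala.util.Try

import org.apache.hadoop.fs.Path
import org.apache.spark.sql.SparkSession

// Hypothetical standalone sketch; names here are illustrative, not from the PR.
object IgnoreMissingFilesSketch {

  /** Writes three directories in `format`, defines a DataFrame over all of them,
    * deletes the third directory, and only then collects the rows (sorted). */
  def readAfterDelete(spark: SparkSession, format: String, ignoreMissing: Boolean): List[String] = {
    import spark.implicits._
    spark.conf.set("spark.sql.files.ignoreMissingFiles", ignoreMissing.toString)

    val basePath = Files.createTempDirectory("ignore-missing").toFile.getCanonicalPath
    val paths = Seq("first", "second", "third").map(name => new Path(basePath, name))
    paths.zipWithIndex.foreach { case (path, i) =>
      Seq(i.toString).toDF("a").write.format(format).save(path.toString)
    }

    // The files are listed when the DataFrame is defined, so the deletion
    // below leaves the scan pointing at files that no longer exist.
    val df = spark.read.format(format).load(paths.map(_.toString): _*)

    val thirdPath = paths.last
    val fs = thirdPath.getFileSystem(spark.sparkContext.hadoopConfiguration)
    // As suggested in the review, check that the delete actually succeeded.
    assert(fs.delete(thirdPath, true), s"failed to delete $thirdPath")

    df.collect().map(_.getString(0)).sorted.toList
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("sketch").getOrCreate()
    try {
      // Assumed list of file-based sources, chosen for illustration.
      Seq("parquet", "orc", "json", "csv", "text").foreach { format =>
        val ignored = readAfterDelete(spark, format, ignoreMissing = true)
        val strict = Try(readAfterDelete(spark, format, ignoreMissing = false))
        println(s"$format: ignoreMissingFiles=true -> $ignored; false fails = ${strict.isFailure}")
      }
    } finally {
      spark.stop()
    }
  }
}
```

With the option enabled, the read should return only the rows from the first two directories; with it disabled, the read is expected to fail because the listed files are gone. The sketch wraps the strict case in `Try` rather than matching a specific exception type or message, since those may differ between Spark versions.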