[SPARK-14596][SQL] Remove not used SqlNewHadoopRDD and some more unused imports #12354

Closed
wants to merge 5 commits into from

HyukjinKwon
Member

What changes were proposed in this pull request?

The old HadoopFsRelation API included buildInternalScan(), which used SqlNewHadoopRDD in ParquetRelation.
Now that the old API has been removed, SqlNewHadoopRDD is no longer used.

So, this PR removes SqlNewHadoopRDD and several unused imports.

This was discussed in #12326.

How was this patch tested?

Ran several related existing unit tests and sbt scalastyle.

@SparkQA

SparkQA commented Apr 13, 2016

Test build #55707 has finished for PR 12354 at commit 8b8e961.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

cc @cloud-fan

@cloud-fan
Contributor

There is also a SqlNewHadoopRDDState; we should rename it to FileScanRDDState.

@HyukjinKwon
Member Author

Thanks! I just renamed it.

@SparkQA

SparkQA commented Apr 13, 2016

Test build #55708 has finished for PR 12354 at commit a291332.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 13, 2016

Test build #55709 has finished for PR 12354 at commit 3f8e878.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -220,8 +220,8 @@ class HadoopRDD[K, V](

   // Sets the thread local variable for the file's name
   split.inputSplit.value match {
-    case fs: FileSplit => SqlNewHadoopRDDState.setInputFileName(fs.getPath.toString)
-    case _ => SqlNewHadoopRDDState.unsetInputFileName()
+    case fs: FileSplit => FileScanRDDState.setInputFileName(fs.getPath.toString)
Contributor

@liancheng @yhuai Do you still use HadoopRDD to read data source relations? If not, I don't think we need to update the file name here anymore.

Contributor

I think it is used by HiveTableScan.

Member Author

Contributor

I see. How about renaming it to InputFileNameHolder?

Member Author

Sure. Thanks.

@@ -20,8 +20,7 @@ package org.apache.spark.rdd
 import org.apache.spark.unsafe.types.UTF8String

 /**
- * State for SqlNewHadoopRDD objects. This is split this way because of the package splits.
- * TODO: Move/Combine this with org.apache.spark.sql.datasources.SqlNewHadoopRDD
+ * State for FileScanRDD objects. This is split this way because of the package splits.
  */
-private[spark] object SqlNewHadoopRDDState {
Contributor

SqlNewHadoopRDDState is definitely not a good name here. How about InputFileNameHolder?

Member Author

Yes. Thanks!
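
For readers following the rename, here is a minimal sketch of what a thread-local file-name holder of this kind could look like. It is not the code merged in this PR: the real object appears to store a UTF8String (judging by the import in the hunk above), while this version uses a plain String so it stays self-contained. The names FileBackedSplit, OtherSplit, and InputFileNameDemo are hypothetical stand-ins for Hadoop's split types and for the call site in HadoopRDD.

```scala
// Simplified, hypothetical stand-in for the holder discussed in this thread.
// Assumption: a plain String is stored instead of Spark's UTF8String.
object InputFileNameHolder {
  // One value per thread; the empty string means "no file is currently being read".
  private val inputFileName: ThreadLocal[String] = new ThreadLocal[String] {
    override protected def initialValue(): String = ""
  }

  def getInputFileName(): String = inputFileName.get()

  def setInputFileName(file: String): Unit = inputFileName.set(file)

  // remove() resets the thread's slot, so the next get() returns the initial value.
  def unsetInputFileName(): Unit = inputFileName.remove()
}

// Hypothetical split types standing in for Hadoop's InputSplit and FileSplit.
sealed trait Split
final case class FileBackedSplit(path: String) extends Split
case object OtherSplit extends Split

object InputFileNameDemo {
  // Same shape as the match in the HadoopRDD hunk:
  // record the path for file-backed splits, clear it for anything else.
  def recordSplit(split: Split): Unit = split match {
    case fs: FileBackedSplit => InputFileNameHolder.setInputFileName(fs.path)
    case _ => InputFileNameHolder.unsetInputFileName()
  }
}
```

Because the value lives in a thread-local, a reader such as HadoopRDD can set it while computing a partition, and code running later on the same thread can look it up without the file name being passed through every call.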

@HyukjinKwon
Member Author

@cloud-fan The commits I just submitted include the changes for MiMa tests and some comments.

@SparkQA

SparkQA commented Apr 14, 2016

Test build #55772 has finished for PR 12354 at commit 6cb3547.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 14, 2016

Test build #55769 has finished for PR 12354 at commit 9c5893d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

retest this please

@SparkQA

SparkQA commented Apr 14, 2016

Test build #55784 has finished for PR 12354 at commit 6cb3547.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

Thanks! Merging to master!

@asfgit asfgit closed this in b481940 Apr 14, 2016
@HyukjinKwon HyukjinKwon deleted the SPARK-14596 branch January 2, 2018 03:40
4 participants