[SPARK-20926][SQL] Removing exposures to guava library caused by directly accessing SessionCatalog's tableRelationCache #18148
…ly accessing SessionCatalog's tableRelationCache There were test failures because DataStorageStrategy, HiveMetastoreCatalog and also HiveSchemaInferenceSuite were exposed to the shaded Guava library. This change removes those exposures by introducing new methods in SessionCatalog.
ok to test
@marmbrus as far as I can tell, SessionCatalog is a public API, right? If so we shouldn't ship 2.2 without fixing this, since it's exposing Guava.
 * A cache of qualified table names to table relation plans.
 * Accessing tableRelationCache directly is not recommended,
 * since it will introduce exposures to guava libraries.
 */
val tableRelationCache: Cache[QualifiedTableName, LogicalPlan] = {
This seems to have been introduced in 2.2, can you make it private instead, and change all call sites?
Test build #77659 has finished for PR 18148 at commit
|
val tableRelationCache: Cache[QualifiedTableName, LogicalPlan] = {
  val cacheSize = conf.tableRelationCacheSize
  CacheBuilder.newBuilder().maximumSize(cacheSize).build[QualifiedTableName, LogicalPlan]()
}

/**
 * This method provides a way to get a cached plan
The "without exposing components to Guava" part of all these comments is unnecessary.
@vanzin the whole catalyst project is private.
Ah, yes. I read 'filter' instead of 'filterNot' in SparkBuild.scala where it sets
Test build #77664 has finished for PR 18148 at commit
|
Test build #77666 has finished for PR 18148 at commit
|
A couple of suggestions for the names but otherwise LGTM.
}

/** This method provides a way to get a cached plan if the key exists. */
def getCachedTableIfPresent(key: QualifiedTableName): LogicalPlan = {
getCachedTable instead. I'd even just use cachedTable, but this class at least seems to use verbs more than the rest of the code.
}

/** This method provides a way to cache a plan. */
def putTableInCache(t: QualifiedTableName, l: LogicalPlan): Unit = {
cacheTable
Test build #77696 has finished for PR 18148 at commit
|
Not clear why the tests failed.
Fun, but probably unrelated:
|
retest this please
Test build #77704 has finished for PR 18148 at commit
|
Merging to master.
…ctly accessing SessionCatalog's tableRelationCache
There could be test failures because DataStorageStrategy, HiveMetastoreCatalog and also HiveSchemaInferenceSuite were exposed to the Guava library by directly accessing SessionCatalog's tableRelationCache. These failures occur when Guava shading is in place.
## What changes were proposed in this pull request?
This change removes those Guava exposures by introducing new methods in SessionCatalog and also changing DataStorageStrategy, HiveMetastoreCatalog and HiveSchemaInferenceSuite so that they use those proxy methods.
## How was this patch tested?
Unit tests passed after applying these changes.
Author: Reza Safi <rezasafi@cloudera.com>
Closes #18148 from rezasafi/branch-2.2.
@rezasafi please close this.
Oh crap, I didn't notice the branch. @rezasafi in the future, always send PRs against the master branch first.
Sorry about this @vanzin. I didn't know that.
@vanzin Seems merging to branch-2.2 was an accident? Since it is not really a bug fix, should we revert it from branch-2.2 and just keep it in the master?
It was an accident but it shouldn't cause any harm either.
There could be test failures because DataStorageStrategy, HiveMetastoreCatalog and also HiveSchemaInferenceSuite were exposed to the Guava library by directly accessing SessionCatalog's tableRelationCache. These failures occur when Guava shading is in place.
What changes were proposed in this pull request?
This change removes those guava exposures by introducing new methods in SessionCatalog and also changing DataStorageStrategy, HiveMetastoreCatalog and HiveSchemaInferenceSuite so that they use those proxy methods.
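For readers skimming the thread, here is a minimal sketch of the proxy-method idea, not the code that was merged. It assumes the names suggested in review (getCachedTable, cacheTable). QualifiedTableName and LogicalPlan are hypothetical stand-ins for the Spark types, and SessionCatalogSketch is a made-up class name. To stay dependency-free, a size-bounded LinkedHashMap stands in for the Guava cache. Returning Option instead of a possibly-null LogicalPlan is a choice of this sketch, not something the PR states.

```scala
import java.util.{Collections, LinkedHashMap => JLinkedHashMap, Map => JMap}

// Hypothetical stand-ins for Spark's QualifiedTableName and LogicalPlan.
final case class QualifiedTableName(database: String, name: String)
final case class LogicalPlan(description: String)

// Hypothetical class; only the shape of the API mirrors the discussion above.
class SessionCatalogSketch(cacheSize: Int) {

  // Private: the concrete cache type never appears in a public signature,
  // so callers compile against the proxy methods only.
  // Assumption: an access-ordered, size-bounded map replaces the Guava cache.
  private val tableRelationCache: JMap[QualifiedTableName, LogicalPlan] =
    Collections.synchronizedMap(
      new JLinkedHashMap[QualifiedTableName, LogicalPlan](16, 0.75f, true) {
        // Evict the least recently used entry once the bound is exceeded.
        override def removeEldestEntry(
            eldest: JMap.Entry[QualifiedTableName, LogicalPlan]): Boolean =
          size() > cacheSize
      })

  /** Returns the cached plan for `key`, or None if it is absent. */
  def getCachedTable(key: QualifiedTableName): Option[LogicalPlan] =
    Option(tableRelationCache.get(key))

  /** Caches `plan` under `key`. */
  def cacheTable(key: QualifiedTableName, plan: LogicalPlan): Unit =
    tableRelationCache.put(key, plan)
}
```

A call site would then use catalog.cacheTable(key, plan) and catalog.getCachedTable(key) and never name the cache type, which is the property that matters once Guava is relocated by shading.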
How was this patch tested?
Unit tests passed after applying these changes.