-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve call performance for sync_partition_metadata
utility
#18384
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this.
The Benchmark number looks promising, this would be a good improvement for the sync-partition procedure.
I have taken the first pass on PR and added my comments.
These attached Benchmark results are from glue or HMS? Should we do it for both?
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
60d9881
to
946716a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fgwang7w Thank you for the optimisation. I did the first pass and I have a question regarding one change added in SyncPartitionMetadataProcedure.
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
946716a
to
b2177f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one more suggestion, please let me know what are your views on the same.
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
b2177f9
to
de12106
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good to me. Quick question, Are these attached Benchmark results from glue? Could we update the document with the same? Please update the PR release notes too about the partition batch.
sure, will update the release notes. many thanks for help to review. |
@tdcmeehan @prestodb/committers Hi Presto committers, this PR is reviewed internally already, can we have a second round review? |
7fe6a7c
to
ef1376a
Compare
ef1376a
to
3eec66a
Compare
3eec66a
to
3fecbdc
Compare
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
...-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/glue/GlueHiveMetastore.java
Outdated
Show resolved
Hide resolved
...-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/glue/GlueHiveMetastore.java
Outdated
Show resolved
Hide resolved
d1d4e54
to
12d7d8c
Compare
d1c911f
to
9513fd6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fgwang7w George, great improvement! In addition to the comments in code, there are several other nits:
- Commit title for 1 and 3 are too long
- Commit title first letter shall be upper case
- Can you use full sentences in the PR message?
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
ceb14b8
to
98ec731
Compare
sync_partition_metadata
utility
Thank you @yingsu00 for review. All comments are addressed, could you please give another pass? Thanks! |
98ec731
to
2021afc
Compare
@prestodb/committers ping! |
a0caa9a
to
b9a5dc3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fgwang7w Mostly good, just some nits.
In the PR message and commit messages:
without take into account
-> without taking into account
hms
-> HMS(Hive Meta Store)
Case 2: Sync partition utility lists file status of each partition twice. In case of handling a large of number of partitions for an external table, redundant check file status is a performance bottleneck.
Case 2: Sync partition utility lists file status of each partition twice, causing performance bottleneck when processing an external table with a large of number of partitions.
Commit 1 & 3 titles are too long.
Use upper case for the first letter in each sentence, including all commit titles, commit/PR messages, newly added comments.
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive/src/main/java/com/facebook/presto/hive/SyncPartitionMetadataProcedure.java
Outdated
Show resolved
Hide resolved
presto-hive-common/src/main/java/com/facebook/presto/hive/HdfsContext.java
Outdated
Show resolved
Hide resolved
presto-hive-common/src/main/java/com/facebook/presto/hive/HdfsContext.java
Outdated
Show resolved
Hide resolved
presto-hive-common/src/main/java/com/facebook/presto/hive/HdfsContext.java
Outdated
Show resolved
Hide resolved
...tastore/src/main/java/com/facebook/presto/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
...-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/glue/GlueHiveMetastore.java
Outdated
Show resolved
Hide resolved
Current implementation for getting partition names takes partition values directly without take into account of the actual path. According to the [hive source code](shorturl.at/evT28), this logic may change in the future. This commit updates the partition values with the relative path acquired from the metastore APIs.
Reference to [HMS Repair Utility](shorturl.at/eHIXY), the size for adding partitions can be in a range between 1 and 2,147,483,647. For [Glue metastore](shorturl.at/pqDH9), it’s bounded by 100 for write access.
This PR fixes following issues:
== RELEASE NOTES ==
Detail troubleshoot and solution design can be found in this reference page here