[Iceberg]Support view on Rest catalog and Nessie catalog #23793

Merged

Conversation

hantangwangd
Member

@hantangwangd hantangwangd commented Oct 9, 2024

Description

Iceberg's RestCatalog and NessieCatalog implement the ViewCatalog interface, which means the Iceberg connector can support view operations when configured with these catalogs. This PR refactors IcebergNativeMetadata so that catalogs that implement ViewCatalog, such as REST and NESSIE, now support view operations.
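The capability check described above can be illustrated with a minimal sketch. Every type here (`SketchCatalog`, `SketchViewCatalog`, `TableOnlyCatalog`, `ViewCapableCatalog`, `ViewSupportSketch`) is a hypothetical stand-in written for this example, not the Iceberg or Presto API; the real `Catalog` and `ViewCatalog` interfaces have many more methods, and the actual refactoring in this PR may be structured differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: view operations are allowed only when the configured
// catalog also implements the view-capable interface.
final class ViewSupportSketch {
    private ViewSupportSketch() {}

    static void createView(SketchCatalog catalog, String viewName, String sql) {
        if (!(catalog instanceof SketchViewCatalog)) {
            throw new UnsupportedOperationException(
                    "Catalog " + catalog.name() + " does not support views");
        }
        ((SketchViewCatalog) catalog).createView(viewName, sql);
    }

    public static void main(String[] args) {
        ViewCapableCatalog catalog = new ViewCapableCatalog();
        createView(catalog, "v1", "SELECT 1");
        System.out.println(catalog.loadViewSql("v1").orElse("<missing>"));
    }
}

// Stand-in for a catalog interface (assumption, not the Iceberg Catalog).
interface SketchCatalog {
    String name();
}

// Stand-in for a view-capable catalog interface (assumption, not the Iceberg ViewCatalog).
interface SketchViewCatalog {
    void createView(String viewName, String sql);

    Optional<String> loadViewSql(String viewName);
}

// A catalog that does not implement the view interface.
final class TableOnlyCatalog implements SketchCatalog {
    @Override
    public String name() {
        return "table-only";
    }
}

// A catalog that implements both interfaces and keeps views in memory.
final class ViewCapableCatalog implements SketchCatalog, SketchViewCatalog {
    private final Map<String, String> views = new HashMap<>();

    @Override
    public String name() {
        return "view-capable";
    }

    @Override
    public void createView(String viewName, String sql) {
        views.put(viewName, sql);
    }

    @Override
    public Optional<String> loadViewSql(String viewName) {
        return Optional.ofNullable(views.get(viewName));
    }
}
```

With this shape, supporting views on another catalog type only requires that the catalog implement the view-capable interface; the connector-side check does not change.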

Motivation and Context

Support views in the Iceberg connector on as many catalog types as possible.

Impact

The Iceberg connector, when configured with a REST or NESSIE catalog, now supports view operations.

Test Plan

  • Enable existing test cases for views on REST and NESSIE

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

Iceberg Connector Changes
* Add support of ``view`` for Iceberg connector when configured with ``REST`` and ``NESSIE``. :pr:`23793`

@tdcmeehan tdcmeehan self-assigned this Oct 9, 2024
@hantangwangd hantangwangd force-pushed the support_view_on_rest_nessie branch 2 times, most recently from aa7987a to 6975145 Compare October 10, 2024 03:12
@hantangwangd hantangwangd marked this pull request as ready for review October 10, 2024 03:47
@hantangwangd hantangwangd requested review from ZacBlanco and a team as code owners October 10, 2024 03:47
else {
columns.add(PATH_COLUMN_METADATA);
columns.add(DATA_SEQUENCE_NUMBER_COLUMN_METADATA);
catch (NoSuchTableException e) {
Contributor

Instead of catching this exception, does it make sense to add a new method, isIcebergView, then just load the view directly if it's true?

Member Author

Yes, it seems more natural this way. But I did not find an efficient way to implement the method isIcebergView. ViewCatalog in the Iceberg library provides a method boolean viewExists(TableIdentifier identifier). Based on this method, we could implement isIcebergView as follows:

boolean isIcebergView(......, SchemaTableName table) {
    ......

    if (catalog instanceof ViewCatalog) {
        return ((ViewCatalog) catalog).viewExists(toIcebergTableIdentifier(table));
    }

    return false;
}

However, when we examine ViewCatalog.viewExists(identifier) in the Iceberg library, we find that its implementation is itself based on loadView(identifier):

boolean viewExists(TableIdentifier identifier) {
    try {
        loadView(identifier);
        return true;
    } catch (NoSuchViewException e) {
        return false;
    }
}

Given this, the approach above is not very efficient: it performs an additional view load on every lookup, whether we are querying a table or a view.

So catching the exception may not look as natural, but it is the most efficient approach I can think of. What do you think, which one is preferable? I'm open to any other thoughts.
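The trade-off discussed above can be made concrete with a small sketch that counts metadata loads under each strategy. `CountingCatalog`, `NotFound`, and `ResolveSketch` are hypothetical stand-ins invented for this illustration, not Iceberg or Presto classes; each load here stands in for one round trip to the catalog, and the sketch assumes `viewExists` is backed by a full `loadView`, as in the snippet quoted above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch comparing the two lookup strategies from the discussion.
final class ResolveSketch {
    private ResolveSketch() {}

    // Option A: probe with viewExists first, then load the matching object.
    static String resolveWithProbe(CountingCatalog catalog, String name) {
        if (catalog.viewExists(name)) {
            return catalog.loadView(name);
        }
        return catalog.loadTable(name);
    }

    // Option B: load the table and fall back to the view only on failure.
    static String resolveWithFallback(CountingCatalog catalog, String name) {
        try {
            return catalog.loadTable(name);
        } catch (CountingCatalog.NotFound e) {
            return catalog.loadView(name);
        }
    }

    public static void main(String[] args) {
        CountingCatalog probe = new CountingCatalog().table("t");
        resolveWithProbe(probe, "t");
        CountingCatalog fallback = new CountingCatalog().table("t");
        resolveWithFallback(fallback, "t");
        System.out.println("probe loads: " + probe.loads + ", fallback loads: " + fallback.loads);
    }
}

// In-memory catalog that counts loads (assumption, not the Iceberg API).
final class CountingCatalog {
    static final class NotFound extends RuntimeException {
        NotFound(String message) {
            super(message);
        }
    }

    private final Set<String> tables = new HashSet<>();
    private final Set<String> views = new HashSet<>();
    int loads; // incremented on every loadTable or loadView call

    CountingCatalog table(String name) {
        tables.add(name);
        return this;
    }

    CountingCatalog view(String name) {
        views.add(name);
        return this;
    }

    String loadTable(String name) {
        loads++;
        if (!tables.contains(name)) {
            throw new NotFound("no table " + name);
        }
        return "table:" + name;
    }

    String loadView(String name) {
        loads++;
        if (!views.contains(name)) {
            throw new NotFound("no view " + name);
        }
        return "view:" + name;
    }

    // Same shape as the quoted viewExists: a full load hidden behind a boolean.
    boolean viewExists(String name) {
        try {
            loadView(name);
            return true;
        } catch (NotFound e) {
            return false;
        }
    }
}
```

In this sketch the probe-first strategy costs two loads for every lookup, while the fallback strategy costs one load for a table and two for a view, which is why the exception-based path is cheaper for the common table case.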

Contributor

Agreed, thanks for the explanation.

Contributor

Can you add some of this information as a comment so folks who read this code later will understand this?

Member Author

Sure, the comment has been added. Please take a look when available, thanks.

@hantangwangd hantangwangd force-pushed the support_view_on_rest_nessie branch from 6975145 to c1c100d Compare November 3, 2024 09:39
@steveburnett
Contributor

Thanks for the release note entry! Nit rephrase suggestion to follow the Order of changes in the Release Notes Guideline.

== RELEASE NOTES ==

Iceberg Connector Changes
* Add support of `view` for Iceberg connector when configured with `REST` and `NESSIE`. :pr:`23793`

Contributor

@tdcmeehan tdcmeehan left a comment

Couple nits, otherwise LGTM

@hantangwangd hantangwangd force-pushed the support_view_on_rest_nessie branch 2 times, most recently from 3b01308 to addb7c9 Compare November 4, 2024 17:25
@hantangwangd
Member Author

@steveburnett Thanks for the suggestion, fixed! Please take a look when available.

@steveburnett
Contributor

> @steveburnett Thanks for the suggestion, fixed! Please take a look when available.

LGTM, thanks!

Contributor

@tdcmeehan tdcmeehan left a comment

Some other very small nits, LGTM

Contributor

@kiersten-stokes kiersten-stokes left a comment

This looks great! I know this is a sought-after feature in the community. My only suggestion would be to add a blurb to the Iceberg connector documentation, since right now it only indicates view availability in Hive/Glue

@hantangwangd
Member Author

@kiersten-stokes Thanks for your suggestion. I have updated the Iceberg connector documentation, please take a look when available.

@hantangwangd
Member Author

> Some other very small nits, LGTM

The comments have been addressed, please take a look when available, thanks! @tdcmeehan

steveburnett
steveburnett previously approved these changes Nov 5, 2024
Contributor

@steveburnett steveburnett left a comment

LGTM! (docs)

Pulled the branch and reviewed a local doc build. Thanks!

kiersten-stokes
kiersten-stokes previously approved these changes Nov 5, 2024
Contributor

@kiersten-stokes kiersten-stokes left a comment

LGTM! Thank you

@steveburnett
Contributor

Noticed a few nits in formatting for the release note entry - tiny, but fixing it now saves time when the next release note PR is done.

== RELEASE NOTES ==

Iceberg Connector Changes
* Add support of ``view`` for Iceberg connector when configured with ``REST`` and ``NESSIE``. :pr:`23793`

@hantangwangd
Member Author

> Noticed a few nits in formatting for the release note entry - tiny, but fixing it now saves time when the next release note PR is done.
>
> == RELEASE NOTES ==
>
> Iceberg Connector Changes
> * Add support of ``view`` for Iceberg connector when configured with ``REST`` and ``NESSIE``. :pr:`23793`

Fixed, thanks for your suggestion. @steveburnett

Contributor

@ZacBlanco ZacBlanco left a comment

two minor things

@hantangwangd hantangwangd force-pushed the support_view_on_rest_nessie branch from 092573e to 9557bf9 Compare November 5, 2024 18:20
@hantangwangd hantangwangd merged commit 91f140c into prestodb:master Nov 6, 2024
58 checks passed
@hantangwangd hantangwangd deleted the support_view_on_rest_nessie branch November 6, 2024 23:28
@hantangwangd
Member Author

Hi Zac @ZacBlanco, after rechecking the code, I can confirm that an Iceberg Table or View instance can be a large object in memory, because it has a dedicated TableOperations instance that holds the current TableMetadata parsed from the table metadata file, which can be fairly large. However, ConnectorMetadata is an object whose lifespan is scoped to a transaction, which means IcebergAbstractMetadata instances are not global. So it seems we don't need to implement a cache with a maximum entry count for tables and views in IcebergAbstractMetadata: only the ones used in the current transaction are retained, and that number is unlikely to be large. What do you think, is that reasonable?
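The reasoning above can be illustrated with a minimal sketch of a transaction-scoped memo. `TransactionMetadataSketch`, its `loader` function, and the `loaderCalls` counter are hypothetical names for this example and are not the actual IcebergAbstractMetadata code; the sketch is also single-threaded, whereas a real implementation would need to consider concurrent access.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: one instance is created per transaction, so the map
// holds only the tables touched by that transaction and is discarded with it.
final class TransactionMetadataSketch {
    private final Map<String, String> loadedTables = new HashMap<>();
    private final Function<String, String> loader;
    int loaderCalls; // counts how often the underlying loader actually runs

    TransactionMetadataSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    // Loads a table at most once per instance; no size limit or eviction is
    // applied because the instance does not outlive its transaction.
    String getTable(String name) {
        return loadedTables.computeIfAbsent(name, key -> {
            loaderCalls++;
            return loader.apply(key);
        });
    }

    public static void main(String[] args) {
        TransactionMetadataSketch tx =
                new TransactionMetadataSketch(name -> "metadata-for-" + name);
        tx.getTable("orders");
        tx.getTable("orders");
        System.out.println("loader calls: " + tx.loaderCalls);
    }
}
```

Repeated lookups of the same table within one instance hit the loader once, and a new instance starts with an empty map, so memory use is bounded by the number of distinct tables and views a single transaction touches.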

@ZacBlanco
Contributor

I think it is reasonable to not have a cache then
