-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix check_access_control_on_utilized_columns_only for CTE #24647
Conversation
2dc6fb5
to
ae19df1
Compare
I'm realizing i didn't test for the following cases, and not sure they will work properly
|
facc558
to
3879ae5
Compare
I've updated this to handle multiple ctes that reference each other, and add more explicit handling for falling back to checking all columns when multiple ctes have the same name. Also added a warning for queries that fall back to checking all columns so that it's easier to see when it's happening. Since this is a security vulnerability, i'd rather get the fix in quickly than invest in handling that corner case. |
3879ae5
to
f1f6703
Compare
@@ -526,7 +526,10 @@ public void copyFieldIdsToExploreForWithQuery(WithQuery withQuery) | |||
List<RelationId> relationIds = fieldsToExplore.keySet().stream() | |||
.filter(key -> key.getSourceNode() instanceof Table && ((Table) key.getSourceNode()).getName().equals(name)) | |||
.collect(toImmutableList()); | |||
if (relationIds.size() != 1) { | |||
if (relationIds.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what cases would the relationId would be empty? Is that if the CTE doesn't actually get used in the query or are there other cases as well? Can you add tests?
@rschlussel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
2 similar comments
@rschlussel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@rschlussel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
// TODO(kevintang2022): Fix visitWithQuery so that it looks up relation properly. | ||
// Currently, it just looks the relations using the name (string) of the CTE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some todo for followup. The name conflict is caused by just using name for the look up. We might need to lookup with more info or add a unique identifier to each column
72b6a4b
to
bc4566d
Compare
presto-analyzer/src/main/java/com/facebook/presto/sql/analyzer/RelationId.java
Show resolved
Hide resolved
presto-main/src/test/java/com/facebook/presto/sql/analyzer/TestUtilizedColumnsAnalyzer.java
Show resolved
Hide resolved
presto-main/src/test/java/com/facebook/presto/sql/analyzer/TestUtilizedColumnsAnalyzer.java
Show resolved
Hide resolved
presto-main/src/test/java/com/facebook/presto/sql/analyzer/TestUtilizedColumnsAnalyzer.java
Show resolved
Hide resolved
Previously we weren't associating the with query with the columns used from it in the main part of the query, so we wouldn't consider any columns as used. When the usage within the cte or outside of it was just an identifier, we would have the utilized column collected anyway because the column collected in the main query kept the source table information. However, if there was an expression in between, we would lose that information, and we wouldn't check access control for the column that was used in an expression in the cte. Example: For the following query ``` with cte as (select x as c1, z + 1 as c2 from t13) select c1, c2 from (select * from cte) ``` Previously we would only check access permissions on t13.x, but not t13.z. With this change we will check column access on both 13.x and t13.z. Fix empty relations case Add error message to utilized columns warning. handle ctes used multiple times fixup >1 reference support Add tests for multiple cte with same name Co-authored-by: Kevin Tang <tangk@meta.com> Add comments clarification and test for multiple same name for multiple columns
c4cceec
to
30bc51f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, thanks! let's find someone else to review. (i can't approve because it's my PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a Release note
Description
Previously we weren't associating the with query with the columns used from it in the main part of the query, so we wouldn't consider any columns as used. When the usage within the cte or outside of it was just an identifier, we would have the utilized column collected anyway because the column collected in the main query kept the source table information. However, if there was an expression in between, we would lose that information, and we wouldn't check access control for the column that was used in an expression in the cte.
Example:
For the following query
Previously we would only check access permissions on t13.x, but not t13.z. With this change we will check column access on both 13.x and t13.z.
Motivation and Context
Fix a bug
Impact
Fixes a security hole where we were not checking access for some columns used in a query when the reference is within a with clause.
Test Plan
unit test
Release Notes
Please follow release notes guidelines and fill in the release notes below.