-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make HashJoinExec::join_schema public #12807
Conversation
It is needed by physical optimizers that want to replace the HashJoin with a different type of join, as they need to replace it with an equivalent projection, but HashJoinExec::projection could not be used to build it because it refers to indices in HashJoinExec::join_schema.
@@ -306,7 +306,7 @@ pub struct HashJoinExec { | |||
pub join_type: JoinType, | |||
/// The schema after join. Please be careful when using this schema, | |||
/// if there is a projection, the schema isn't the same as the output schema. | |||
join_schema: SchemaRef, | |||
pub join_schema: SchemaRef, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to make a new "getter" method and make that public, so we can better control invariance / future changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree a method would be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @progval
Are these changes tested?
no (should they?)
I don't think we need a test for this particular change given how small it is/
What would be awesome would be some example showing how to implement a join reorder pass (the actual algorithm in the example could be trivial to swap the orders or something) but showing the structure would be super helpful
@@ -306,7 +306,7 @@ pub struct HashJoinExec { | |||
pub join_type: JoinType, | |||
/// The schema after join. Please be careful when using this schema, | |||
/// if there is a projection, the schema isn't the same as the output schema. | |||
join_schema: SchemaRef, | |||
pub join_schema: SchemaRef, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree a method would be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion this is better than what is on main so we can merge it, but it would be better to make a function / example.
Thanks @progval and @Dandandan
Which issue does this PR close?
Closes #12806.
Rationale for this change
It is needed by physical optimizers that want to replace the HashJoin with a different type of join, as they need to replace it with an equivalent projection, but HashJoinExec::projection could not be used to build it because it refers to indices in HashJoinExec::join_schema.
What changes are included in this PR?
Makes HashJoinExec::join_schema public
Are these changes tested?
no (should they?)
Are there any user-facing changes?
new field in the API, already documented