[SPARK-20576][SQL] Support generic hint function in Dataset/DataFrame #17839
Test build #76410 has started for PR 17839 at commit
LGTM pending Jenkins
Actually somebody should add the Python / R wrapper. cc @felixcheung and @zero323
LGTM
I can add both, once it is merged.
Test build #3683 has finished for PR 17839 at commit
just a thought -
Merging in master/branch-2.2.
@felixcheung do you worry about conflicts?
## What changes were proposed in this pull request?

We allow users to specify hints (currently only "broadcast" is supported) in SQL and DataFrame. However, while SQL has a standard hint format (/*+ ... */), DataFrame doesn't have one and sometimes users are confused that they can't find how to apply a broadcast hint. This ticket adds a generic hint function on DataFrame that allows using the same hint on DataFrames as well as SQL.

As an example, after this patch, the following will apply a broadcast hint on a DataFrame using the new hint function:

```
df1.join(df2.hint("broadcast"))
```

## How was this patch tested?

Added a test case in DataFrameJoinSuite.

Author: Reynold Xin <rxin@databricks.com>

Closes #17839 from rxin/SPARK-20576.

(cherry picked from commit 527fc5d)
Signed-off-by: Reynold Xin <rxin@databricks.com>
BTW I filed follow-up tickets for Python/R at https://issues.apache.org/jira/browse/SPARK-20576
What changes were proposed in this pull request?
We allow users to specify hints (currently only "broadcast" is supported) in SQL and DataFrame. However, while SQL has a standard hint format (/*+ ... */), DataFrame doesn't have one and sometimes users are confused that they can't find how to apply a broadcast hint. This ticket adds a generic hint function on DataFrame that allows using the same hint on DataFrames as well as SQL.
As an example, after this patch, `df1.join(df2.hint("broadcast"))` applies a broadcast hint on a DataFrame using the new hint function.
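The call can be sketched end to end as follows, next to the equivalent SQL hint. This is a minimal sketch, not code from the patch: the object name, the `orders`/`customers` tables, and their columns are hypothetical, and it assumes a Spark build that includes `Dataset.hint` running in local mode. Automatic broadcasting is turned off via `spark.sql.autoBroadcastJoinThreshold` so that any broadcast in the plan would come from the hint rather than from table size.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical example object; none of these names appear in the PR.
object BroadcastHintSketch {

  // Builds the same join twice: once with the DataFrame hint function,
  // once with the SQL /*+ ... */ hint format.
  def buildJoins(spark: SparkSession): (DataFrame, DataFrame) = {
    import spark.implicits._

    // Made-up sample data, small enough to run locally.
    val orders = Seq((1, "widget"), (2, "gadget")).toDF("customer_id", "item")
    val customers = Seq((1, "Ada"), (2, "Grace")).toDF("customer_id", "name")

    // DataFrame form: mark the right-hand side of the join for broadcast.
    val viaDataFrame = orders.join(customers.hint("broadcast"), "customer_id")

    // SQL form: the same hint, naming the table to broadcast.
    orders.createOrReplaceTempView("orders")
    customers.createOrReplaceTempView("customers")
    val viaSql = spark.sql(
      """SELECT /*+ BROADCAST(customers) */ orders.item, customers.name
        |FROM orders JOIN customers
        |ON orders.customer_id = customers.customer_id""".stripMargin)

    (viaDataFrame, viaSql)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("broadcast-hint-sketch")
      // Disable size-based automatic broadcast so the hint is the only trigger.
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .getOrCreate()

    val (viaDataFrame, viaSql) = buildJoins(spark)

    // Print the physical plans so the chosen join strategy can be inspected.
    viaDataFrame.explain()
    viaSql.explain()

    spark.stop()
  }
}
```

Inspecting the two `explain()` outputs side by side is one way to check that the DataFrame hint and the SQL hint lead to the same join strategy.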
How was this patch tested?
Added a test case in DataFrameJoinSuite.