[SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQL Reference #27288
Test build #117094 has finished for PR 27288 at commit
retest this please
Test build #117101 has finished for PR 27288 at commit
Test build #117137 has finished for PR 27288 at commit
@@ -18,5 +18,121 @@ license: |
See the License for the specific language governing permissions and
limitations under the License.
---
The <code>ORDER BY</code> clause is used to return the result rows in a sorted manner
in the user specified order. Unlike the <code>SORT BY</code> clause, this clause guarantees
link to SORT BY?
@huaxingao Will do it in the finalization PR.
OK if you want to take care of that at the end
<dl>
<dt><code><em>ORDER BY</em></code></dt>
<dd>
Specifies a comma separated list of expressions along with optional parameters <code>sort_direction</code>
"comma separated" -> "comma-separated"?
I caught this because I saw Sean's comment in another PR :)
Actually, in many places we have "comma separated", for example: "An optional parameter that specifies a comma separated list of key and value pairs for partitions."
Not sure if I should open a MINOR PR to fix all of them.
Sure, will do.
Test build #117156 has finished for PR 27288 at commit
Optionally specifies whether to sort the rows in ascending (lowest to highest) or descending
(highest to lowest) order. The valid values for sort direction are <code>ASC</code> for ascending
and <code>DESC</code> for descending. If sort direction is not explicitly specified then by default
rows are sorted in ascending manner. <br><br>
just "sorted ascending"
sort direction. In Spark, NULL values are considered to be lower than any non-NULL values by default.
Therefore the ordering of NULL values depend on the sort direction.<br><br>
<ol>
<li>If the sort order is ASC, NULLS are returned first; to force NULLS to be last, use NULLS LAST</li>
I might turn this around, to emphasize the null sort order: "If NULLS FIRST (the default), then NULLs are returned first if sort order is ASC, and last if sort order is DESC" and likewise for the second item.
@srowen Hmm, I'm trying to think whether we convey this properly with the above rewording. So:
- If NULLS FIRST is explicitly specified, you would ALWAYS see NULLs at the top of your result set, no matter what the sort order is.
- If NULLS LAST is explicitly specified, you would ALWAYS see NULLs at the end, no matter what the sort order is.
What do you think?
Oh, my mistake, is that how it works? so NULLS FIRST puts nulls first no matter what the sort order? Maybe we can take the focus off of the sort order then. For example NULLS LAST does the same thing no matter what the sort order.
Maybe like: "NULLS FIRST forces NULLs to sort before all non-NULL values, regardless of the sort order" and likewise for NULLS LAST, and then explain that, if not specified, NULLs sort first if ASC and last if DESC.
@srowen Got it. Will make the change.
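For reference, the NULL ordering rules settled in this thread can be sketched in plain Python. This is only an illustration and not Spark code: `order_by`, `descending`, and `nulls_first` are hypothetical names, `None` stands in for NULL, and the sample values are made up.

```python
from typing import List, Optional, Sequence


def order_by(values: Sequence[Optional[int]],
             descending: bool = False,
             nulls_first: Optional[bool] = None) -> List[Optional[int]]:
    """Emulate ORDER BY on a single column; None plays the role of NULL."""
    # Default from the thread: NULLs sort first for ASC and last for DESC.
    if nulls_first is None:
        nulls_first = not descending
    nulls = [v for v in values if v is None]
    non_nulls = sorted((v for v in values if v is not None), reverse=descending)
    # An explicit NULLS FIRST / NULLS LAST applies regardless of sort direction.
    return nulls + non_nulls if nulls_first else non_nulls + nulls


ages = [28, None, 18, 50, None]  # hypothetical sample column
print(order_by(ages))                                     # ASC, default NULL placement
print(order_by(ages, descending=True))                    # DESC, default NULL placement
print(order_by(ages, nulls_first=False))                  # ASC NULLS LAST
print(order_by(ages, descending=True, nulls_first=True))  # DESC NULLS FIRST
```

The point of the sketch is that the explicit option is independent of the sort direction, while the default placement is derived from it.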
Test build #117263 has finished for PR 27288 at commit
@@ -18,5 +18,125 @@ license: |
See the License for the specific language governing permissions and
limitations under the License.
---
The <code>ORDER BY</code> clause is used to return the result rows in a sorted manner
in the user specified order. Unlike the <code>SORT BY</code> clause, this clause guarantees
total order in the output.
"total order" -> "a total order"?
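To make the "total order" wording concrete, here is a rough Python sketch of the difference, assuming SORT BY only sorts rows within each partition while ORDER BY sorts across all of them. The function names and the two sample partitions are hypothetical and only model the idea; this is not how Spark implements either clause.

```python
from typing import List

# Hypothetical rows, already split into two partitions.
partitions: List[List[int]] = [[50, 18, 35], [20, 80, 16]]


def sort_within_partitions(parts: List[List[int]]) -> List[int]:
    """Model of SORT BY: sort each partition on its own, then concatenate."""
    return [v for part in parts for v in sorted(part)]


def sort_globally(parts: List[List[int]]) -> List[int]:
    """Model of ORDER BY: sort all rows together into one total order."""
    return sorted(v for part in parts for v in part)


print(sort_within_partitions(partitions))  # ordered only inside each partition
print(sort_globally(partitions))           # ordered across the whole output
```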
</dd>
<dt><code><em>sort_direction</em></code></dt>
<dd>
Optionally specifies whether to sort the rows in ascending (lowest to highest) or descending
Do we need the statement "(lowest to highest)"? Is "ascending or descending" enough?
<dt><code><em>sort_direction</em></code></dt>
<dd>
Optionally specifies whether to sort the rows in ascending (lowest to highest) or descending
(highest to lowest) order. The valid values for sort direction are <code>ASC</code> for ascending
"sort direction" -> "the sort direction"?
Test build #117280 has finished for PR 27288 at commit
Test build #117292 has finished for PR 27288 at commit
Test build #117297 has finished for PR 27288 at commit
@maropu are you OK with this? looks OK to me. If so we can move on to ORDER BY.
Yeah, it looks fine. Thanks, @dilipbiswal !
Thanks! Merged to master.
@dilipbiswal OK, let's move on to SORT BY.
Thanks a lot @srowen @huaxingao @maropu. @maropu Sure, let me double-check on SORT BY.
Thanks a lot!
What changes were proposed in this pull request?
Document ORDER BY clause of SELECT statement in SQL Reference Guide.
Why are the changes needed?
Currently, Spark lacks documentation on the supported SQL constructs, causing
confusion among users who sometimes have to look at the code to understand the
usage. This PR aims to address this issue.
Does this PR introduce any user-facing change?
Yes.
Before:
There was no documentation for this.
After:
[Screenshot: Screen Shot 2020-01-19 at 11 50 57 PM]
[Screenshot: Screen Shot 2020-01-19 at 11 51 14 PM]
[Screenshot: Screen Shot 2020-01-19 at 11 51 33 PM]
How was this patch tested?
Tested using jekyll build --serve