This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Sort aggregation push down #856

Merged

3 changes: 2 additions & 1 deletion docs/category.json
@@ -2,7 +2,8 @@
"bash": [
"experiment/ppl/interfaces/endpoint.rst",
"experiment/ppl/interfaces/protocol.rst",
"experiment/ppl/admin/settings.rst"
"experiment/ppl/admin/settings.rst",
"user/optimization/optimization.rst"
],
"ppl_cli": [
"experiment/ppl/cmd/dedup.rst",
12 changes: 11 additions & 1 deletion docs/experiment/ppl/admin/settings.rst
@@ -129,7 +129,7 @@ Notes: This setting will impact the correctness of the aggregation operation, fo
Example
-------

PPL query::
Change the size_limit to 1000::

sh$ curl -sS -H 'Content-Type: application/json' \
... -X PUT localhost:9200/_cluster/settings \
@@ -146,3 +146,13 @@ PPL query::
"transient": {}
}

Rollback to default value::

sh$ curl -sS -H 'Content-Type: application/json' \
... -X PUT localhost:9200/_cluster/settings \
... -d '{"persistent" : {"opendistro.query.size_limit" : null}}'
{
"acknowledged": true,
"persistent": {},
"transient": {}
}
6 changes: 5 additions & 1 deletion docs/experiment/ppl/index.rst
@@ -62,8 +62,12 @@ The query start with search command and then flowing a set of command delimited

- `PPL Functions <../../user/dql/functions.rst>`_

* **Optimization**

- `Optimization <../../user/optimization/optimization.rst>`_

* **Language Structure**

- `Identifiers <general/identifiers.rst>`_

- `Data Types <general/datatypes.rst>`_
- `Data Types <general/datatypes.rst>`_
4 changes: 4 additions & 0 deletions docs/user/index.rst
@@ -49,6 +49,10 @@ Open Distro for Elasticsearch SQL enables you to extract insights out of Elastic

- `Full-text Search <beyond/fulltext.rst>`_

* **Optimization**

- `Optimization <optimization/optimization.rst>`_

* **Troubleshooting**

- `Troubleshooting <dql/troubleshooting.rst>`_
16 changes: 1 addition & 15 deletions docs/user/limitations/limitations.rst
@@ -51,7 +51,7 @@ Limitations on JOINs
JOIN does not support aggregations on the joined result. The `join` query does not support aggregations on the joined result.
For example, e.g. `SELECT depo.name, avg(empo.age) FROM empo JOIN depo WHERE empo.id == depo.id GROUP BY depo.name` is not supported.

Here's a link to the Github issue - [Issue 110](https://github.com/opendistro-for-elasticsearch/sql/issues/110).
Here's a link to the Github issue - `Issue 110 <https://github.com/opendistro-for-elasticsearch/sql/issues/110>`_.


Limitations on Window Functions
@@ -101,17 +101,3 @@ The response in JDBC format with cursor id::
}

The query with `aggregation` and `join` does not support pagination for now.


Limitations on Query Optimizations
==================================

Multi-fields in WHERE Conditions
--------------------------------

The filter expressions in ``WHERE`` clause may be pushed down to Elasticsearch DSL queries to avoid large amounts of data retrieved. In this case, for Elasticsearch multi-field (a text field with another keyword field inside), assumption is made that the keyword field name is always "keyword" which is true by default.

Multiple Window Functions
-------------------------

At the moment there is no optimization to merge similar sort operators to avoid unnecessary sort. In this case, only one sort operator associated with window function will be pushed down to Elasticsearch DSL queries. Others will sort the intermediate results in memory and return to its window operator in the upstream. This cost can be avoided by optimization aforementioned though in-memory sorting operation can still happen. Therefore a custom circuit breaker is in use to monitor sort operator and protect memory usage.