From a494dccabfa2bc13441c6e0b1398f8f9b0e892fb Mon Sep 17 00:00:00 2001 From: Igor Lukanin Date: Tue, 9 Apr 2024 23:53:15 +0200 Subject: [PATCH] docs: Extract the page for "matching pre-aggregations", add the page for "querying the data model", document `CUBEJS_DB_QUERY_LIMIT` (#8100) --- .../product/apis-integrations/graphql-api.mdx | 4 +- .../apis-integrations/javascript-sdk.mdx | 5 +- .../javascript-sdk/angular.mdx | 5 +- .../javascript-sdk/react.mdx | 5 +- .../apis-integrations/javascript-sdk/vue.mdx | 5 +- .../rest-api/query-format.mdx | 36 +- docs/pages/product/caching/_meta.js | 1 + .../getting-started-pre-aggregations.mdx | 486 +----------------- .../caching/matching-pre-aggregations.mdx | 118 +++++ .../caching/using-pre-aggregations.mdx | 48 +- docs/pages/product/data-modeling/_meta.js | 3 +- docs/pages/product/data-modeling/concepts.mdx | 122 ++++- docs/pages/product/data-modeling/queries.mdx | 161 ++++++ docs/pages/product/faqs/general.mdx | 7 - docs/pages/reference/configuration/config.mdx | 20 +- .../configuration/environment-variables.mdx | 24 +- docs/pages/reference/graphql-api.mdx | 10 +- 17 files changed, 503 insertions(+), 557 deletions(-) create mode 100644 docs/pages/product/caching/matching-pre-aggregations.mdx create mode 100644 docs/pages/product/data-modeling/queries.mdx diff --git a/docs/pages/product/apis-integrations/graphql-api.mdx b/docs/pages/product/apis-integrations/graphql-api.mdx index a40b70d2841fb..05a64abceb792 100644 --- a/docs/pages/product/apis-integrations/graphql-api.mdx +++ b/docs/pages/product/apis-integrations/graphql-api.mdx @@ -26,7 +26,7 @@ following limitations: - No support for the [WebSockets transport][ref-websockets]. - No support for [subscriptions to changes][ref-subscriptions]. - No support for referencing [segments][ref-segments] in queries. -- No support for [`compareDateRange` queries][ref-compare-date-range]. +- No support for [compare date range queries][ref-compare-date-range]. 
- No support for querying [data model metadata][ref-metadata]. - No ability to apply a [pivot config][ref-pivot-config] on the front-end. @@ -239,7 +239,7 @@ to our query as follows: [ref-ref-graphql-api]: /reference/graphql-api [ref-websockets]: /product/apis-integrations/rest-api/real-time-data-fetch#web-sockets [ref-subscriptions]: /product/apis-integrations/rest-api/real-time-data-fetch#client-subscriptions -[ref-compare-date-range]: /product/apis-integrations/rest-api/query-format#time-dimensions-format +[ref-compare-date-range]: /product/data-modeling/queries#compare-date-range-query [ref-metadata]: /reference/rest-api#v1meta [ref-pivot-config]: /reference/frontend/cubejs-client-core#pivotconfig [ref-segments]: /reference/data-model/segments \ No newline at end of file diff --git a/docs/pages/product/apis-integrations/javascript-sdk.mdx b/docs/pages/product/apis-integrations/javascript-sdk.mdx index d6245b514ef3b..efd9c994e5ada 100644 --- a/docs/pages/product/apis-integrations/javascript-sdk.mdx +++ b/docs/pages/product/apis-integrations/javascript-sdk.mdx @@ -40,7 +40,7 @@ as a [chart](/reference/frontend/cubejs-client-core#chart-pivot) or as a **Simplify work with complex query types.** You can build [Drill Down](/reference/frontend/cubejs-client-core#drill-down) queries and [decompose](/reference/frontend/cubejs-client-core#decompose) the results of -[compareDateRange](/product/apis-integrations/rest-api/query-format#time-dimensions-format) +[compare date range queries][ref-compare-date-range] or [Data Blending](/product/data-modeling/concepts/data-blending) queries. [Learn more](/reference/frontend/cubejs-client-core) in the documentation for @@ -95,3 +95,6 @@ yarn add @cubejs-client/core Now you can build your application from scratch or connect to one of our [supported data visualization tools](/product/configuration/visualization-tools). You can also [explore example applications](/guides/examples) built with Cube. 
+ + +[ref-compare-date-range]: /product/data-modeling/queries#compare-date-range-query \ No newline at end of file diff --git a/docs/pages/product/apis-integrations/javascript-sdk/angular.mdx b/docs/pages/product/apis-integrations/javascript-sdk/angular.mdx index a01430f3b034d..14f36ede68e7d 100644 --- a/docs/pages/product/apis-integrations/javascript-sdk/angular.mdx +++ b/docs/pages/product/apis-integrations/javascript-sdk/angular.mdx @@ -40,7 +40,7 @@ as a [chart](/reference/frontend/cubejs-client-core#chart-pivot) or as a **Simplify work with complex query types.** You can build [Drill Down](/reference/frontend/cubejs-client-core#drill-down) queries and [decompose](/reference/frontend/cubejs-client-core#decompose) the results of -[compareDateRange](/product/apis-integrations/rest-api/query-format#time-dimensions-format) +[compare date range queries][ref-compare-date-range] or [Data Blending](/product/data-modeling/concepts/data-blending) queries. [Learn more](/reference/frontend/cubejs-client-core) in the documentation for @@ -106,3 +106,6 @@ yarn add @cubejs-client/core @cubejs-client/ngx Now you can build your application from scratch or connect to one of our [supported data visualization tools](/product/configuration/visualization-tools). You can also [explore example applications](/guides/examples) built with Cube. 
+ + +[ref-compare-date-range]: /product/data-modeling/queries#compare-date-range-query \ No newline at end of file diff --git a/docs/pages/product/apis-integrations/javascript-sdk/react.mdx b/docs/pages/product/apis-integrations/javascript-sdk/react.mdx index 6f80d458b5fde..d7fe19c37554c 100644 --- a/docs/pages/product/apis-integrations/javascript-sdk/react.mdx +++ b/docs/pages/product/apis-integrations/javascript-sdk/react.mdx @@ -40,7 +40,7 @@ as a [chart](/reference/frontend/cubejs-client-core#chart-pivot) or as a **Simplify work with complex query types.** You can build [Drill Down](/reference/frontend/cubejs-client-core#drill-down) queries and [decompose](/reference/frontend/cubejs-client-core#decompose) the results of -[compareDateRange](/product/apis-integrations/rest-api/query-format#time-dimensions-format) +[compare date range queries][ref-compare-date-range] or [Data Blending](/product/data-modeling/concepts/data-blending) queries. [Learn more](/reference/frontend/cubejs-client-core) in the documentation for @@ -113,3 +113,6 @@ yarn add @cubejs-client/core @cubejs-client/react Now you can build your application from scratch or connect to one of our [supported data visualization tools](/product/configuration/visualization-tools). You can also [explore example applications](/guides/examples) built with Cube. 
+ + +[ref-compare-date-range]: /product/data-modeling/queries#compare-date-range-query \ No newline at end of file diff --git a/docs/pages/product/apis-integrations/javascript-sdk/vue.mdx b/docs/pages/product/apis-integrations/javascript-sdk/vue.mdx index bf049349f0562..eccf595824192 100644 --- a/docs/pages/product/apis-integrations/javascript-sdk/vue.mdx +++ b/docs/pages/product/apis-integrations/javascript-sdk/vue.mdx @@ -40,7 +40,7 @@ as a [chart](/reference/frontend/cubejs-client-core#chart-pivot) or as a **Simplify work with complex query types.** You can build [Drill Down](/reference/frontend/cubejs-client-core#drill-down) queries and [decompose](/reference/frontend/cubejs-client-core#decompose) the results of -[compareDateRange](/product/apis-integrations/rest-api/query-format#time-dimensions-format) +[compare date range queries][ref-compare-date-range] or [Data Blending](/product/data-modeling/concepts/data-blending) queries. [Learn more](/reference/frontend/cubejs-client-core) in the documentation for @@ -111,3 +111,6 @@ yarn add @cubejs-client/core @cubejs-client/vue3 Now you can build your application from scratch or connect to one of our [supported data visualization tools](/product/configuration/visualization-tools). You can also [explore example applications](/guides/examples) built with Cube. + + +[ref-compare-date-range]: /product/data-modeling/queries#compare-date-range-query \ No newline at end of file diff --git a/docs/pages/product/apis-integrations/rest-api/query-format.mdx b/docs/pages/product/apis-integrations/rest-api/query-format.mdx index e7e5543394f4a..ef108c10e25f7 100644 --- a/docs/pages/product/apis-integrations/rest-api/query-format.mdx +++ b/docs/pages/product/apis-integrations/rest-api/query-format.mdx @@ -34,23 +34,18 @@ A Query has the following properties: It is an array of objects in [timeDimension format.](#time-dimensions-format) - `segments`: An array of segments. A segment is a named filter, created in the data model. 
-- `limit`: A row limit for your query. The default value is `10000`. The maximum - allowed limit is `50000`. If you'd like to request more rows than the maximum - allowed limit, consider using [pagination][ref-recipe-pagination]. -- `total`: If set to `true`, Cube will return the total number of rows for a - query. The default value is `false`. +- `limit`: A [row limit][ref-row-limit] for your query. +- `total`: If set to `true`, Cube will run a [total query][ref-total-query] and +return the total number of rows as if no row limit or offset are set in the query. +The default value is `false`. - `offset`: The number of initial rows to be skipped for your query. The default value is `0`. - `order`: An object, where the keys are measures or dimensions to order by and their corresponding values are either `asc` or `desc`. The order of the fields to order on is based on the order of the keys in the object. -- `timezone`: All time based calculations performed within Cube are - timezone-aware. This property is applied to all time dimensions during - aggregation and filtering. It isn't applied to the time dimension referenced - in a `dimensions` query property unless granularity or date filter is - specified. Using this property you can set your desired timezone in - [TZ Database Name](https://en.wikipedia.org/wiki/Tz_database) format, e.g.: - `America/Los_Angeles`. The default value is `UTC`. +- `timezone`: A [time zone][ref-time-zone] for your query. You can set the +desired time zone in the [TZ Database Name](https://en.wikipedia.org/wiki/Tz_database) +format, e.g., `America/Los_Angeles`. - `renewQuery`: If `renewQuery` is set to `true`, Cube will renew all [`refreshKey`][ref-schema-ref-preaggs-refreshkey] for queries and query results in the foreground. However, if the @@ -536,8 +531,8 @@ provides a convenient shortcut to pass a dimension and a filter as a specified it's equivalent to passing two of the same dates as a date range. 
You can also pass a string with a [relative date range][ref-relative-date-range], for example, `last quarter`. -- `compareDateRange`: An array of date ranges to compare a measure change over - previous period +- `compareDateRange`: An array of date ranges to compare measure values. See +[compare date range queries][ref-compare-date-range] for details. - `granularity`: A granularity for a time dimension. It supports the following values `second`, `minute`, `hour`, `day`, `week`, `month`, `quarter`, `year`. If you pass `null` to the granularity, Cube will only perform filtering by a @@ -556,10 +551,10 @@ provides a convenient shortcut to pass a dimension and a filter as a } ``` -You can use compare date range queries when you want to see, for example, how a -metric performed over a period in the past and how it performs now. You can pass -two or more date ranges where each of them is in the same format as a -`dateRange` +You can use [compare date range queries][ref-compare-date-range] when you want +to see, for example, how a metric performed over a period in the past and how it +performs now. You can pass two or more date ranges where each of them is in the +same format as a `dateRange`: ```javascript // ... @@ -608,7 +603,6 @@ refer to its documentation for more examples. -[ref-recipe-pagination]: /guides/recipes/queries/pagination [ref-client-core-resultset-drilldown]: /reference/frontend/cubejs-client-core#result-set-drill-down [ref-schema-ref-preaggs-refreshkey]: @@ -621,3 +615,7 @@ refer to its documentation for more examples. 
/reference/data-model/pre-aggregations#time_dimension [ref-relative-date-range]: #relative-date-range [chrono-website]: https://github.com/wanasit/chrono +[ref-row-limit]: /product/data-modeling/queries#row-limit +[ref-time-zone]: /product/data-modeling/queries#time-zone +[ref-compare-date-range]: /product/data-modeling/queries#compare-date-range-query +[ref-total-query]: /product/data-modeling/queries#total-query \ No newline at end of file diff --git a/docs/pages/product/caching/_meta.js b/docs/pages/product/caching/_meta.js index 92881a802c871..1d8700a1c1376 100644 --- a/docs/pages/product/caching/_meta.js +++ b/docs/pages/product/caching/_meta.js @@ -1,6 +1,7 @@ module.exports = { "getting-started-pre-aggregations": "Getting started with pre-aggregations", "using-pre-aggregations": "Using pre-aggregations", + "matching-pre-aggregations": "Matching pre-aggregations", "lambda-pre-aggregations": "Lambda pre-aggregations", "running-in-production": "Running in production" } \ No newline at end of file diff --git a/docs/pages/product/caching/getting-started-pre-aggregations.mdx b/docs/pages/product/caching/getting-started-pre-aggregations.mdx index 43a98171c9fd5..ebedf8f07dc3b 100644 --- a/docs/pages/product/caching/getting-started-pre-aggregations.mdx +++ b/docs/pages/product/caching/getting-started-pre-aggregations.mdx @@ -218,483 +218,6 @@ data][ref-caching-preaggs-refresh]; if a change in the refresh key is detected, the pre-aggregations are rebuilt. These refreshes are performed in the background as a scheduled process, unless configured otherwise. -## Ensuring pre-aggregations are targeted by queries - -Cube selects the best available pre-aggregation based on the incoming queries it -receives via the API. The process for selection is summarized below: - -1. Are all measures of type `count`, `sum`, `min`, `max` or - `count_distinct_approx`? - -2. 
If yes, then check if - - - The pre-aggregation contains all dimensions, filter dimensions and leaf - measures from the query - - The measures aren't multiplied ([via a `one_to_many` - relationship][ref-schema-joins-rel]) - -3. If no, then check if - - - The query's time dimension granularity is set - - All query filter dimensions are included in query dimensions - - The pre-aggregation defines the **exact** set of dimensions and measures - used in the query - -You can find a complete flowchart [here][self-select-pre-agg]. - -### Additivity - -So far, we've described pre-aggregations as aggregated versions of your existing -data. However, there are some rules that apply when Cube uses the -pre-aggregation. The **additivity** of fields specified in both the query and in -the pre-aggregation determines this. - -So what is additivity? Let's add another cube called `line_items` to the -previous example to demonstrate. Many `line_items` can belong to any order from -the `orders` cube, and are [joined][ref-schema-joins] as such: - - - -```yaml -cubes: - - name: line_items - sql_table: line_items - - joins: - - name: orders - sql: "{CUBE}.order_id = {orders.id}" - relationship: many_to_one - - measures: - - name: count - type: count - - dimensions: - - name: id - sql: id - type: number - primary_key: true - - - name: created_at - sql: created_at - type: time -``` - -```javascript -cube(`line_items`, { - sql_table: `line_items`, - - joins: { - orders: { - sql: `${CUBE}.order_id = ${orders.id}`, - relationship: `many_to_one`, - }, - }, - - measures: { - count: { - type: `count`, - }, - }, - - dimensions: { - id: { - sql: `id`, - type: `number`, - primary_key: true, - }, - - created_at: { - sql: `created_at`, - type: `time`, - }, - }, -}); -``` - - - -Some sample data from the `line_items` table might look like: - -| **id** | **product_id** | **order_id** | **quantity** | **price** | **profit_margin** | **created_at** | -| ------ | -------------- | ------------ | 
------------ | --------- | ----------------- | -------------------------- | -| 1 | 31 | 1 | 1 | 275 | 1 | 2021-01-20 00:00:00.000000 | -| 2 | 49 | 2 | 6 | 248 | 0.1 | 2021-01-20 00:00:00.000000 | -| 3 | 89 | 3 | 6 | 197 | 0.35 | 2021-01-21 00:00:00.000000 | -| 4 | 71 | 4 | 8 | 223 | 0.15 | 2021-01-21 00:00:00.000000 | -| 5 | 64 | 5 | 5 | 75 | 0.75 | 2021-01-22 00:00:00.000000 | -| 6 | 62 | 6 | 8 | 75 | 0.65 | 2021-01-22 00:00:00.000000 | - -Looking at the raw data, we can see that if the data were to be aggregated by -`created_at`, then we could simply add together the `quantity` and `price` -fields and still get a correct result: - -| **created_at** | **quantity** | **price** | -| -------------------------- | ------------ | --------- | -| 2021-01-20 00:00:00.000000 | 7 | 523 | -| 2021-01-21 00:00:00.000000 | 14 | 420 | -| 2021-01-22 00:00:00.000000 | 13 | 150 | - -This means that `quantity` and `price` are both **additive measures**, and we -can represent them in the `line_items` cube as follows: - - - -```yaml -cubes: - - name: line_items - # ... - - measures: - # ... - - - name: quantity - sql: quantity - type: sum - - - name: price - type: sum - sql: price - format: currency - - # ... -``` - -```javascript -cube(`line_items`, { - // ... - - measures: { - // ... - - quantity: { - sql: `quantity`, - type: `sum`, - }, - - price: { - type: `sum`, - sql: `price`, - format: `currency`, - }, - }, - - // ... -}); -``` - - - -Because neither `quantity` and `price` reference any other measures in our -`line_items` cube, we can also say that they are **additive leaf measures**. Any -query requesting only these two measures can be called a **leaf measure -additive** query. Additive leaf measures can only be of the following -[types][ref-schema-types-measure]: `count`, `sum`, `min`, `max` or -`count_distinct_approx`. 
- -[ref-schema-types-measure]: /reference/data-model/types-and-formats#measure-types - -### Non-Additivity - -Using the same sample data for `line_items`, there's a `profit_margin` field -which is different for each row. However, despite the value being numerical, it -doesn't actually make sense to add up this value. Let's look at the rows for -`2021-01-20` in the sample data: - -| **id** | **product_id** | **order_id** | **quantity** | **price** | **profit_margin** | **created_at** | -| ------ | -------------- | ------------ | ------------ | --------- | ----------------- | -------------------------- | -| 1 | 31 | 1 | 1 | 275 | 1 | 2021-01-20 00:00:00.000000 | -| 2 | 49 | 2 | 6 | 248 | 0.1 | 2021-01-20 00:00:00.000000 | - -And now let's try and aggregate them: - -| **created_at** | **quantity** | **price** | **profit_margin** | -| -------------------------- | ------------ | --------- | ----------------- | -| 2021-01-20 00:00:00.000000 | 7 | 523 | 1.1 | - -Using the source data, we'll manually calculate the profit margin and see if it -matches the above. We'll use the following formula: - -$$ -x + (x * y) = z -$$ - -Where `x` is the original cost of the item, `y` is the profit margin and `z` is -the price the item was sold for. Let's use the formula to find the original cost -for both items sold on `2021-01-20`. 
For the row with `id = 1`: - -$$ -x + (x * 1) = 275\\ -2x = 275\\ -x = 275 / 2\\ -x = 137.5 -$$ - -And for the row where `id = 2`: - -$$ -x + (x * 0.1) = 248\\ -1.1x = 248\\ -x = 248 / 1.1\\ -x = 225.454545454545455 -$$ - -Which means the total cost for both items was: - -$$ -225.454545454545455 + 137.5\\ -362.954545454545455 -$$ - -Now that we have the cost of each item, let's use the same formula in reverse to -see if applying a profit margin of `1.1` will give us the same total price -(`523`) as calculated earlier: - -$$ -362.954545454545455 + (362.954545454545455 * 1.1) = z\\ -762.204545454545455 = z\\ -z = 762.204545454545455 -$$ - -We can clearly see that `523` **does not** equal `762.204545454545455`, and we -cannot treat the `profit_margin` column the same as we would any other additive -measure. Armed with the above knowledge, we can add the `profit_margin` field to -our cube **as a [dimension][ref-schema-dims]**: - - - -```yaml -cubes: - - name: line_items - # ... - - dimensions: - # ... - - - name: profit_margin - sql: profit_margin - type: number - format: percent - - # ... -``` - -```javascript -cube(`line_items`, { - // ... - - dimensions: { - // ... - - profit_margin: { - sql: `profit_margin`, - type: `number`, - format: "percentage", - }, - }, - - // ... -}); -``` - - - -Another approach might be to calculate the profit margin dynamically, and -instead saving the "cost" price. Because the cost price is an additive measure, -we are able to store it in a pre-aggregation: - - - -```yaml -cubes: - - name: line_items - # ... - - measures: - # ... - - - name: cost - sql: "{CUBE.price} / (1 + {CUBE.profit_margin}" - type: sum - - # ... -``` - -```javascript -cube(`line_items`, { - // ... - - measures: { - // ... - - cost: { - sql: `${CUBE.price} / (1 + ${CUBE.profit_margin})`, - type: `sum`, - }, - }, - - // ... -}); -``` - - - -Another example of a non-additive measure would be a distinct count of -`product_id`. 
If we took the distinct count of products sold over a month, and -then tried to sum the distinct count of products for each individual day and -compared them, we would not get the same results. We can add the measure like -this: - - - -```yaml -cubes: - - name: line_items - # ... - - measures: - # ... - - - name: count_distinct_products - sql: product_id - type: count_distinct - - # ... -``` - -```javascript -cube(`line_items`, { - // ... - - measures: { - // ... - - count_distinct_products: { - sql: `product_id`, - type: `count_distinct`, - }, - }, - - // ... -}); -``` - - - -However the above cannot be used in for a pre-aggregation. We can instead change -the `type` to `count_distinct_approx`, and then use the measure in a -pre-aggregation definition: - - - -```yaml -cubes: - - name: line_items - # ... - - measures: - # ... - - - name: count_distinct_products - sql: product_id - type: count_distinct_approx - - pre_aggregations: - - name: my_rollup - # ... - - measures: - - count_distinct_products - - # ... -``` - -```javascript -cube(`line_items`, { - // ... - - measures: { - // ... - - count_distinct_products: { - sql: `product_id`, - type: `count_distinct_approx`, - }, - }, - - pre_aggregations: { - my_rollup: { - // ... - - measures: [count_distinct_products], - }, - }, - - // ... 
-}); -``` - - - -### Selecting the pre-aggregation - -To recap what we've learnt so far: - -- **Additive measures** are measures whose values can be added together - -- **Multiplied measures** are measures that define `one_to_many` relationships - -- **Leaf measures** are measures that do not reference any other measures in - their definition - -- **Calculated measures** are measures that reference other dimensions and - measures in their definition - -- A query is **leaf measure additive** if all of its leaf measures are one of: - `count`, `sum`, `min`, `max` or `count_distinct_approx` - -Cube looks for matching pre-aggregations in the order they are defined in a -cube's data model file. Each defined pre-aggregation is then tested for a match -based on the criteria in the flowchart below: - -
- Pre-Aggregation Selection Flowchart -
- -Some extra considerations for pre-aggregation selection: - -- The query's time dimension and granularity must match the pre-aggregation. - -- The query's time dimension and granularity together act as a dimension. If the - date range isn't aligned with granularity, a common granularity is used. This - common granularity is selected using the [greatest common divisor][wiki-gcd] - across both the query and pre-aggregation. For example, the common granularity - between `hour` and `day` is `hour` because both `hour` and `day` can be - divided by `hour`. - -- The query's granularity's date range must match the start date and end date - from the time dimensions. For example, when using a granularity of `month`, - the values should be the start and end days of the month i.e. - `['2020-01-01T00:00:00.000', '2020-01-31T23:59:59.999']`; when the granularity - is `day`, the values should be the start and end hours of the day i.e. - `['2020-01-01T00:00:00.000', '2020-01-01T23:59:59.999']`. Date ranges are - inclusive, and the minimum granularity is `second`. - -- The order in which pre-aggregations are defined in models matter; the first - matching pre-aggregation for a query is the one that is used. Both the - measures and dimensions of any cubes specified in the query are checked to - find a matching `rollup`. - -- `rollup` pre-aggregations **always** have priority over `original_sql`. Thus, - if you have both `original_sql` and `rollup` defined, Cube will try to match - `rollup` pre-aggregations before trying to match `original_sql`. You can - instruct Cube to use the original SQL pre-aggregations by using - [`use_original_sql_pre_aggregations`][ref-schema-preaggs-origsql]. 
[ref-caching-preaggs-cubestore]: /product/caching/using-pre-aggregations#pre-aggregations-storage @@ -702,11 +225,4 @@ Some extra considerations for pre-aggregation selection: /product/caching/using-pre-aggregations#refresh-strategy [ref-caching-preaggs-storage]: /product/caching/using-pre-aggregations#pre-aggregations-storage -[ref-schema-dims]: /reference/data-model/dimensions -[ref-schema-joins]: /reference/data-model/joins -[ref-schema-joins-rel]: /reference/data-model/joins#relationship -[ref-schema-preaggs]: /reference/data-model/pre-aggregations -[ref-schema-preaggs-origsql]: - /reference/data-model/pre-aggregations#original_sql -[self-select-pre-agg]: #selecting-the-pre-aggregation -[wiki-gcd]: https://en.wikipedia.org/wiki/Greatest_common_divisor +[ref-schema-preaggs]: /reference/data-model/pre-aggregations \ No newline at end of file diff --git a/docs/pages/product/caching/matching-pre-aggregations.mdx b/docs/pages/product/caching/matching-pre-aggregations.mdx new file mode 100644 index 0000000000000..373f831cdc290 --- /dev/null +++ b/docs/pages/product/caching/matching-pre-aggregations.mdx @@ -0,0 +1,118 @@ +# Matching queries with pre-aggregations + +When executing a query, Cube will try to match and fulfill it with the best +available pre-aggregation. + +Since pre-aggregations contain a *condensed representation* of the data from +the upstream data source (rather than a copy of that data), Cube needs to ensure +that fulfilling a query with a pre-aggregation is possible and doing so will +produce correct results. + +If there's no matching pre-aggregation, Cube will fall back to querying +the upstream data source, unless the [rollup-only mode][ref-rollup-only-mode] +is enabled. + + + +If you don't know why a query doesn't match a pre-aggregation, check +[common pitfalls](#common-pitfalls) first. + + + +## Eligible pre-aggregations + +Cube will search for matching pre-aggregations in all cubes that define +members in the query. 
+ +Pre-aggregations are tested in the order they are defined in the data model +file. However, `rollup` pre-aggregations are tested before `original_sql` +pre-aggregations. + +The first pre-aggregation that matches a query is used. + +## Matching algorithm + +Cube goes through the following steps to determine whether a query matches a +particular eligible pre-aggregation: + + + +See the details for each step: + +- **Is query leaf-measure additive?** Cube checks that all [leaf +measures][ref-leaf-measures] in the query are [additive][ref-measure-additivity]. +If the query contains [calculated measures][ref-calculated-measures] (e.g., +measures defined as `{sum} / {count}`), then referenced leaf measures will be +checked for additivity. +- **Does every member of the query exist in the pre-aggregation?** Cube checks +that the pre-aggregation contains all dimensions, filter dimensions, and leaf +measures from the query. +- **Are any query measures multiplied in the cube's data schema?** Cube checks +if any measures are multiplied via a [`one_to_many` +relationship][ref-schema-joins-rel] between cubes in the query. +- **Does the query specify granularity for its time dimension?** Cube checks +that the time dimension granularity is set in the query. +- **Are query filter dimensions included in its own dimensions?** Cube checks +that all filter dimensions are also included as dimensions in the query. +- **Does every member in the query exist in the pre-aggregation?** Cube checks +that the pre-aggregation contains all dimensions and measures used in the query. +### Matching time dimensions + +There are extra considerations that apply to matching time dimensions. + +- **Time dimension and granularity in the query together act as a dimension.** +If the date range isn't aligned with granularity, a common granularity is used. +This common granularity is selected using the [greatest common divisor][wiki-gcd] +across both the query and pre-aggregation. 
For example, the common granularity +between `hour` and `day` is `hour` because both `hour` and `day` can be divided +by `hour`. +- **The query's granularity's date range must match the start date and end date +from time dimensions.** For example, when using a granularity of `month`, +the values should be the start and end days of the month, i.e., +`['2020-01-01T00:00:00.000', '2020-01-31T23:59:59.999']`; when the granularity +is `day`, the values should be the start and end hours of the day, i.e., +`['2020-01-01T00:00:00.000', '2020-01-01T23:59:59.999']`. Date ranges are +inclusive, and the minimum granularity is `second`. Use the +[`allow_non_strict_date_range_match`][ref-non-strict-date-range-match] option to allow +a pre-aggregation to match a non-strict date range anyway. +- **The time zone in the query must match the time zone of a pre-aggregation.** +You can configure a list of time zones that pre-aggregations will be built for +using the [`scheduled_refresh_time_zones`][ref-conf-scheduled-refresh-time-zones] +configuration option. + +### Matching ungrouped queries + +There are extra considerations that apply to matching [ungrouped +queries][ref-ungrouped-queries]: + +- The pre-aggregation should include [primary keys][ref-primary-key] of all +cubes involved in the query. +- If multiple cubes are referenced in the query, the pre-aggregation should +include only members of these cubes. + +## Common pitfalls + +- Most commonly, a query would not match a pre-aggregation because the query contains +[non-additive measures][ref-measure-additivity]. See [this +recipe][ref-non-additive-recipe] for workarounds. + +- If a query uses any time zone other than `UTC`, please check the section on +[matching time dimensions](#matching-time-dimensions) and the +[`scheduled_refresh_time_zones`][ref-conf-scheduled-refresh-time-zones] +configuration option. 
+ + +[ref-rollup-only-mode]: /product/caching/using-pre-aggregations#rollup-only-mode +[ref-schema-dims]: /reference/data-model/dimensions +[ref-schema-joins]: /reference/data-model/joins +[ref-schema-joins-rel]: /reference/data-model/joins#relationship +[wiki-gcd]: https://en.wikipedia.org/wiki/Greatest_common_divisor +[ref-measure-additivity]: /product/data-modeling/concepts#measure-additivity +[ref-leaf-measures]: /product/data-modeling/concepts#leaf-measures +[ref-calculated-measures]: /product/data-modeling/overview#4-using-calculated-measures +[ref-non-strict-date-range-match]: /reference/data-model/pre-aggregations#allow_non_strict_date_range_match +[ref-non-additive-recipe]: /guides/recipes/query-acceleration/non-additivity +[ref-conf-scheduled-refresh-time-zones]: /reference/configuration/config#scheduled_refresh_time_zones +[ref-ungrouped-queries]: /product/data-modeling/queries#ungrouped-query +[ref-primary-key]: /reference/data-model/dimensions#primary_key \ No newline at end of file diff --git a/docs/pages/product/caching/using-pre-aggregations.mdx b/docs/pages/product/caching/using-pre-aggregations.mdx index a204e3be41852..a812e157dccdd 100644 --- a/docs/pages/product/caching/using-pre-aggregations.mdx +++ b/docs/pages/product/caching/using-pre-aggregations.mdx @@ -10,6 +10,33 @@ configuration options to consider. Please make sure to also check [the Pre-Aggregations reference in the data modeling section][ref-schema-ref-preaggs]. +## Matching queries + +When executing a query, Cube will try to [match and fulfill it with a +pre-aggregation][ref-matching-preaggs] in the first place. + +If there's no matching pre-aggregation, Cube will query the upstream data +source instead, unless the [rollup-only mode](#rollup-only-mode) is enabled. + +## Rollup-only mode + +In the rollup-only mode, Cube will **only** fulfill queries using +pre-aggregations. To enable the rollup-only mode, use the +`CUBEJS_ROLLUP_ONLY` environment variable. 
+ +It can be useful to prevent queries from your end users from ever hitting the +upstream data source, e.g., if you prefer to use your data warehouse only to +build and refresh pre-aggregations and keep it suspended the rest of the time. + + + +When the rollup-only mode is used with a single-node deployment (where the API +instance also serves as a [refresh worker][ref-deploy-refresh-wrkr]), queries +that can't be fulfilled with pre-aggregations will result in an error. +Scheduled refreshes will continue to work in the background. + + + ## Refresh strategy Refresh strategy can be customized by setting the @@ -144,22 +171,6 @@ When `every` and `sql` are used together, Cube will run the query from the `sql` property on an interval defined by the `every` property. If the query returns new results, then the pre-aggregation will be refreshed. -## Rollup-only mode - -To make Cube _only_ serve requests from pre-aggregations, the -[`CUBEJS_ROLLUP_ONLY`][ref-config-env-rolluponly] environment variable can be -set to `true` on an API instance. This will prevent serving data on API requests -from the source database. - - - -When using this configuration in a single node deployment (where the API -instance and [Refresh Worker][ref-deploy-refresh-wrkr] are configured on the -same host), requests made to the API that cannot be satisfied by a rollup throw -an error. Scheduled refreshes will continue to work in the background. - - - ## Partitioning [Partitioning][wiki-partitioning] is an extremely effective optimization for @@ -961,8 +972,6 @@ streaming engine. 
[ref-caching-in-mem-default-refresh-key]: /product/caching#default-refresh-keys [ref-config-db]: /product/configuration/data-sources [ref-config-driverfactory]: /reference/configuration/config#driverfactory -[ref-config-env-rolluponly]: - /reference/configuration/environment-variables#cubejs-rollup-only [ref-config-extdriverfactory]: /reference/configuration/config#externaldriverfactory [ref-connect-db-athena]: /product/configuration/data-sources/aws-athena @@ -993,4 +1002,5 @@ streaming engine. [wiki-partitioning]: https://en.wikipedia.org/wiki/Partition_(database) [ref-ref-indexes]: /reference/data-model/pre-aggregations#indexes [ref-additivity]: /product/caching/getting-started-pre-aggregations#additivity -[ref-ref-index-type]: /reference/data-model/pre-aggregations#type-1 \ No newline at end of file +[ref-ref-index-type]: /reference/data-model/pre-aggregations#type-1 +[ref-matching-preaggs]: /product/caching/matching-pre-aggregations \ No newline at end of file diff --git a/docs/pages/product/data-modeling/_meta.js b/docs/pages/product/data-modeling/_meta.js index 279a8a6d109e1..e04001cec0768 100644 --- a/docs/pages/product/data-modeling/_meta.js +++ b/docs/pages/product/data-modeling/_meta.js @@ -2,5 +2,6 @@ module.exports = { "overview": "Overview", "concepts": "Concepts", "syntax": "Syntax", - "dynamic": "Dynamic data models" + "dynamic": "Dynamic data models", + "queries": "Queries" } \ No newline at end of file diff --git a/docs/pages/product/data-modeling/concepts.mdx b/docs/pages/product/data-modeling/concepts.mdx index 31cecb4893c88..849f10c13d890 100644 --- a/docs/pages/product/data-modeling/concepts.mdx +++ b/docs/pages/product/data-modeling/concepts.mdx @@ -402,8 +402,8 @@ cubes: ### Measure types -Measures can be of different types, and you can find them all -[here][ref-schema-measure-types]. +Measures can be of different types. See the [measure type +reference][ref-schema-measure-types] for details. 
Often, aggregate functions in SQL are mapped to measure types in the following way: @@ -421,6 +421,120 @@ Often, aggregate functions in SQL are mapped to measure types in the following w | `SUM` | [`sum`](/reference/data-model/types-and-formats#sum) | | Any function returning a timestamp, e.g., `MAX(time)` | [`time`](/reference/data-model/types-and-formats#time) | +### Measure additivity + +Additivity is a property of measures that determines whether measure values, +once calculated for a set of dimensions, can be further aggregated to calculate +measure values for a subset of these dimensions. + +Measure additivity has an impact on [pre-aggregation +matching][ref-matching-preaggs]. + +Additivity of a measure depends on its [type](#measure-types). Only measures +with the following types are considered *additive*: +[`count`](/reference/data-model/types-and-formats#count), +[`count_distinct_approx`](/reference/data-model/types-and-formats#count_distinct_approx), +[`min`](/reference/data-model/types-and-formats#min), +[`max`](/reference/data-model/types-and-formats#max), +[`sum`](/reference/data-model/types-and-formats#sum). +Measures with all other types are considered *non-additive*.
+ +#### Example + +Consider the following cube: + + + +```yaml +cubes: + - name: employees + sql: > + SELECT 1 AS id, 'Ali' AS first_name, 20 AS age, 'Los Gatos' AS city UNION ALL + SELECT 2 AS id, 'Bob' AS first_name, 30 AS age, 'San Diego' AS city UNION ALL + SELECT 3 AS id, 'Eve' AS first_name, 40 AS age, 'San Diego' AS city + + measures: + - name: count + type: count + + - name: avg_age + sql: age + type: avg + + dimensions: + - name: city + sql: city + type: string +``` + +```javascript +cube(`employees`, { + sql: ` + SELECT 1 AS id, 'Ali' AS first_name, 20 AS age, 'Los Gatos' AS city UNION ALL + SELECT 2 AS id, 'Bob' AS first_name, 30 AS age, 'San Diego' AS city UNION ALL + SELECT 3 AS id, 'Eve' AS first_name, 40 AS age, 'San Diego' AS city + `, + + measures: { + count: { + type: `count` + }, + + avg_age: { + sql: `age`, + type: `avg` + } + }, + + dimensions: { + city: { + sql: `city`, + type: `string` + } + } +}) +``` + + + +If we run a query that includes `city` as a dimension and `count` and `avg_age` +as measures, we'll get the following results: + +| city | count | avg_age | +| --------- | ----- | ------- | +| Los Gatos | 1 | 20 | +| San Diego | 2 | 35 | + +Then, if we remove the `city` dimension from the query, we'll get the following +results: + +| count | avg_age | +| ----- | ------- | +| 3 | 30 | + +As you can see, the value of the `count` measure that we've got for the second +query could have been calculated based on the results of the first one: +`1 + 2 = 3`. It explains why the `count` measure, having the `count` type, is +considered *additive*. + +However, the value of the `avg_age` measure that we've got for the second query +can't be calculated based on the results of the first one: there's no way to +derive `30` from `20` and `35`. This is why the `avg_age` measure, having the +`avg` type, is considered *non-additive*. 
+ +### Leaf measures + +Measures that do not [reference][ref-syntax-references] other measures are +considered *leaf measures*. + +By definition, all measures that only reference SQL +[columns][ref-syntax-references-column] and expressions are *leaf measures*. +On the other hand, [calculated measures][ref-calculated-measures] might not +necessarily be *leaf measures* because they can reference other measures. + +Whether a query contains only [additive](#measure-additivity) leaf measures has +an impact on [pre-aggregation matching][ref-matching-preaggs]. + ## Joins Joins define the relationships between cubes, which then allows accessing and @@ -577,3 +691,7 @@ Pre-Aggregations][ref-caching-preaggs-intro]. [self-measures]: #measures [wiki-olap]: https://en.wikipedia.org/wiki/Online_analytical_processing [wiki-view-sql]: https://en.wikipedia.org/wiki/View_(SQL) +[ref-matching-preaggs]: /product/caching/matching-pre-aggregations +[ref-syntax-references]: /product/data-modeling/syntax#references +[ref-syntax-references-column]: /product/data-modeling/syntax#column +[ref-calculated-measures]: /product/data-modeling/overview#4-using-calculated-measures \ No newline at end of file diff --git a/docs/pages/product/data-modeling/queries.mdx b/docs/pages/product/data-modeling/queries.mdx new file mode 100644 index 0000000000000..5bcb0dccd8168 --- /dev/null +++ b/docs/pages/product/data-modeling/queries.mdx @@ -0,0 +1,161 @@ +# Querying the data model + +Ultimately, after creating a data model, you would like to *ask questions +to it*, i.e., run queries against this data model. This page describes the +concepts of querying Cube, common to [all or most APIs][ref-apis] +that you will use to run these queries. + +## Query defaults + +The following defaults apply to all queries run by Cube. + +### Row limit + +By default, [any query](#query-types) will return no more than 10,000 rows +in the result set. 
It serves as a safeguard against data scraping and +denial-of-service (DoS) attacks if Cube is exposed to untrusted environments. + +The maximum allowed limit is 50,000 rows. You can use the `CUBEJS_DB_QUERY_LIMIT` +environment variable to override it. You can also implement +[pagination][ref-pagination-recipe] to fetch more rows than the maximum limit. + +### Time zone + +All time-based calculations performed by Cube are time zone-aware. + +By default, Cube assumes that time values in your queries (e.g., in date range +filters) are in the [UTC time zone][wiki-utc-time-zone]. Similarly, it will use +the same time zone for time dimension values in result sets. + +You can use the `timezone` option with [REST API][ref-rest-api-query-format-options] +or [GraphQL API][ref-ref-graphql-api-args] to specify the time zone for a query. +Also, you can use the [`SQL_UTILS` context variable][ref-sql-utils] to apply the +time zone conversion to dimensions that are not used as time dimensions in a query. + +Additionally, note that time zones have an impact on [pre-aggregation +matching][ref-matching-preaggs-time-dimensions]. + +## Query types + +Most commonly, you will run [regular queries](#regular-query). See the table +and the sections below for details on each query type. + +| Query type | Supported by [APIs][ref-apis] | Supported in [Playground][ref-playground] | +| --- | --- | --- | +| [Regular query](#regular-query) | [SQL API][ref-sql-api], [REST API][ref-rest-api], [GraphQL API][ref-graphql-api] | ✅ Yes | +| [Ungrouped query](#ungrouped-query) | [SQL API][ref-sql-api], [REST API][ref-rest-api], [GraphQL API][ref-graphql-api] | ❌ No | +| [Compare date range query](#compare-date-range-query) | [SQL API][ref-sql-api], [REST API][ref-rest-api] | ❌ No | +| [Total query](#total-query) | [REST API][ref-rest-api] | ❌ No | + +### Regular query + +This is the most common type of query.
Regular queries include: +- Lists of dimensions and measures that you'd like to see in the result set. +- Optionally, filters to apply before returning the result set. +- Optionally, a [row limit](#row-limit) and an offset for the result set. + +For regular queries, Cube generates the SQL for the upstream [data +sources][ref-data-sources] that always includes all dimensions in the `GROUP BY` +statement. See [ungrouped queries](#ungrouped-query) if you'd like to override +this behavior. + +#### Example + +See an example of a regular query using the SQL API syntax: + +```sql +SELECT + city, + MEASURE(amount) +FROM orders +WHERE status = 'shipped' +GROUP BY 1 +LIMIT 100 +``` + +The same query using the REST API syntax looks as follows: + +```json +{ + "dimensions": ["orders.city"], + "measures": ["orders.amount"], + "filters": [ + { + "member": "orders.status", + "operator": "equals", + "values": ["shipped"] + } + ], + "limit": 100 +} +``` + +### Ungrouped query + +Similarly to [regular queries](#regular-query), ungrouped queries include +lists of dimensions and measures, filters, etc. and return a result set. + +However, unlike for regular queries, Cube will not add the +`GROUP BY` statement when generating the SQL for the upstream data sources. +Instead, raw results after filtering and joining will be returned without any +grouping. Measures will be rendered as their `sql` without any aggregation. +Time dimensions will be truncated by granularity as usual but will not be grouped. + +You can make a regular query ungrouped by using the `ungrouped` option with +[REST API][ref-rest-api-query-format-options] or [GraphQL API][ref-ref-graphql-api-args]. +For the [SQL API][ref-sql-api], you can omit the `GROUP BY` statement from the +SQL API query. + +By default, for security purposes, ungrouped queries require [primary +keys][ref-primary-key] of all cubes involved in a query to be added as +dimensions.
You can use the [`allow_ungrouped_without_primary_key` configuration +option][ref-conf-allow-ungrouped] to override this. + +Note that ungrouped queries have additional requirements for +[pre-aggregation matching][ref-matching-preaggs-ungrouped]. + +### Compare date range query + +Similarly to [regular queries](#regular-query), compare date range queries +include lists of dimensions and measures, filters, etc. and return a result set. + +However, unlike regular queries, they provide a convenient way to retrieve +measure values for *more than one date range* for a time dimension. See [this +blog post][blog-compare-date-range] for more details and examples. + +You can make a compare date range query by using the `compareDateRange` +option with the [REST API][ref-rest-api-query-format-options-tdf]. For the SQL +API, you can write an equivalent query using the `UNION ALL` statement. + +### Total query + +Similarly to [regular queries](#regular-query), total queries include lists +of dimensions and measures, filters, etc. and return a result set. + +In addition to that, they provide a convenient way to retrieve the total number +of rows in the result set as if no [row limit](#row-limit) or offset were set in +the query. This is useful for creating user interfaces with +[pagination][ref-pagination-recipe]. + +You can make a total query by using the `total` option with the [REST +API][ref-rest-api-query-format-options]. For the SQL API, you can write an +equivalent query using the `UNION ALL` statement.
+ + +[wiki-utc-time-zone]: https://en.wikipedia.org/wiki/Coordinated_Universal_Time +[ref-playground]: /product/workspace/playground +[ref-apis]: /product/apis-integrations +[ref-sql-api]: /product/apis-integrations/sql-api +[ref-rest-api]: /product/apis-integrations/rest-api +[ref-graphql-api]: /product/apis-integrations/graphql-api +[ref-data-sources]: /product/configuration/data-sources +[ref-rest-api-query-format-options]: /product/apis-integrations/rest-api/query-format#query-properties +[ref-rest-api-query-format-options-tdf]: /product/apis-integrations/rest-api/query-format#time-dimensions-format +[ref-ref-graphql-api-args]: /reference/graphql-api#cubequeryargs +[ref-sql-utils]: /reference/data-model/context-variables#sql_utils +[ref-matching-preaggs-time-dimensions]: /product/caching/matching-pre-aggregations#matching-time-dimensions +[ref-matching-preaggs-ungrouped]: /product/caching/matching-pre-aggregations#matching-ungrouped-queries +[ref-pagination-recipe]: /guides/recipes/queries/pagination +[ref-primary-key]: /reference/data-model/dimensions#primary_key +[ref-conf-allow-ungrouped]: /reference/configuration/config#allow_ungrouped_without_primary_key +[blog-compare-date-range]: https://cube.dev/blog/comparing-data-over-different-time-periods \ No newline at end of file diff --git a/docs/pages/product/faqs/general.mdx b/docs/pages/product/faqs/general.mdx index b0dc22f385340..d6c7a4f0156d1 100644 --- a/docs/pages/product/faqs/general.mdx +++ b/docs/pages/product/faqs/general.mdx @@ -5,13 +5,6 @@ redirect_from: # General -## Is there a row limit on the results of a query? - -The row limit for all query results is set to 10,000 rows by default. You may -specify a row limit up to 50,000 in query parameters for an individual query. If -more rows are needed, we recommend using pagination in your application to -request more rows. - ## Can I try Cube Cloud for free? Yes. 
Cube Cloud provides free diff --git a/docs/pages/reference/configuration/config.mdx b/docs/pages/reference/configuration/config.mdx index 25984f57a96e6..fbf53fe6a2692 100644 --- a/docs/pages/reference/configuration/config.mdx +++ b/docs/pages/reference/configuration/config.mdx @@ -638,11 +638,11 @@ You may also need to configure ### `scheduled_refresh_time_zones` -All time-based calculations performed within Cube are timezone-aware. Using this -property you can specify multiple timezones in [TZ Database Name][link-wiki-tz] -format e.g. `America/Los_Angeles`. The default value is `UTC`. +This option specifies a list of time zones that pre-aggregations will be built +for. It has an impact on [pre-aggregation matching][ref-matching-preaggs]. -You can define one or multiple timezones: +You can specify multiple time zones in the [TZ Database Name][link-wiki-tz] +format, e.g., `America/Los_Angeles`: @@ -666,14 +666,10 @@ module.exports = { -This configuration option can be also set using the -`CUBEJS_SCHEDULED_REFRESH_TIMEZONES` environment variable. You can set a -comma-separated list of timezones to refresh in -`CUBEJS_SCHEDULED_REFRESH_TIMEZONES` environment variable. For example: +The default value is a list with a single time zone: `UTC`. -```dotenv -CUBEJS_SCHEDULED_REFRESH_TIMEZONES=America/Los_Angeles,UTC -``` +This configuration option can also be set using the +`CUBEJS_SCHEDULED_REFRESH_TIMEZONES` environment variable.
### `scheduled_refresh_contexts` @@ -1317,7 +1313,7 @@ If not defined, Cube will lookup for environment variable [self-orchestrator-id]: #context_to_orchestrator_id [ref-multiple-data-sources]: /product/configuration/advanced/multiple-data-sources [ref-websockets]: /product/apis-integrations/rest-api/real-time-data-fetch - +[ref-matching-preaggs]: /product/caching/matching-pre-aggregations [link-snake-case]: https://en.wikipedia.org/wiki/Snake_case [link-camel-case]: https://en.wikipedia.org/wiki/Camel_case [link-github-cube-drivers]: https://github.com/cube-js/cube/tree/master/packages diff --git a/docs/pages/reference/configuration/environment-variables.mdx b/docs/pages/reference/configuration/environment-variables.mdx index 4494fcbdbd53c..25bab0195f95a 100644 --- a/docs/pages/reference/configuration/environment-variables.mdx +++ b/docs/pages/reference/configuration/environment-variables.mdx @@ -676,6 +676,22 @@ The username used to connect to the database. | ------------------------- | ---------------------- | --------------------- | | A valid database username | N/A | N/A | +## `CUBEJS_DB_QUERY_LIMIT` + +The maximum [row limit][ref-row-limit] in the result set. + +| Possible Values | Default in Development | Default in Production | +| ------------------------- | ---------------------- | --------------------- | +| A positive integer number | `50000` | `50000` | + + + +Increasing the maximum row limit may cause out-of-memory (OOM) crashes and make +Cube susceptible to denial-of-service (DoS) attacks if it's exposed to +untrusted environments. + + + ## `CUBEJS_DEFAULT_API_SCOPES` [API scopes][ref-rest-scopes] used to allow or disallow access to REST API @@ -955,7 +971,9 @@ If `true`, this instance of Cube will **only** refresh pre-aggregations. ## `CUBEJS_ROLLUP_ONLY` -If `true`, this instance of Cube will **only** query rollup pre-aggregations. +If `true`, the API instance of Cube will **only** fulfill queries from +pre-aggregations. 
See [rollup-only +mode](/product/caching/using-pre-aggregations#rollup-only-mode) for details. | Possible Values | Default in Development | Default in Production | | --------------- | ---------------------- | --------------------- | @@ -974,8 +992,7 @@ accordingly ## `CUBEJS_SCHEDULED_REFRESH_TIMEZONES` A comma-separated [list of timezones to schedule refreshes -for][ref-config-sched-refresh-timer]. Used in conjunction with -[`CUBEJS_SCHEDULED_REFRESH_CONCURRENCY`](#cubejs-scheduled-refresh-concurrency). +for][ref-config-sched-refresh-timer]. | Possible Values | Default in Development | Default in Production | | --------------------------------------------------------- | ---------------------- | --------------------- | @@ -1449,3 +1466,4 @@ The port for a Cube deployment to listen to API connections on. https://docs.snowflake.com/en/user-guide/warehouses.html [wiki-tz-database]: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones [ref-sql-api]: /product/apis-integrations/sql-api +[ref-row-limit]: /product/data-modeling/queries#row-limit \ No newline at end of file diff --git a/docs/pages/reference/graphql-api.mdx b/docs/pages/reference/graphql-api.mdx index 2eae1bb6c4a2e..3a57e46a38dd4 100644 --- a/docs/pages/reference/graphql-api.mdx +++ b/docs/pages/reference/graphql-api.mdx @@ -27,9 +27,11 @@ query { ## `CubeQueryArgs` - **`where` ([`RootWhereInput`](#root-where-input)):** Represents a SQL `WHERE` clause. -- **`limit` (`Int`):** A row limit for your query. The default value is `10000`. The maximum allowed limit is `50000`. If you'd like to request more rows than the maximum allowed limit, consider using [pagination][ref-recipe-pagination]. +- **`limit` (`Int`):** A [row limit][ref-row-limit] for your query. - **`offset` (`Int`):** The number of initial rows to be skipped for your query. The default value is `0`. -- **`timezone` (`String`):** The timezone to use for the query. The default value is `UTC`. 
+- **`timezone` (`String`):** The [time zone][ref-time-zone] for your query. You can set the +desired time zone in the [TZ Database Name](https://en.wikipedia.org/wiki/Tz_database) +format, e.g., `America/Los_Angeles`. - **`renewQuery` (`Boolean`):** If `renewQuery` is set to `true`, Cube will renew all `refreshKey` for queries and query results in the foreground. The default value is `false`. - **`ungrouped` (`Boolean`):** If `ungrouped` is set to `true`, no `GROUP BY` statement will be added to the query. Instead, the raw results after filtering and joining will be returned without grouping. By default, ungrouped queries require a primary key as a dimension of every cube involved in the query for security purposes. For an ungrouped query, measures will be rendered without aggregation and time dimensions will be truncated as usual. @@ -111,4 +113,6 @@ query { [ref-schema-ref-preagg-granularity]: /reference/data-model/pre-aggregations#granularity -[ref-graphql-api]: /product/apis-integrations/graphql-api \ No newline at end of file +[ref-graphql-api]: /product/apis-integrations/graphql-api +[ref-row-limit]: /product/data-modeling/queries#row-limit +[ref-time-zone]: /product/data-modeling/queries#time-zone \ No newline at end of file