Docs: Compression for Column Table (#11458)
vlad-gogov authored Jan 19, 2025
1 parent 050fcd1 commit 3f1aaf1
Showing 21 changed files with 270 additions and 67 deletions.
5 changes: 5 additions & 0 deletions ydb/docs/en/core/_includes/codec_zstd_allow_for_olap_note.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
{% note alert %}

{% include [codec_zstd_allow_for_olap_text](codec_zstd_allow_for_olap_text.md) %}

{% endnote %}
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Codec `"zstd"` is supported only for [column-oriented](../concepts/datamodel/table.md#column-oriented-tables) tables.
9 changes: 9 additions & 0 deletions ydb/docs/en/core/_includes/only_allow_for_olap_note.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
{% if oss == true and backend_name == "YDB" %}

{% note alert %}

{% include [only_allow_for_olap_text](only_allow_for_olap_text.md) %}

{% endnote %}

{% endif %}
1 change: 1 addition & 0 deletions ydb/docs/en/core/_includes/only_allow_for_olap_text.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Supported only for [column-oriented](../concepts/datamodel/table.md#column-oriented-tables) tables.
9 changes: 9 additions & 0 deletions ydb/docs/en/core/_includes/only_allow_for_oltp_note.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
{% if oss == true and backend_name == "YDB" %}

{% note alert %}

{% include [only_allow_for_oltp_text](only_allow_for_oltp_text.md) %}

{% endnote %}

{% endif %}
1 change: 1 addition & 0 deletions ydb/docs/en/core/_includes/only_allow_for_oltp_text.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Supported only for [row-oriented](../concepts/datamodel/table.md#row-oriented-tables) tables.
2 changes: 2 additions & 0 deletions ydb/docs/en/core/concepts/datamodel/_includes/table.md
Original file line number Diff line number Diff line change
@@ -185,6 +185,8 @@ In most cases, working with {{ ydb-short-name }} column-oriented tables is simil
+ Available in both the primary key and other columns: `Date`, `Datetime`, `Timestamp`, `Int32`, `Int64`, `Uint8`, `Uint16`, `Uint32`, `Uint64`, `Utf8`, `String`;
+ Available only in columns not included in the primary key: `Bool`, `Decimal`, `Double`, `Float`, `Int8`, `Int16`, `Interval`, `JsonDocument`, `Json`, `Uuid`, `Yson`.

* Column-oriented tables support column groups, but only for compression settings.

Let's recreate the "article" table, this time in column-oriented format, using the following YQL command:

```yql
Original file line number Diff line number Diff line change
@@ -1,11 +1,5 @@
# Changing column groups

{% if oss == true and backend_name == "YDB" %}

{% include [OLAP_not_allow_note](../../../../_includes/not_allow_for_olap_note.md) %}

{% endif %}

The [column groups](../../../../concepts/datamodel/table.md#column-groups) mechanism improves the performance of partial row reads by splitting the storage of table columns into several groups. The most common scenario is to store infrequently used attributes in a separate column group.

## Creating column groups {#creating-column-groups}
@@ -38,7 +32,18 @@ ALTER TABLE series_with_families
ALTER COLUMN release_date SET FAMILY family_small;
```

Using the `ALTER FAMILY` command, you can change the parameters of the column group. The code below changes the storage type to `hdd` for the `default` column group in the `series_with_families` table:
Using the `ALTER FAMILY` command, you can change the parameters of the column group.


### Changing storage type

{% if oss == true and backend_name == "YDB" %}

{% include [OLTP_only_allow_note](../../../../_includes/only_allow_for_oltp_note.md) %}

{% endif %}

The code below changes the storage type to `hdd` for the `default` column group in the `series_with_families` table:

```yql
ALTER TABLE series_with_families ALTER FAMILY default SET DATA "hdd";
@@ -50,4 +55,26 @@ Available types of storage devices depend on the {{ ydb-short-name }} cluster co

{% endnote %}

### Changing compression codec

The code below changes the compression codec to `lz4` for the `default` column group in the `series_with_families` table:

```yql
ALTER TABLE series_with_families ALTER FAMILY default SET COMPRESSION "lz4";
```
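As a further illustration, the same statement form can presumably select `zstd`, which is listed as supported only for column-oriented tables. This is a minimal sketch that assumes `series_with_families` was created with `STORE = COLUMN`:

```yql
-- Assumption: series_with_families is a column-oriented table (STORE = COLUMN).
-- "zstd" is documented as unsupported for row-oriented tables.
ALTER TABLE series_with_families ALTER FAMILY default SET COMPRESSION "zstd";
```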

### Changing compression level of codec

{% if oss == true and backend_name == "YDB" %}

{% include [OLAP_only_allow_note](../../../../_includes/only_allow_for_olap_note.md) %}

{% endif %}

The code below changes the compression level of the codec for the `default` column group in the `series_with_families` table, provided the codec supports multiple compression levels:

```yql
ALTER TABLE series_with_families ALTER FAMILY default SET COMPRESSION_LEVEL 5;
```

You can specify any of the column group parameters described for the [`CREATE TABLE`](../create_table/index.md) command.
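A minimal sketch of what this can look like for a column-oriented table, assuming `ADD FAMILY` accepts the same `COMPRESSION` and `COMPRESSION_LEVEL` parameters as the `FAMILY` clause of `CREATE TABLE`. The group name `family_archive` is hypothetical, and the table is assumed to have been created with `STORE = COLUMN`:

```yql
-- Hypothetical group name; assumes a column-oriented table.
-- Adds a group with its own codec and level, then moves one column into it.
ALTER TABLE series_with_families
    ADD FAMILY family_archive (
        COMPRESSION = "zstd",
        COMPRESSION_LEVEL = 5
    ),
    ALTER COLUMN series_info SET FAMILY family_archive;
```

Whether a given compression level is accepted depends on the codec, so treat the value `5` as a placeholder taken from the other examples.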
Original file line number Diff line number Diff line change
@@ -1,39 +1,80 @@
# Column groups

Columns of the same table can be grouped to set the following parameters:

* `DATA`: A storage device type for the data in this column group. Acceptable values: `ssd`, `rot`.

{% if oss == true and backend_name == "YDB" %}

{% include [not_allow_for_olap](../../../../_includes/not_allow_for_olap_note.md) %}
{% include [OLTP_only_allow_note](../../../../_includes/only_allow_for_oltp_note.md) %}

{% endif %}

Columns of the same table can be grouped to set the following parameters:
* `COMPRESSION`: A data compression codec. Acceptable values: `off`, `lz4`, `zstd`.

* `DATA`: A storage device type for the data in this column group. Acceptable values: `ssd`, `rot`.
* `COMPRESSION`: A data compression codec. Acceptable values: `off`, `lz4`.
{% if oss == true and backend_name == "YDB" %}

{% include [codec_zstd_allow_for_olap_note](../../../../_includes/codec_zstd_allow_for_olap_note.md) %}

By default, all columns are in the same group named `default`. If necessary, the parameters of this group can also be redefined.
{% endif %}

* `COMPRESSION_LEVEL`: The compression level of the codec, if the codec supports multiple compression levels.

{% if oss == true and backend_name == "YDB" %}

{% include [OLAP_only_allow_note](../../../../_includes/only_allow_for_olap_note.md) %}

{% endif %}

By default, all columns are in the same group named `default`. If necessary, the parameters of this group can also be redefined; any parameters that are not redefined keep their predefined values.

## Example

The example below creates a table with an additional `family_large` column group assigned to the `series_info` column, and redefines the parameters of the `default` group, which applies to all other columns.

```yql
CREATE TABLE series_with_families (
    series_id Uint64,
    title Utf8,
    series_info Utf8 FAMILY family_large,
    release_date Uint64,
    PRIMARY KEY (series_id),
    FAMILY default (
        DATA = "ssd",
        COMPRESSION = "off"
    ),
    FAMILY family_large (
        DATA = "rot",
        COMPRESSION = "lz4"
    )
);
```
{% list tabs %}

- Creating a row-oriented table

```sql
CREATE TABLE series_with_families (
    series_id Uint64,
    title Utf8,
    series_info Utf8 FAMILY family_large,
    release_date Uint64,
    PRIMARY KEY (series_id),
    FAMILY default (
        DATA = "ssd",
        COMPRESSION = "off"
    ),
    FAMILY family_large (
        DATA = "rot",
        COMPRESSION = "lz4"
    )
);
```

- Creating a column-oriented table

```sql
CREATE TABLE series_with_families (
    series_id Uint64,
    title Utf8,
    series_info Utf8 FAMILY family_large,
    release_date Uint64,
    PRIMARY KEY (series_id),
    FAMILY default (
        COMPRESSION = "lz4"
    ),
    FAMILY family_large (
        COMPRESSION = "zstd",
        COMPRESSION_LEVEL = 5
    )
)
WITH (STORE = COLUMN);
```

{% endlist %}

{% note info %}

Original file line number Diff line number Diff line change
@@ -234,6 +234,9 @@ When creating row-oriented tables, it is possible to specify:
* [Column groups](family.md).
* [Additional parameters](with.md).

For column-oriented tables, only [additional parameters](with.md) can be specified during creation.
When creating column-oriented tables, it is possible to specify:

* [Column groups](family.md).
* [Additional parameters](with.md).

{% endif %}
5 changes: 5 additions & 0 deletions ydb/docs/ru/core/_includes/codec_zstd_allow_for_olap_note.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
{% note alert %}

{% include [codec_zstd_allow_for_olap_text](codec_zstd_allow_for_olap_text.md) %}

{% endnote %}
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Codec `"zstd"` is supported only for [column-oriented](../concepts/datamodel/table.md#column-oriented-tables) tables.
2 changes: 1 addition & 1 deletion ydb/docs/ru/core/_includes/not_allow_for_olap_text.md
Original file line number Diff line number Diff line change
@@ -1 +1 @@
Supported only for [row-oriented](../concepts/datamodel/table.md#strokovye-tablicy) tables. Support for this functionality in [column-oriented](../concepts/datamodel/table.md#column-tables) tables is under development.
Supported only for [row-oriented](../concepts/datamodel/table.md#row-oriented-tables) tables. Support for this functionality in [column-oriented](../concepts/datamodel/table.md#column-oriented-tables) tables is under development.
9 changes: 9 additions & 0 deletions ydb/docs/ru/core/_includes/only_allow_for_olap_note.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
{% if oss == true and backend_name == "YDB" %}

{% note alert %}

{% include [only_allow_for_olap_text](only_allow_for_olap_text.md) %}

{% endnote %}

{% endif %}
1 change: 1 addition & 0 deletions ydb/docs/ru/core/_includes/only_allow_for_olap_text.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Supported only for [column-oriented](../concepts/datamodel/table.md#column-oriented-tables) tables.
9 changes: 9 additions & 0 deletions ydb/docs/ru/core/_includes/only_allow_for_oltp_note.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
{% if oss == true and backend_name == "YDB" %}

{% note alert %}

{% include [only_allow_for_oltp_text](only_allow_for_oltp_text.md) %}

{% endnote %}

{% endif %}
1 change: 1 addition & 0 deletions ydb/docs/ru/core/_includes/only_allow_for_oltp_text.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Supported only for [row-oriented](../concepts/datamodel/table.md#row-oriented-tables) tables.
1 change: 1 addition & 0 deletions ydb/docs/ru/core/concepts/datamodel/_includes/table.md
Original file line number Diff line number Diff line change
@@ -221,6 +221,7 @@ CREATE TABLE article (
+ Available in both the primary key and other columns: `Date`, `Datetime`, `Timestamp`, `Int32`, `Int64`, `Uint8`, `Uint16`, `Uint32`, `Uint64`, `Utf8`, `String`;
+ Available only in columns not included in the primary key: `Bool`, `Decimal`, `Double`, `Float`, `Int8`, `Int16`, `Interval`, `JsonDocument`, `Json`, `Uuid`, `Yson`.

* Column-oriented tables support column groups, but for now they are used only to set compression on columns.

Let's recreate the `article` table, this time in column-oriented format, using the following YQL command:

Original file line number Diff line number Diff line change
@@ -1,11 +1,5 @@
# Creating and changing column groups

{% if oss == true and backend_name == "YDB" %}

{% include [OLAP_not_allow_note](../../../../_includes/not_allow_for_olap_note.md) %}

{% endif %}

The column {% if oss == true and backend_name == "YDB" %}[groups](../../../../concepts/datamodel/table.md#column-groups){% else %}groups{% endif %} mechanism improves the performance of partial row reads by splitting the storage of a row-oriented table's columns into several groups. The most common scenario is to store infrequently used attributes in a separate column group.


@@ -22,13 +16,13 @@ ALTER TABLE series_with_families ADD FAMILY family_small (

## Changing column groups

Using the `ALTER COLUMN` command, you can change the column group for a specified column. The code below changes the column group to `family_small` for the `release_date` column in the `series_with_families` table.
Using the `ALTER COLUMN` command, you can change the column group for a specified column. The code below changes the column group to `family_small` for the `release_date` column in the `series_with_families` table.

```yql
ALTER TABLE series_with_families ALTER COLUMN release_date SET FAMILY family_small;
```

The two previous commands can be combined into a single `ALTER TABLE` call. The code below creates the `family_small` column group in the `series_with_families` table and assigns it to the `release_date` column.
The two previous commands can be combined into a single `ALTER TABLE` call. The code below creates the `family_small` column group in the `series_with_families` table and assigns it to the `release_date` column.

```yql
ALTER TABLE series_with_families
@@ -39,7 +33,17 @@ ALTER TABLE series_with_families
ALTER COLUMN release_date SET FAMILY family_small;
```

Using the `ALTER FAMILY` command, you can change the parameters of a column group. The code below changes the storage type to `hdd` for the `default` column group in the `series_with_families` table:
Using the `ALTER FAMILY` command, you can change the parameters of a column group.

### Changing storage type

{% if oss == true and backend_name == "YDB" %}

{% include [OLTP_only_allow_note](../../../../_includes/only_allow_for_oltp_note.md) %}

{% endif %}

The code below changes the storage type to `hdd` for the `default` column group in the `series_with_families` table:

```yql
ALTER TABLE series_with_families ALTER FAMILY default SET DATA "hdd";
@@ -51,4 +55,32 @@ ALTER TABLE series_with_families ALTER FAMILY default SET DATA "hdd";

{% endnote %}

### Changing compression codec

{% if oss == true and backend_name == "YDB" %}

{% include [codec_zstd_allow_for_olap_note](../../../../_includes/codec_zstd_allow_for_olap_note.md) %}

{% endif %}

The code below changes the compression codec to `lz4` for the `default` column group in the `series_with_families` table:

```yql
ALTER TABLE series_with_families ALTER FAMILY default SET COMPRESSION "lz4";
```

### Changing compression level of codec

{% if oss == true and backend_name == "YDB" %}

{% include [OLAP_only_allow_note](../../../../_includes/only_allow_for_olap_note.md) %}

{% endif %}

The code below changes the compression level of the codec for the `default` column group in the `series_with_families` table, provided the codec supports multiple compression levels:

```yql
ALTER TABLE series_with_families ALTER FAMILY default SET COMPRESSION_LEVEL 5;
```

You can specify any of the column group parameters described for the [`CREATE TABLE`](../create_table/secondary_index.md) command.