Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
CdcAddFileIndex, ChangeDataFeedTableFeature and Table Properties
1 parent e0697f0, commit affe799
Showing 9 changed files with 209 additions and 15 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# SupportsRowIndexFilters | ||
|
||
`SupportsRowIndexFilters` is...FIXME |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,75 @@ | ||
# CdcAddFileIndex | ||
|
||
`CdcAddFileIndex` is...FIXME | ||
`CdcAddFileIndex` is a [TahoeBatchFileIndex](../TahoeBatchFileIndex.md) with the following: | ||
|
||
Property | Value | ||
---------|------ | ||
[Action Type](../TahoeBatchFileIndex.md#actionType) | `cdcRead` | ||
[addFiles](../TahoeBatchFileIndex.md#addFiles) | The [AddFile](../AddFile.md)s of the given [CDCDataSpecs](#filesByVersion) | ||
|
||
`CdcAddFileIndex` is used by [CDCReaderImpl](CDCReaderImpl.md) to [scanIndex](CDCReaderImpl.md#scanIndex). | ||
|
||
## Creating Instance | ||
|
||
`CdcAddFileIndex` takes the following to be created: | ||
|
||
* <span id="spark"> `SparkSession` | ||
* <span id="filesByVersion"> [AddFile](../AddFile.md)s by Version (`Seq[CDCDataSpec[AddFile]]`) | ||
* <span id="deltaLog"> [DeltaLog](../DeltaLog.md) | ||
* <span id="path"> `Path` | ||
* <span id="snapshot"> [SnapshotDescriptor](../SnapshotDescriptor.md) | ||
* [Row Index Filters](#rowIndexFilters) | ||
|
||
`CdcAddFileIndex` is created when: | ||
|
||
* `CDCReaderImpl` is requested for the [DataFrame with deleted and added rows](CDCReaderImpl.md#getDeletedAndAddedRows) and to [processDeletionVectorActions](CDCReaderImpl.md#processDeletionVectorActions) | ||
|
||
### Row Index Filters { #rowIndexFilters } | ||
|
||
??? note "SupportsRowIndexFilters" | ||
|
||
```scala | ||
rowIndexFilters: Option[Map[String, RowIndexFilterType]] = None | ||
``` | ||
|
||
`rowIndexFilters` is part of the [SupportsRowIndexFilters](../SupportsRowIndexFilters.md#rowIndexFilters) abstraction. | ||
|
||
`CdcAddFileIndex` is given Row Index Filters when [created](#creating-instance). | ||
|
||
## Input Files { #inputFiles } | ||
|
||
??? note "FileIndex" | ||
|
||
```scala | ||
inputFiles: Array[String] | ||
``` | ||
|
||
`inputFiles` is part of the `FileIndex` ([Spark SQL]({{ book.spark_sql }}/connectors/FileIndex#inputFiles)) abstraction. | ||
|
||
`inputFiles`...FIXME | ||
|
||
## Matching Files { #matchingFiles } | ||
|
||
??? note "TahoeFileIndex" | ||
|
||
```scala | ||
matchingFiles( | ||
partitionFilters: Seq[Expression], | ||
dataFilters: Seq[Expression]): Seq[AddFile] | ||
``` | ||
|
||
`matchingFiles` is part of the [TahoeFileIndex](../TahoeFileIndex.md#matchingFiles) abstraction. | ||
|
||
`matchingFiles`...FIXME | ||
|
||
## Partitions { #partitionSchema } | ||
|
||
??? note "FileIndex" | ||
|
||
```scala | ||
partitionSchema: StructType | ||
``` | ||
|
||
`partitionSchema` is part of the `FileIndex` ([Spark SQL]({{ book.spark_sql }}/connectors/FileIndex#partitionSchema)) abstraction. | ||
|
||
`partitionSchema` is the [cdcReadSchema](CDCReader.md#cdcReadSchema) for the [partitions](../Metadata.md#partitionSchema) of (the [Metadata](../SnapshotDescriptor.md#metadata) of) the given [SnapshotDescriptor](#snapshot). |
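
The following is a minimal sketch of how a CDC read schema could be derived from partition columns, not the actual `CDCReader.cdcReadSchema` implementation. `FieldSketch` and `CdcReadSchemaSketch` are hypothetical names introduced for illustration, and the sketch assumes that the CDC columns are the `_change_type`, `_commit_version` and `_commit_timestamp` columns of Change Data Feed output, appended after the partition columns.

```scala
// Hypothetical, simplified stand-in for a Spark SQL StructField (illustration only)
final case class FieldSketch(name: String, dataType: String)

object CdcReadSchemaSketch {
  // Assumption: the CDC metadata columns of Change Data Feed output
  val cdcColumns: Seq[FieldSketch] = Seq(
    FieldSketch("_change_type", "string"),
    FieldSketch("_commit_version", "long"),
    FieldSketch("_commit_timestamp", "timestamp"))

  // Appends the CDC columns to the given partition columns
  def cdcReadSchema(partitionFields: Seq[FieldSketch]): Seq[FieldSketch] =
    partitionFields ++ cdcColumns
}

// Usage: a table partitioned by a single (hypothetical) country column
val schema = CdcReadSchemaSketch.cdcReadSchema(Seq(FieldSketch("country", "string")))
```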
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
# ChangeDataFeedTableFeature | ||
|
||
`ChangeDataFeedTableFeature` is a [LegacyWriterFeature](../table-features/LegacyWriterFeature.md) with the following properties: | ||
|
||
Property | Value | ||
---------|------ | ||
[Name](../table-features/LegacyWriterFeature.md#name) | `changeDataFeed` | ||
[Minimum writer protocol version](../table-features/LegacyWriterFeature.md#minWriterVersion) | `4` | ||
|
||
`ChangeDataFeedTableFeature` is a [FeatureAutomaticallyEnabledByMetadata](../table-features/FeatureAutomaticallyEnabledByMetadata.md) that uses [delta.enableChangeDataFeed](../DeltaConfigs.md#enableChangeDataFeed) table property to control [Change Data Feed](index.md) feature. | ||
|
||
## metadataRequiresFeatureToBeEnabled { #metadataRequiresFeatureToBeEnabled } | ||
|
||
??? note "FeatureAutomaticallyEnabledByMetadata" | ||
|
||
```scala | ||
metadataRequiresFeatureToBeEnabled( | ||
metadata: Metadata, | ||
spark: SparkSession): Boolean | ||
``` | ||
|
||
`metadataRequiresFeatureToBeEnabled` is part of the [FeatureAutomaticallyEnabledByMetadata](../table-features/FeatureAutomaticallyEnabledByMetadata.md#metadataRequiresFeatureToBeEnabled) abstraction. | ||
|
||
`metadataRequiresFeatureToBeEnabled` is the value of [delta.enableChangeDataFeed](../DeltaConfigs.md#enableChangeDataFeed) table property in (the [configuration](../Metadata.md#configuration) of) the given [Metadata](../Metadata.md). |
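
The lookup described above can be sketched in plain Scala as follows. This is an illustration only, not the Delta Lake code: `MetadataSketch` and `ChangeDataFeedSketch` are hypothetical stand-ins, and the sketch assumes that the property is `false` when it is not set.

```scala
// Hypothetical, simplified stand-in for Delta Lake's Metadata (illustration only)
final case class MetadataSketch(configuration: Map[String, String])

object ChangeDataFeedSketch {
  val EnableChangeDataFeedKey = "delta.enableChangeDataFeed"

  // Assumption: a missing property is treated as false
  def metadataRequiresFeatureToBeEnabled(metadata: MetadataSketch): Boolean =
    metadata.configuration
      .get(EnableChangeDataFeedKey)
      .exists(_.trim.toLowerCase == "true")
}

// Usage: one table with the property enabled, one without any properties
val enabled = MetadataSketch(Map("delta.enableChangeDataFeed" -> "true"))
val unset = MetadataSketch(Map.empty)
```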
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
title: Table Properties | ||
nav: | ||
- index.md | ||
- ... |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
# Table Properties | ||
|
||
Delta Lake uses [DeltaConfigs](../DeltaConfigs.md) with the table properties of delta tables. | ||
|
||
Table properties start with `delta.` prefix. | ||
|
||
Table Properties can be set on delta tables using [ALTER TABLE SET TBLPROPERTIES](../commands/alter/AlterTableSetPropertiesDeltaCommand.md) or [CREATE TABLE](../commands/CreateDeltaTableCommand.md) SQL commands. | ||
|
||
```sql | ||
ALTER TABLE delta_demo | ||
SET TBLPROPERTIES (delta.enableChangeDataFeed = true) | ||
``` | ||
|
||
```sql | ||
CREATE TABLE delta_demo (id INT, name STRING, age INT) | ||
USING delta | ||
TBLPROPERTIES (delta.enableChangeDataFeed = true) | ||
``` | ||
|
||
Use `SHOW TBLPROPERTIES` SQL command to review the table properties of a delta table. | ||
|
||
```sql | ||
SHOW TBLPROPERTIES delta_demo; | ||
``` | ||
|
||
## SHOW TBLPROPERTIES | ||
|
||
Table properties can be displayed using `SHOW TBLPROPERTIES` SQL command: | ||
|
||
```sql | ||
SHOW TBLPROPERTIES <table_name> | ||
[(comma-separated properties)] | ||
``` | ||
|
||
--- | ||
|
||
```scala | ||
sql("SHOW TBLPROPERTIES delta.`/tmp/delta/t1`").show(truncate = false) | ||
``` | ||
|
||
```text | ||
+----------------------+-----+ | ||
|key |value| | ||
+----------------------+-----+ | ||
|delta.minReaderVersion|1 | | ||
|delta.minWriterVersion|2 | | ||
+----------------------+-----+ | ||
``` | ||
|
||
```scala | ||
sql("SHOW TBLPROPERTIES delta.`/tmp/delta/t1` (delta.minReaderVersion)").show(truncate = false) | ||
``` | ||
|
||
```text | ||
+----------------------+-----+ | ||
|key |value| | ||
+----------------------+-----+ | ||
|delta.minReaderVersion|1 | | ||
+----------------------+-----+ | ||
``` | ||
|
||
## ALTER TABLE SET TBLPROPERTIES | ||
|
||
Table properties can be set or unset using `ALTER TABLE` SQL command: | ||
|
||
```sql | ||
ALTER TABLE <table_name> SET TBLPROPERTIES (<key>=<value>) | ||
``` | ||
|
||
```sql | ||
ALTER TABLE <table_name> UNSET TBLPROPERTIES [IF EXISTS] ('key1', 'key2', ...); | ||
``` | ||
|
||
--- | ||
|
||
```scala | ||
sql("ALTER TABLE delta.`/tmp/delta/t1` SET TBLPROPERTIES (delta.enableExpiredLogCleanup=true)") | ||
``` | ||
|
||
```scala | ||
sql("SHOW TBLPROPERTIES delta.`/tmp/delta/t1` (delta.enableExpiredLogCleanup)").show(truncate = false) | ||
``` | ||
|
||
```text | ||
+-----------------------------+-----+ | ||
|key |value| | ||
+-----------------------------+-----+ | ||
|delta.enableExpiredLogCleanup|true | | ||
+-----------------------------+-----+ | ||
``` |