Change Type Column
jaceklaskowski committed Jan 28, 2024
1 parent d8c10ad commit 944fa03
Showing 8 changed files with 34 additions and 16 deletions.
23 changes: 8 additions & 15 deletions docs/change-data-feed/CDCReader.md
@@ -48,23 +48,16 @@ Used when:

 `CDCReader` defines `_change_type` column name that represents the type of a data change.

-`_change_type` is a [CDF virtual column](#CDC_COLUMNS_IN_DATA) and among the columns in the [CDF-aware read schema](CDCReaderImpl.md#cdcReadSchema).
-
-`_change_type` is among the [cdcAttributes](#cdcAttributes)
+Change Type | Command
+------------|--------
+[delete](#CDC_TYPE_DELETE_STRING) | [Delete](../commands/delete/DeleteCommand.md#rewriteFiles)
+[insert](#CDC_TYPE_INSERT) | [WriteIntoDelta](../commands/WriteIntoDelta.md#write)
+[update_postimage](#CDC_TYPE_UPDATE_POSTIMAGE) | [Update](../commands/update/UpdateCommand.md#withUpdatedColumns)
+[update_preimage](#CDC_TYPE_UPDATE_PREIMAGE) | [Update](../commands/update/UpdateCommand.md#withUpdatedColumns)

-Used when:
+`_change_type` is a [CDF virtual column](#CDC_COLUMNS_IN_DATA) and among the columns in the [CDF-aware read schema](CDCReaderImpl.md#cdcReadSchema).

-* `DeleteCommand` is requested to [rewriteFiles](../commands/delete/DeleteCommand.md#rewriteFiles) (with [Change Data Feed](index.md) enabled)
-* `UpdateCommand` is requested to [withUpdatedColumns](../commands/update/UpdateCommand.md#withUpdatedColumns) (with [Change Data Feed](index.md) enabled to add `update_preimage` and `update_postimage` columns)
-* `WriteIntoDelta` is requested to [write](../commands/WriteIntoDelta.md#write) (for `insert`s)
-* `CDCReader` is requested for the [CDC_COLUMNS_IN_DATA](#CDC_COLUMNS_IN_DATA), the [cdcAttributes](#cdcAttributes)
-* `CDCReaderImpl` is requested for the [cdcReadSchema](CDCReaderImpl.md#cdcReadSchema)
-* `ClassicMergeExecutor` is requested to [writeAllChanges](../commands/merge/ClassicMergeExecutor.md#writeAllChanges) (with [Change Data Feed](index.md) enabled)
-* `MergeOutputGeneration` is requested to [deduplicateCDFDeletes](../commands/merge/MergeOutputGeneration.md#deduplicateCDFDeletes) and [generateCdcAndOutputRows](../commands/merge/MergeOutputGeneration.md#generateCdcAndOutputRows)
-* `CdcAddFileIndex` is requested for the [matching files](CdcAddFileIndex.md#matchingFiles)
-* `TahoeRemoveFileIndex` is requested for the [matching files](TahoeRemoveFileIndex.md#matchingFiles)
-* `TransactionalWrite` is requested to [performCDCPartition](../TransactionalWrite.md#performCDCPartition)
-* `SchemaUtils` is requested to [normalizeColumnNames](../SchemaUtils.md#normalizeColumnNames)
+`_change_type` is among the [cdcAttributes](#cdcAttributes).

 ### <span id="_commit_version"> Commit Version Column { #CDC_COMMIT_VERSION }

2 changes: 1 addition & 1 deletion docs/change-data-feed/index.md
@@ -65,7 +65,7 @@ Change Data Feed is enabled in batch and streaming queries using [readChangeFeed
 .table("source")
 ```

-`readChangeFeed` is used alongside the other CDC options:
+`readChangeFeed` is used alongside the other CDF options:

 * [startingVersion](../spark-connector/DeltaDataSource.md#CDC_START_VERSION_KEY)
 * [startingTimestamp](../spark-connector/DeltaDataSource.md#CDC_START_TIMESTAMP_KEY)
4 changes: 4 additions & 0 deletions docs/constraints/Constraints.md
@@ -1,3 +1,7 @@
+---
+title: Constraints
+---
+
 # Constraints Utility

 ## <span id="getAll"> Extracting All Constraints
4 changes: 4 additions & 0 deletions docs/constraints/DeltaInvariantChecker.md
@@ -1,3 +1,7 @@
+---
+title: DeltaInvariantChecker
+---
+
 # DeltaInvariantChecker Unary Logical Operator

 `DeltaInvariantChecker` is a `UnaryNode` ([Spark SQL]({{ book.spark_sql }}/logical-operators/UnaryNode/)).
4 changes: 4 additions & 0 deletions docs/constraints/DeltaInvariantCheckerExec.md
@@ -1,3 +1,7 @@
+---
+title: DeltaInvariantCheckerExec
+---
+
 # DeltaInvariantCheckerExec Unary Physical Operator

 `DeltaInvariantCheckerExec` is an `UnaryExecNode` ([Spark SQL]({{ book.spark_sql }}/physical-operators/UnaryExecNode)) to [assert constraints](#doExecute).
4 changes: 4 additions & 0 deletions docs/constraints/DeltaInvariantCheckerStrategy.md
@@ -1,3 +1,7 @@
+---
+title: DeltaInvariantCheckerStrategy
+---
+
 # DeltaInvariantCheckerStrategy Execution Planning Strategy

 `DeltaInvariantCheckerStrategy` is a `SparkStrategy` ([Spark SQL]({{ book.spark_sql }}/execution-planning-strategies/SparkStrategy/)) to [plan](#apply) a [DeltaInvariantChecker](DeltaInvariantChecker.md) unary logical operator (with constraints attached) into a [DeltaInvariantCheckerExec](DeltaInvariantCheckerExec.md) for execution.
5 changes: 5 additions & 0 deletions docs/constraints/index.md
@@ -1,3 +1,8 @@
+---
+hide:
+- toc
+---
+
 # Table Constraints

 Table constraints can be one of the following:
4 changes: 4 additions & 0 deletions docs/data-skipping/PrepareDeltaScan.md
@@ -1,3 +1,7 @@
+---
+title: PrepareDeltaScan
+---
+
 # PrepareDeltaScan Logical Optimization

 `PrepareDeltaScan` is a [PrepareDeltaScanBase](PrepareDeltaScanBase.md).