diff --git a/docs/OptimisticTransactionImpl.md b/docs/OptimisticTransactionImpl.md
index 2dffc6f033..2caeb16199 100644
--- a/docs/OptimisticTransactionImpl.md
+++ b/docs/OptimisticTransactionImpl.md
@@ -698,11 +698,17 @@ registerPostCommitHook(
   hook: PostCommitHook): Unit
 ```
 
+??? warning "Procedure"
+    `registerPostCommitHook` is a procedure (returns `Unit`) so _what happens inside stays inside_ (paraphrasing the [former advertising slogan of Las Vegas, Nevada](https://idioms.thefreedictionary.com/what+happens+in+Vegas+stays+in+Vegas)).
+
 `registerPostCommitHook` registers (_adds_) the given [PostCommitHook](post-commit-hooks/PostCommitHook.md) to the [postCommitHooks](#postCommitHooks) internal registry.
 
+---
+
 `registerPostCommitHook` is used when:
 
-* `OptimisticTransactionImpl` is created (and registers [CheckpointHook](checkpoints/CheckpointHook.md)) and [commitImpl](#commitImpl) (to register [GenerateSymlinkManifest](post-commit-hooks/GenerateSymlinkManifest.md))
+* `OptimisticTransactionImpl` is requested to [commitImpl](#commitImpl)
+* `TransactionalWrite` is requested to [write data out](TransactionalWrite.md#writeFiles)
 
 ## setNewProtocolWithFeaturesEnabledByMetadata { #setNewProtocolWithFeaturesEnabledByMetadata }
 
diff --git a/docs/auto-compaction/.pages b/docs/auto-compaction/.pages
new file mode 100644
index 0000000000..d9799cbda6
--- /dev/null
+++ b/docs/auto-compaction/.pages
@@ -0,0 +1,4 @@
+title: Auto Compaction
+nav:
+  - index.md
+  - ...
diff --git a/docs/auto-compaction/AutoCompact.md b/docs/auto-compaction/AutoCompact.md
new file mode 100644
index 0000000000..18029e9e53
--- /dev/null
+++ b/docs/auto-compaction/AutoCompact.md
@@ -0,0 +1,11 @@
+# AutoCompact
+
+`AutoCompact` is an [AutoCompactBase](AutoCompactBase.md).
+
+??? note "case object"
+    `AutoCompact` is a `case object` in Scala which means it is a class that has exactly one instance (itself).
+    A `case object` is created lazily when it is referenced, like a `lazy val`.
+
+    Learn more in [Tour of Scala](https://docs.scala-lang.org/tour/singleton-objects.html).
+
+`AutoCompact` is [registered](../OptimisticTransactionImpl.md#registerPostCommitHook) when `TransactionalWrite` is requested to [write data out](../TransactionalWrite.md#writeFiles) and there are indeed new files added and it is not the [Optimize](../commands/optimize/index.md) command.
diff --git a/docs/auto-compaction/AutoCompactBase.md b/docs/auto-compaction/AutoCompactBase.md
new file mode 100644
index 0000000000..169d7aa089
--- /dev/null
+++ b/docs/auto-compaction/AutoCompactBase.md
@@ -0,0 +1,60 @@
+# AutoCompactBase
+
+`AutoCompactBase` is an [extension](#contract) of the [PostCommitHook](../post-commit-hooks/PostCommitHook.md) abstraction for [post-commit hooks](#implementations) that [perform auto compaction](#run).
+
+## Implementations
+
+* [AutoCompact](AutoCompact.md)
+
+## Name { #name }
+
+??? note "PostCommitHook"
+
+    ```scala
+    name: String
+    ```
+
+    `name` is part of the [PostCommitHook](../post-commit-hooks/PostCommitHook.md#name) abstraction.
+
+`name` is **Auto Compact**.
+
+## Executing Post-Commit Hook { #run }
+
+??? note "PostCommitHook"
+
+    ```scala
+    run(
+      spark: SparkSession,
+      txn: OptimisticTransactionImpl,
+      committedVersion: Long,
+      postCommitSnapshot: Snapshot,
+      actions: Seq[Action]): Unit
+    ```
+
+    `run` is part of the [PostCommitHook](../post-commit-hooks/PostCommitHook.md#run) abstraction.
+
+`run` [determines the type of AutoCompact](#getAutoCompactType).
+
+`run` returns (and hence skips auto compacting) when [shouldSkipAutoCompact](#shouldSkipAutoCompact) is enabled.
+
+In the end, `run` [compactIfNecessary](#compactIfNecessary) with the following:
+
+* `delta.commit.hooks.autoOptimize` operation name
+* `maxDeletedRowsRatio` unspecified (`None`)
+
+### compactIfNecessary { #compactIfNecessary }
+
+```scala
+compactIfNecessary(
+  spark: SparkSession,
+  txn: OptimisticTransactionImpl,
+  postCommitSnapshot: Snapshot,
+  opType: String,
+  maxDeletedRowsRatio: Option[Double]): Seq[OptimizeMetrics]
+```
+
+`compactIfNecessary` [prepareAutoCompactRequest](AutoCompactUtils.md#prepareAutoCompactRequest).
+
+When [shouldCompact](AutoCompactRequest.md#shouldCompact) is disabled, `compactIfNecessary` returns no [OptimizeMetrics](../commands/optimize/OptimizeMetrics.md).
+
+Otherwise, with [shouldCompact](AutoCompactRequest.md#shouldCompact) turned on, `compactIfNecessary` [performs auto compaction](AutoCompact.md#compact).
diff --git a/docs/auto-compaction/AutoCompactRequest.md b/docs/auto-compaction/AutoCompactRequest.md
new file mode 100644
index 0000000000..2bac5623a4
--- /dev/null
+++ b/docs/auto-compaction/AutoCompactRequest.md
@@ -0,0 +1,3 @@
+# AutoCompactRequest
+
+`AutoCompactRequest` is...FIXME
diff --git a/docs/auto-compaction/AutoCompactUtils.md b/docs/auto-compaction/AutoCompactUtils.md
new file mode 100644
index 0000000000..ccb307666f
--- /dev/null
+++ b/docs/auto-compaction/AutoCompactUtils.md
@@ -0,0 +1,21 @@
+# AutoCompactUtils
+
+## prepareAutoCompactRequest { #prepareAutoCompactRequest }
+
+```scala
+prepareAutoCompactRequest(
+  spark: SparkSession,
+  txn: OptimisticTransactionImpl,
+  postCommitSnapshot: Snapshot,
+  partitionsAddedToOpt: Option[PartitionKeySet],
+  opType: String,
+  maxDeletedRowsRatio: Option[Double]): AutoCompactRequest
+```
+
+`prepareAutoCompactRequest`...FIXME
+
+---
+
+`prepareAutoCompactRequest` is used when:
+
+* `AutoCompactBase` is requested to [compactIfNecessary](AutoCompactBase.md#compactIfNecessary)
diff --git a/docs/auto-compaction/index.md b/docs/auto-compaction/index.md
new file mode 100644
index 0000000000..fc6f8648ea
--- /dev/null
+++ b/docs/auto-compaction/index.md
@@ -0,0 +1,3 @@
+# Auto Compaction
+
+The **Auto Compaction** feature in Delta Lake uses the [AutoCompact](AutoCompact.md) post-commit hook to [run](AutoCompactBase.md#run) at a [successful transaction commit](../OptimisticTransactionImpl.md#registerPostCommitHook).
diff --git a/docs/post-commit-hooks/PostCommitHook.md b/docs/post-commit-hooks/PostCommitHook.md
index e9e492a3ec..5cb6f7f02b 100644
--- a/docs/post-commit-hooks/PostCommitHook.md
+++ b/docs/post-commit-hooks/PostCommitHook.md
@@ -1,6 +1,6 @@
 # PostCommitHook
 
-`PostCommitHook` is an [abstraction](#contract) of [post-commit hooks](#implementations) to be [executed](#run) right after a successful [transaction commit](../OptimisticTransactionImpl.md#commit)).
+`PostCommitHook` is an [abstraction](#contract) of [post-commit hooks](#implementations) to be [executed](#run) at the end of a successful [transaction commit](../OptimisticTransactionImpl.md#commit).
 
 ## Contract
 
@@ -14,6 +14,7 @@ User-friendly name of the hook (for error reporting)
 
 See:
 
+* [AutoCompactBase](../auto-compaction/AutoCompactBase.md#name)
 * [CheckpointHook](../checkpoints/CheckpointHook.md#name)
 
 Used when:
@@ -32,8 +33,12 @@ run(
   committedActions: Seq[Action]): Unit
 ```
 
+??? warning "Procedure"
+    `run` is a procedure (returns `Unit`) so _what happens inside stays inside_ (paraphrasing the [former advertising slogan of Las Vegas, Nevada](https://idioms.thefreedictionary.com/what+happens+in+Vegas+stays+in+Vegas)).
+
 See:
 
+* [AutoCompactBase](../auto-compaction/AutoCompactBase.md#run)
 * [CheckpointHook](../checkpoints/CheckpointHook.md#run)
 
 Used when:
@@ -42,5 +47,8 @@ Used when:
 
 ## Implementations
 
+* [AutoCompactBase](../auto-compaction/AutoCompactBase.md)
 * [CheckpointHook](../checkpoints/CheckpointHook.md)
 * [GenerateSymlinkManifestImpl](GenerateSymlinkManifest.md)
+* `IcebergConverterHook`
+* `UpdateCatalogBase`
diff --git a/mkdocs.yml b/mkdocs.yml
index 08b09fb31e..20af687337 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -152,6 +152,7 @@ nav:
   - Features:
     - features/index.md
     - ... | append-only-tables/**.md
+    - ... | auto-compaction/**.md
     - ... | change-data-feed/**.md
     - ... | table-valued-functions/**.md
     - ... | check-constraints/**.md