Renaming beta spark plugin in docs
treff7es committed Jul 22, 2024
1 parent 1425fb7 commit 8e08da7
Showing 6 changed files with 22 additions and 9 deletions.
2 changes: 1 addition & 1 deletion docs-website/filterTagIndexes.json
@@ -562,7 +562,7 @@
}
},
{
"Path": "docs/metadata-integration/java/spark-lineage-beta",
"Path": "docs/metadata-integration/java/acryl-spark-lineage",
"imgPath": "img/logos/platforms/spark.svg",
"Title": "Spark",
"Description": "Spark is a data processing tool that enables fast and efficient processing of large-scale data sets using distributed computing.",
4 changes: 2 additions & 2 deletions docs-website/sidebars.js
@@ -424,7 +424,7 @@ module.exports = {
},
{
type: "doc",
id: "metadata-integration/java/spark-lineage-beta/README",
id: "metadata-integration/java/acryl-spark-lineage/README",
label: "Spark",
},
//"docker/airflow/local_airflow",
@@ -886,7 +886,7 @@ module.exports = {
//"docs/how/graph-onboarding",
//"docs/demo/graph-onboarding",
//"metadata-integration/java/spark-lineage/README",
// "metadata-integration/java/spark-lineage-beta/README.md
// "metadata-integration/java/acryl-spark-lineage/README.md
// "metadata-integration/java/openlineage-converter/README"
//"metadata-ingestion-modules/airflow-plugin/README"
//"metadata-ingestion-modules/dagster-plugin/README"
8 changes: 4 additions & 4 deletions docs/lineage/openlineage.md
@@ -6,7 +6,7 @@ DataHub, now supports [OpenLineage](https://openlineage.io/) integration. With t

- **REST Endpoint Support**: DataHub now includes a REST endpoint that can understand OpenLineage events. This allows users to send lineage information directly to DataHub, enabling easy integration with various data processing frameworks.

- **[Spark Event Listener Plugin](https://datahubproject.io/docs/metadata-integration/java/spark-lineage-beta)**: DataHub provides a Spark Event Listener plugin that seamlessly integrates OpenLineage's Spark plugin. This plugin enhances DataHub's OpenLineage support by offering additional features such as PathSpec support, column-level lineage, patch support and more.
- **[Spark Event Listener Plugin](https://datahubproject.io/docs/metadata-integration/java/acryl-spark-lineage)**: DataHub provides a Spark Event Listener plugin that seamlessly integrates OpenLineage's Spark plugin. This plugin enhances DataHub's OpenLineage support by offering additional features such as PathSpec support, column-level lineage, patch support and more.

## OpenLineage Support with DataHub

@@ -73,7 +73,7 @@ The transport should look like this:
#### Known Limitations
With Spark and Airflow we recommend using the Spark Lineage or DataHub's Airflow plugin for tighter integration with DataHub.

- **[PathSpec](https://datahubproject.io/docs/metadata-integration/java/spark-lineage-beta/#configuring-hdfs-based-dataset-urns) Support**: While the REST endpoint supports OpenLineage messages, full [PathSpec](https://datahubproject.io/docs/metadata-integration/java/spark-lineage-beta/#configuring-hdfs-based-dataset-urns)) support is not yet available.
- **[PathSpec](https://datahubproject.io/docs/metadata-integration/java/acryl-spark-lineage/#configuring-hdfs-based-dataset-urns) Support**: While the REST endpoint supports OpenLineage messages, full [PathSpec](https://datahubproject.io/docs/metadata-integration/java/acryl-spark-lineage/#configuring-hdfs-based-dataset-urns)) support is not yet available.

- **Column-level Lineage**: DataHub's current OpenLineage support does not provide full column-level lineage tracking.
- etc...
@@ -83,10 +83,10 @@ DataHub's Spark Event Listener plugin enhances OpenLineage support by providing

#### How to Use

Follow the guides of the Spark Lineage plugin page for more information on how to set up the Spark Lineage plugin. The guide can be found [here](https://datahubproject.io/docs/metadata-integration/java/spark-lineage-beta)
Follow the guides of the Spark Lineage plugin page for more information on how to set up the Spark Lineage plugin. The guide can be found [here](https://datahubproject.io/docs/metadata-integration/java/acryl-spark-lineage/README.md)

## References

- [OpenLineage](https://openlineage.io/)
- [DataHub OpenAPI Guide](../api/openapi/openapi-usage-guide.md)
- [DataHub Spark Lineage Plugin](https://datahubproject.io/docs/metadata-integration/java/spark-lineage-beta)
- [DataHub Spark Lineage Plugin](https://datahubproject.io/docs/metadata-integration/java/acryl-spark-lineage/README.md)
2 changes: 1 addition & 1 deletion metadata-ingestion/docs/sources/databricks/README.md
@@ -11,7 +11,7 @@ The alternative way to integrate is via the Hive connector. The [Hive starter re

## Databricks Spark

To complete the picture, we recommend adding push-based ingestion from your Spark jobs to see real-time activity and lineage between your Databricks tables and your Spark jobs. Use the Spark agent to push metadata to DataHub using the instructions [here](../../../../metadata-integration/java/spark-lineage-beta/README.md#configuration-instructions-databricks).
To complete the picture, we recommend adding push-based ingestion from your Spark jobs to see real-time activity and lineage between your Databricks tables and your Spark jobs. Use the Spark agent to push metadata to DataHub using the instructions [here](../../../../metadata-integration/java/acryl-spark-lineage/README.md#configuration-instructions-databricks).

## Watch the DataHub Talk at the Data and AI Summit 2022

13 changes: 13 additions & 0 deletions metadata-integration/java/acryl-spark-lineage/README.md
@@ -346,10 +346,23 @@ Use Java 8 to build the project. The project uses Gradle as the build tool. To b
+
## Changelog
### Next
- Add Kafka emitter to emit lineage to kafka
- Add File emitter to emit lineage to file
- Upgrading OpenLineage to 1.18.0
- Renaming project to acryl-datahub-spark-lineage
- Supporting OpenLineage 1.17+ glue identifier changes
- Removing custom
-
### Version 0.2.14
- Fix warning about MeterFilter warning from Micrometer
### Version 0.2.13
- Add kafka emitter to emit lineage to kafka
### Version 0.2.12
- Silencing some chatty warnings in RddPathUtils
### Version 0.2.11
2 changes: 1 addition & 1 deletion metadata-integration/java/spark-lineage-legacy/README.md
@@ -2,7 +2,7 @@

:::note

This is our legacy Spark Integration which is replaced by [Acryl Spark Lineage](https://datahubproject.io/docs/metadata-integration/java/spark-lineage-beta)
This is our legacy Spark Integration which is replaced by [Acryl Spark Lineage](https://datahubproject.io/docs/metadata-integration/java/acryl-spark-lineage)

:::

