Docpoc - Add backfill docs for Hubble #1362

amishas157 · 2025-03-06T21:10:49Z

Closes #1339

This PR:

- Adds documentation for backfilling using Hubble
- Pulls the "Connecting to BigQuery" section out of the analyst guide
- Renames the analyst guide to developer guide (and fixes the associated URLs)

stellar-jenkins · 2025-03-06T21:21:35Z

Something went wrong with PR preview build please check

stellar-jenkins · 2025-03-06T21:30:53Z

Something went wrong with PR preview build please check

stellar-jenkins · 2025-03-06T21:36:30Z

Something went wrong with PR preview build please check

stellar-jenkins · 2025-03-06T21:40:20Z

Something went wrong with PR preview build please check

stellar-jenkins · 2025-03-06T21:42:11Z

Something went wrong with PR preview build please check

stellar-jenkins · 2025-03-06T21:50:20Z

Something went wrong with PR preview build please check

stellar-jenkins · 2025-03-06T21:57:11Z

Something went wrong with PR preview build please check

stellar-jenkins · 2025-03-06T22:08:35Z

Something went wrong with PR preview build please check

stellar-jenkins · 2025-03-06T22:20:39Z

Something went wrong with PR preview build please check

stellar-jenkins · 2025-03-06T22:40:01Z

Preview is available here:
http://developers-pr1362.previews.kube001.services.stellar-ops.com

chowbao · 2025-03-07T16:35:41Z

Note that I think we might have to reorganize/rename backfill. I think it's gonna become an overloaded term in the near future with galexie backfilling, stellar-etl backfilling, rpc backfilling, etc...

sydneynotthecity · 2025-03-07T19:42:28Z

src/pages/index.mdx

 | [Data catalog](/docs/data/hubble/data-catalog) | View all Hubble data catalog information. | Learn |
-| [Admin guide](/docs/data/hubble/admin-guide) | A comprehensive guide that will teach you how to run your own Hubble analytics platform | Tutorial |
+| [Developer guide](/docs/data/hubble/developer-guide) | A comprehensive guide that will teach you how to run your own Hubble analytics platform | Tutorial |


The developer guide encompasses more than just running stellar-etl. It may be useful to add something about building custom data pipelines using Hubble as a source.

sydneynotthecity · 2025-03-07T19:49:17Z

docs/data/hubble/developer-guide/backfill/README.mdx

This describes the scenario in which someone already has a data warehouse set up with full history loaded, but it does not cover cases where a developer wants to perform an initial backfill. I don't think our recommendation would be to use the JS UDF. It would be more efficient for someone to either export the BigQuery table into a format they could ingest, or connect via SDK and pull in the data they need with a query.

Can you include this option on the page as well?

chowbao · 2025-03-07 (edited)

Hmm, do you think this needs to go in this PR?

I was thinking the larger backfill would be a different body of work. As I said in my earlier comment, "backfill" is going to be super overloaded. The initial backfill should be its own section, with options for using galexie, RPC, Hubble, and a third-party hosted data lake.

Rephrasing: I think we should save the initial backfill for a separate doc and rename this one to "using UDF to <blank>".

sydneynotthecity · 2025-03-07T19:52:06Z

docs/data/hubble/developer-guide/backfill/README.mdx

+
+- **Bug Fix:** You resolve a bug and need to re-ingest a specific data column.
+- **New Feature:** You add a new data column as part of a feature request and need to backfill data for the newly added column.
+- **Raw Data Extraction:** You want to use Hubble as a source for raw data (XDR columns) and extract only the required data columns. For scenarios 1 and 2, you can perform a backfill using Airflow and trigger a Directed Acyclic Graph (DAG) for past dates.


I like how you've outlined the different scenarios for when you need to perform a backfill. I think that's very clear. I think you should have a similar section that outlines the different options for how to backfill, i.e.:

Options:

- Run stellar-etl and trigger a DAG for past dates
- Export data from BigQuery (via SDK connection + SQL, or by exporting the data into files)
- JS UDF

This would allow you to link out to subpages that give more detail on these options as necessary.
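
As a rough illustration of the first two options, the sketch below splits a past date range into daily windows (the kind of unit a DAG run for a past date would cover) and builds one SQL query per window. It is only a sketch under stated assumptions: the table name, the `closed_at` filter column, and the selected columns are placeholders that were not given in this thread, and the helper names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterator, Sequence, Tuple

# Assumption: placeholder table and column names, not confirmed in this PR.
TABLE = "crypto-stellar.crypto_stellar.history_transactions"


def daily_windows(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield [day, next_day) windows from start (inclusive) to end (exclusive)."""
    day = start
    while day < end:
        next_day = day + timedelta(days=1)
        yield day, next_day
        day = next_day


def backfill_query(
    window_start: date,
    window_end: date,
    columns: Sequence[str] = ("transaction_hash", "tx_envelope"),
) -> str:
    """Build a query that selects only the needed columns for one window."""
    cols = ", ".join(columns)
    return (
        f"SELECT {cols} FROM `{TABLE}` "
        f"WHERE closed_at >= TIMESTAMP('{window_start.isoformat()}') "
        f"AND closed_at < TIMESTAMP('{window_end.isoformat()}')"
    )


if __name__ == "__main__":
    # Print one query per day for a three-day backfill range.
    for lo, hi in daily_windows(date(2025, 3, 1), date(2025, 3, 4)):
        print(backfill_query(lo, hi))
```

Each generated string could then be handed to whichever BigQuery client or export job is used; keeping the windows small makes a failed day cheap to retry.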

sydneynotthecity · 2025-03-07T19:54:44Z

docs/data/hubble/developer-guide/backfill/JS-UDF.mdx

+sidebar_position: 0
+---
+
+This document outlines methods to extract required fields from the XDR of raw data. We'll take the example of extracting the `fee_account_muxed` field from a transaction envelope (`tx_meta` XDR). However, this method can be adapted to other fields as well. It is worth noting that most users will not need to standup and run their own Hubble. The Stellar Development Foundation provides public access to the data through the public datasets and tables in GCP BigQuery. Instructions on how to access this data can be found in the [Connecting](../../developer-guide/connecting-to-bigquery/README.mdx) section.


I think this is missing a line that gives context as to why you would need to do this. Something simple like: "Hubble does not parse every single field available in raw XDR, but it does save the raw transaction meta in case you need to extract a field directly from the XDR."
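
To make "extract a field directly from the XDR" concrete, here is a minimal Python sketch (not the JS UDF from the PR) that reads the fee account from a base64-encoded fee-bump transaction envelope. It assumes the raw column holds a `TransactionEnvelope` and relies on the Stellar XDR layout in which a fee-bump envelope starts with its type discriminant followed by the `feeSource` `MuxedAccount`. The function name and the returned dictionary shape are hypothetical, and it returns the raw key as hex rather than a strkey address.

```python
import base64
import struct
from typing import Optional

# Discriminant values from the Stellar XDR definitions.
ENVELOPE_TYPE_TX_FEE_BUMP = 5
KEY_TYPE_ED25519 = 0
KEY_TYPE_MUXED_ED25519 = 0x100


def fee_account_from_envelope(envelope_b64: str) -> Optional[dict]:
    """Return the fee source of a fee-bump envelope, or None for other types."""
    raw = base64.b64decode(envelope_b64)
    # XDR integers are big-endian; the envelope type is the first 4 bytes.
    (envelope_type,) = struct.unpack_from(">i", raw, 0)
    if envelope_type != ENVELOPE_TYPE_TX_FEE_BUMP:
        return None
    # The MuxedAccount union follows: a key-type discriminant, then the body.
    (key_type,) = struct.unpack_from(">i", raw, 4)
    if key_type == KEY_TYPE_ED25519:
        return {"muxed_id": None, "ed25519": raw[8:40].hex()}
    if key_type == KEY_TYPE_MUXED_ED25519:
        # Muxed body: uint64 id first, then the 32-byte ed25519 key.
        (muxed_id,) = struct.unpack_from(">Q", raw, 8)
        return {"muxed_id": muxed_id, "ed25519": raw[16:48].hex()}
    raise ValueError(f"unexpected key type: {key_type}")
```

A production pipeline would more likely use a maintained XDR library than hand-rolled offsets; the point here is only that the field is recoverable from the raw bytes Hubble stores.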

amishas157 force-pushed the patch/improve-hubble-docs branch 2 times, most recently from ee2a717 to 34f6caa Compare March 6, 2025 21:29

amishas157 changed the title ~~Add docs for backfilling~~ Docpoc - Add backfill docs for Hubble Mar 6, 2025

amishas157 marked this pull request as ready for review March 6, 2025 22:19

amishas157 requested a review from a team March 6, 2025 22:20

amishas157 added 7 commits March 6, 2025 16:25

Add docs for backfilling

d43aca7

update

d14d9dd

update paths

05b57c4

udpate ref udpate ref

Add reference for connecting to bigquery

4f9e3b9

lint

353ceb1

update references

4d34bba

update ref

ad159e4

amishas157 force-pushed the patch/improve-hubble-docs branch from 9e58e0b to ad159e4 Compare March 6, 2025 22:25

chowbao approved these changes Mar 7, 2025


sydneynotthecity reviewed Mar 7, 2025
