CT-2414: Add graph summaries to target directory output #7358

peterallenwebb · 2023-04-13T21:21:24Z

resolves #7357

Description

Adds a new graph summary output file named graph_summary.json to the target directory output.

Checklist

I have read the contributing guide and understand what's expected of me
I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have opened an issue to add/update docs, or docs changes are not required/relevant for this PR
I have run changie new to create a changelog entry

github-actions · 2023-04-13T21:21:37Z

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

dbeatty10 · 2023-04-14T19:46:58Z

To keep the number of artifacts to a minimum, could we just write out graph_with_test_edges.json?

From there, consumers could filter out any resource types they don't want.

I know we started to discuss that idea here, but now that you have this pull request, it's probably useful to move that conversation here.

dbeatty10 · 2023-04-25T18:32:47Z

@peterallenwebb Would you be open to just a single JSON artifact that is a superset of both artifacts?

That way, there's just one artifact to produce/consume, and the end user can filter out any nodes they want to ignore.

One benefit is that it would make the docs and implementation simpler.
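
To make the filtering idea concrete, here is a minimal sketch of how a consumer might drop unwanted resource types from a single superset graph. The node layout (`name`, `type`, and an optional `succ` list of integer indexes) is an assumption made for illustration, and `drop_types` is a hypothetical helper, not part of dbt.

```python
from typing import Any, Dict, Set

# Assumed (hypothetical) layout: index -> {"name": str, "type": str, "succ": [int, ...]}
Summary = Dict[int, Dict[str, Any]]


def drop_types(graph: Summary, unwanted: Set[str]) -> Summary:
    """Return a copy of the graph without nodes of the unwanted types.

    Edges that point at a removed node are pruned as well.
    """
    kept = {i: n for i, n in graph.items() if n["type"] not in unwanted}
    result: Summary = {}
    for index, node in kept.items():
        out: Dict[str, Any] = {"name": node["name"], "type": node["type"]}
        succ = [s for s in node.get("succ", []) if s in kept]
        if succ:  # omit the key entirely when no successors remain
            out["succ"] = succ
        result[index] = out
    return result


# Illustrative data only; these node names are made up.
example: Summary = {
    0: {"name": "model.a", "type": "model", "succ": [1, 2]},
    1: {"name": "model.b", "type": "model"},
    2: {"name": "test.not_null_a", "type": "test", "succ": [1]},
}
filtered = drop_types(example, {"test"})
```

Note that this only removes nodes and the edges touching them; whether the result matches the graph without test edges depends on how those edges are added.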

peterallenwebb · 2023-04-25T19:25:29Z

@dbeatty10 I am just getting back to working on this, and not disregarding your earlier request, I promise. Using a single file will complicate the implementation, especially since the tests are added on one code path but not another, so I'm trying to figure out how to keep both the implementation and the output simple.

I think I am going to spin the snowplow stats off as a separate case, though.

dbeatty10 · 2023-04-26T00:11:59Z

No worries, @peterallenwebb!

Spinning off the snowplow stats seems like a nice, complementary effort.

I was mainly wondering what would be lost if you just dropped this line:

linker.write_graph_summary("linked", out_stream, manifest)

Is it because graph_with_test_edges.json would not be a simple superset of graph.json?

peterallenwebb · 2023-04-26T02:21:21Z

@dbeatty10 There is some debate about whether the algorithm for adding the test edges is completely correct, and we know it has performance issues that we'd like to address, so my idea here is to make sure we know exactly what the graph looks like before and after that algorithm runs. That way we'll have a known starting point for refining and/or optimizing that algorithm.

I'm going to take one more pass at making this simpler.
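
As a rough illustration of the before/after comparison described above, the sketch below computes which edges exist only in the second of two graph snapshots. The node layout (`name`, `type`, optional `succ` indexes) and the helper names `edge_set` and `added_edges` are assumptions for this example, not the actual dbt implementation. Edges are compared by node name because integer indexes are not guaranteed to line up across snapshots.

```python
from typing import Any, Dict, Set, Tuple

# Assumed (hypothetical) layout: index -> {"name": str, "type": str, "succ": [int, ...]}
Summary = Dict[int, Dict[str, Any]]
Edge = Tuple[str, str]


def edge_set(graph: Summary) -> Set[Edge]:
    """Flatten a graph into (parent_name, child_name) pairs."""
    return {
        (node["name"], graph[succ]["name"])
        for node in graph.values()
        for succ in node.get("succ", [])
    }


def added_edges(before: Summary, after: Summary) -> Set[Edge]:
    """Edges present in `after` but not in `before`."""
    return edge_set(after) - edge_set(before)


# Illustrative data only; these node names are made up.
before: Summary = {
    0: {"name": "model.a", "type": "model", "succ": [1, 2]},
    1: {"name": "model.b", "type": "model"},
    2: {"name": "test.t", "type": "test"},
}
after: Summary = {
    0: {"name": "model.a", "type": "model", "succ": [1, 2]},
    1: {"name": "model.b", "type": "model"},
    2: {"name": "test.t", "type": "test", "succ": [1]},
}
new_edges = added_edges(before, after)
```

A diff like this gives a fixed reference point when changing the edge-adding algorithm: the set of added edges should stay the same unless a change is intentional.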

iknox-fa

LGTM-- I bet we'll want to make some tweaks at some point (esp wrt test edges), but this seems like a very sane starting place.

aranke

Can we write a test here?

peterallenwebb · 2023-04-28T19:30:15Z

Added a test as suggested and included the invocation_id in the summary so that it could more easily be correlated with logging events and other artifacts of the same dbt invocation.
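
A minimal sketch of the correlation this enables, assuming the summary stores the id under a top-level `_invocation_id` key and that log events are JSON lines carrying `info.invocation_id`. Both of those field locations are assumptions for illustration, and `events_for_summary` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List


def events_for_summary(summary_text: str, log_lines: List[str]) -> List[Dict[str, Any]]:
    """Return the log events that share the summary's invocation id."""
    summary = json.loads(summary_text)
    invocation_id = summary["_invocation_id"]  # assumed key name
    events = [json.loads(line) for line in log_lines if line.strip()]
    return [
        event
        for event in events
        if event.get("info", {}).get("invocation_id") == invocation_id  # assumed log shape
    ]


# Illustrative data only; ids and messages are made up.
summary_text = json.dumps({"_invocation_id": "abc-123", "linked": {}, "with_test_edges": {}})
log_lines = [
    json.dumps({"info": {"invocation_id": "abc-123", "msg": "first run"}}),
    json.dumps({"info": {"invocation_id": "zzz-999", "msg": "other run"}}),
]
matched = events_for_summary(summary_text, log_lines)
```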

peterallenwebb · 2023-04-28T20:30:53Z

@boxysean You may find this interesting.

CT-2414: Add graph summaries to target directory output

ccf9b01

peterallenwebb requested review from a team and aranke April 13, 2023 21:21

cla-bot bot added the cla:yes label Apr 13, 2023

peterallenwebb added 2 commits April 25, 2023 14:23

CT-2414: Make graph representation more compact

ebeacc5

CT-2414: Add changelog entry

7da4400

peterallenwebb requested a review from a team as a code owner April 25, 2023 18:26

peterallenwebb requested a review from MichelleArk April 25, 2023 18:26

peterallenwebb added 3 commits April 25, 2023 16:31

CT-2414: Remove temporary diagnostic code.

0956772

Merge remote-tracking branch 'origin/main' into paw/CT-2414-graph-sum…

d7030da

…mary

CT-2414: Combine graphs into a single file

6936411

CT-2414: Simplify graph summary format.

5754e55

iknox-fa approved these changes Apr 27, 2023

aranke reviewed Apr 27, 2023

CT-2414: Add invocation id to summary, add unit test

31e86e9

peterallenwebb requested a review from a team as a code owner April 28, 2023 19:23

peterallenwebb requested review from emmyoop and removed request for a team April 28, 2023 19:23

peterallenwebb merged commit c56a9b2 into main Apr 28, 2023

peterallenwebb deleted the paw/CT-2414-graph-summary branch April 28, 2023 20:00

aranke mentioned this pull request May 9, 2023

merge from main into 1.5.latest #7576

Closed

6 tasks
