Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(diagnostics) add diff endpoints #6131

Merged
merged 15 commits into from
Dec 10, 2024
Merged

feat(diagnostics) add diff endpoints #6131

merged 15 commits into from
Dec 10, 2024

Conversation

rainest
Copy link
Contributor

@rainest rainest commented Jun 4, 2024

What this PR does / why we need it:

Adds a diff endpoint to the diagnostic server and sets up infrastructure for DB mode updates to ship their diffs to the diagnostic server. This endpoint is available when sensitive config dumps are enabled, as it includes diffs of sensitive resources. We could add filters to exclude certs/passwords/etc. while retaining other items, but for the initial MVP, easiest to just require an opt in to sensitive diagnostic output.

Which issue this PR fixes:

Fix #5289

Special notes for your reviewer:

This requires Kong/go-database-reconciler#119. My original work in GDR was only sending error events; it should have been sending everything. This currently has a tmp commit against to update the module to the branch. That GDR PR needs to merged and tagged as fix release before merging this.

Per Kong/go-database-reconciler#120 I observed an odd bug with mysterious empty events 👻 Dunno where these are from, but trying to figure that out will take more time than is available. As a workaround, the event handler on the KIC side discards any event with no action, as such events should never be of interest.

Manual testing via running TEST_DATABASE_MODE=postgres KONG_TEST_CLUSTER=kind:kind GOFLAGS= dlv test -- -test.run SomeTestThatModifiesConfig. Almost any test will do, since they almost all modify config. You can either set up a watch curl -sv localhost:10256/debug/config/diff-report to see it over time or debugger pause after the diff send to see an individual diff. Output looks something like example.txt.

Unit testing isn't exhaustive, but covers most of the endpoint functionality. It checks the KIC-side behavior, not content, with the expectation that if content's broken, that's GDR's problem.

PR Readiness Checklist:

Complete these before marking the PR as ready to review:

  • the CHANGELOG.md release notes have been updated to reflect any significant (and particularly user-facing) changes introduced by this PR

@rainest rainest added this to the KIC v3.3.x milestone Jun 24, 2024
@rainest
Copy link
Contributor Author

rainest commented Jun 24, 2024

Moving into 3.3 as a stretch. IIRC this should be 50% done, just dropped from 3.2 between time constraints and low priority.

@pull-request-size pull-request-size bot added size/XL and removed size/L labels Aug 2, 2024
@rainest rainest force-pushed the feat/diff-db branch 3 times, most recently from 02c432c to a5e9832 Compare August 2, 2024 01:44
@rainest rainest marked this pull request as ready for review August 2, 2024 01:49
@rainest rainest requested a review from a team as a code owner August 2, 2024 01:49
@rainest rainest added the do not merge let the author merge this, don't merge for them. label Aug 2, 2024
@rainest rainest force-pushed the feat/diff-db branch 2 times, most recently from b85a7eb to 29c5926 Compare August 2, 2024 17:25
@rainest rainest removed the do not merge let the author merge this, don't merge for them. label Aug 2, 2024
@rainest
Copy link
Contributor Author

rainest commented Aug 3, 2024

Fix one bug, find another:

This has somehow broken the envtest for DB mode error event generation. That test should normally function by creating a consumer and tossing it at a mock admin API that always returns an error, which should in turn make GDR bubble up an error event through the result channel, which KIC will then turn into a Kubernetes Event.

Looking at the event handler only shows yet more of the wonderful 👻 events described in Kong/go-database-reconciler#120, though unfortunately I can't get the debugger to stop anywhere in the GDR portion, even at the low level of setting a watchpoint on the channel buffer itself. IDK what the heck's going there, since that's not something the envtest mock stuff should skip.

Tried to pin down the problem to either this or the GDR change:

so whatever causes it appears to be in this PR's new code, but IDK what. The new skip phantom events case isn't on the error handling side and shouldn't affect it, and git bisect isn't giving me good clues ☹️

Copy link

codecov bot commented Dec 5, 2024

Codecov Report

Attention: Patch coverage is 81.06509% with 32 lines in your changes missing coverage. Please review.

Project coverage is 77.7%. Comparing base (f343327) to head (2622177).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
internal/diagnostics/server.go 64.3% 22 Missing and 4 partials ⚠️
internal/diagnostics/diff.go 88.6% 4 Missing and 2 partials ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##            main   #6131    +/-   ##
======================================
  Coverage   77.7%   77.7%            
======================================
  Files        207     208     +1     
  Lines      24681   24833   +152     
======================================
+ Hits       19181   19314   +133     
- Misses      4523    4537    +14     
- Partials     977     982     +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@randmonkey randmonkey assigned randmonkey and unassigned czeslavo Dec 5, 2024
@randmonkey randmonkey force-pushed the feat/diff-db branch 2 times, most recently from 3fb8cf3 to 0a51c00 Compare December 5, 2024 09:20
@pmalek
Copy link
Member

pmalek commented Dec 10, 2024

@randmonkey Made the change to remove the dependency on https://github.com/golang-collections/collections ( which is an 11 year old package) and used containers/list instead.

Also added some UTs.

PTAL b90c664. Feel free to merge this if you feel this looks good now.

rainest and others added 14 commits December 10, 2024 13:26
Rename the diagnostics.ConfigDump struct to ClientDiagnostic to reflect
that it now handles multiple types of diagnostic data (config dumps,
config diffs, and fallback status) from the Kong client.
Add a struct to hold diff history in the diagnostic server. It holds a
fixed number of diffs (5, arbitrarily) and provides functions to update
and view history by config hash.

Add a diagnostics function to generate diagnostic diff structs from GDR
event data.

Add a diagnostics update loop case to handle inbound diffs.

Call the diagnostic diff generator when processing an event and append
the diagnostic diff to the diff list to the DB mode event handler.
Dump sensitive config during integration, mostly to support debugging
the diff endpoint. While there's no diff diagnostic integration test,
having it available allows for easier interactive debugging of the
endpoint, or viewing diff states when running DB-backed tests.
Add timestamps and a list of available diffs to the diff endpoint.

Add unit tests for the diff endpoint.
Remove scaffolding comments and unfinished wishlist items. Add missing
doc comments.

Clean up some garbage the linter caught.
Co-authored-by: Jakub Warczarek <jakub.warczarek@konghq.com>
@randmonkey randmonkey enabled auto-merge (squash) December 10, 2024 13:38
@randmonkey randmonkey disabled auto-merge December 10, 2024 13:38
@randmonkey randmonkey enabled auto-merge (squash) December 10, 2024 13:50
@randmonkey
Copy link
Contributor

Code scanning results / CodeQL has been queued for 40+ mins. Let me close and reopen it to see if it fixed.

@randmonkey randmonkey closed this Dec 10, 2024
auto-merge was automatically disabled December 10, 2024 14:20

Pull request was closed

@randmonkey randmonkey reopened this Dec 10, 2024
@randmonkey randmonkey merged commit 4f6373f into main Dec 10, 2024
79 checks passed
@randmonkey randmonkey deleted the feat/diff-db branch December 10, 2024 23:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Output of entity changes in KIC's logs instead of Println in database-reconciler repo
5 participants