-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
move ProtoReader from lib/events to api/sessionrecording as Reader #50501
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some additional suggestions for cleanup/improvements.
We should be cautious here, and I'm fine if you want to decline to apply these, or even push them to a future PR so that this PR is more of a direct migration.
sizeBytes [Int64Size]byte | ||
messageBytes [MaxProtoMessageSizeBytes]byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two fields are only ever used in Read
, so they could be extracted to local variables instead of fields on Reader
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good find. I've moved these to be local to Read
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't claim to be an expert on these things, but moving a 64kb buffer into a function-local variable seems iffy to me. It can be easy to accidentally introduce something that causes go to decide to do a heap allocation for the buffer. And even if we're not doing a heap allocation for the buffer, it still means go needs to ensure and zero 64kb of stack space each time Read()
is called (potentially tens of thousands of times per second).
I think we should stick to reusing a persistent buffer for the lifetime of the reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good to me and definitely not wanting to impact performance with this changed. I moved sizeBytes
as well and messageBytes
back to fields on the Reader
struct. They are still unexported, so easy to move in the future if something changes.
Normally, I would lean towards doing a direct migration. But in this case where we're making this importable then I appreciate the suggestions/improvements to make this easier for us to support consumers (plus easier to change the API now before releasing...). Thank you again for the really thorough reviews! |
|
UPDATE: Removed UPDATE2: Reviewing failing integration tests. UPDATE3: Integration test passed on re-run. I don't think I introduced any flake here, but I'm going to re-run it a few more times to be more confident. UPDATE4: Integration tests continue to pass without any more failures. Looks like first failure might have been due to node not coming up. |
|
||
outEvents = append(outEvents, out...) | ||
// combine all uploaded parts to create the session recording content | ||
var sessionRecordingContent bytes.Buffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For increased confidence, I created a new pull request right off of master to demonstrate this test change + removing Reset
still passes: #50875
Hey @fspmarshall, Zac mentioned you would be a good person to take a look at these changes while he's on vacation. I'm happy to answer any questions. Ultimately, I'm wanting to use the session recording reader in another repository. Thank you in advance! |
slog.Int64("skipped-bytes", p.SkippedBytes), | ||
slog.Int64("skipped-events", p.SkippedEvents), | ||
slog.Int64("out-of-order-events", p.OutOfOrderEvents), | ||
slog.Int64("total-events", p.TotalEvents), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised that the linter isn't catching this(might need to file an issue upstream), but keys should be snake_case
slog.Int64("skipped-bytes", p.SkippedBytes), | |
slog.Int64("skipped-events", p.SkippedEvents), | |
slog.Int64("out-of-order-events", p.OutOfOrderEvents), | |
slog.Int64("total-events", p.TotalEvents), | |
slog.Int64("skipped_bytes", p.SkippedBytes), | |
slog.Int64("skipped_events", p.SkippedEvents), | |
slog.Int64("out_of_order_events", p.OutOfOrderEvents), | |
slog.Int64("total_events", p.TotalEvents), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. That is interesting about the linter. I'll look around to see if missing some configuration/sloglint version. Otherwise, I'll file an issue upstream.
Updated to use underscores instead of hyphens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks similar to this issue: go-simpler/sloglint#37 (go-simpler/sloglint#35)
if r.gzipReader != nil { | ||
return r.gzipReader.Close() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is is possible that gzipReader
is assigned after the check on line 129? For instance what if Read is being executed in another goroutine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right that is possible. I'm going to move this conversation to the other pull request that is making implementation changes: #50883
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we separate out this PR into two separate PRs? One that renames/moves but changes no meaningful code, and another for the changes?
This logic is extremely performance sensitive and extremely difficult to debug when an issue comes up. We've put a pretty significant amount of dev hours into tuning and debugging of logic related to event decoding & streaming. We really don't want to accidentally miss a potential issue because it was hidden by the diff of the move, and if a problem does sneak in it will be important to be able to isolate it in the commit log.
Yep, I'll split this out. EDIT: going to add a commit at the start of this branch that is a pure copy. And then the follow up commits will show removals. |
0908a30
to
e9972f4
Compare
This commit is broken, does not pass tests, and does not even compile. This commit only exists to make diffs clearer as to what was changed. The following commands all show no diff: diff -y api/sessionrecording/sessionlog.go lib/events/sessionlog.go diff -y api/sessionrecording/stream.go lib/events/stream.go diff -y api/sessionrecording/stream_test.go lib/events/stream_test.go diff -y api/sessionrecording/testdata/corrupted-session \ lib/events/testdata/corrupted-session
This commit removes unneeded code from api/sessionrecording and updates the package name. It removes the same code from lib/events and updates usages of this new sessionrecording package.
This file only contains gzipReader, so rename for clarity.
The name no longer leaks the implementation. Went with Reader over SessionRecordingReader to prevent a stutter of writing `sessionrecording.SessionRecordingReader`.
TestReadCorruptedReading was moved to api/sessionrecording.
This is simply a constant we use to break an infinite loop.
These constants are not used by api/sessionrecording itself, so do not export.
e9972f4
to
94cf754
Compare
I've split this into two parts now. This pull request is focused only on moving Reader and renaming. To make it easier to review, the first commit is a simply/copy paste of relevant files without any changes (it does not compile, etc.) The next commit removes unneeded functions such as the writer, etc. And follow up commits have other changes related to renaming. #50883 has changes to implementation details and removal of APIs. Please let me know if you have any suggestions to make this is easier to review. I know there's a lot here, so happy to spend effort making this easier on reviewers. I'm hoping the addition of the first commit with simply copy/paste makes it easier to see what was removed during the initial move. |
Thanks for all your work on this @dustinspecker! If you plan on backporting this and #50883 I'd suggest combining both commits into a single PR on the release branches so they land at the same time. |
This pull request does the following:
ProtoReader
fromlib/events
toapi/sessionrecording
so that it may be imported by users to parse session recordings in Protobuf binary formatProtoReader
toReader
to avoid leaking implementation detail in nameThis pull request is focused on moving and renaming, while #50883 makes changes to implementation and the API.
This pull request should not be merged until #50883 is also ready to be merged, since we do not want to support this API as-is.
changelog: add api/sessionrecording.Reader to support parsing and reading session recordings