Fix EndTranscodingSession() call and potential race #2735
Merged
Addresses #2732
Despite all efforts, I have not yet been able to reproduce the error locally. However, I added a fix for a related issue, plus a fix that should potentially address this one.
Findings
- The `EndTranscodingSession()` call here was failing because of a `context cancelled` error. It's not clear why this context, which presumably goes all the way down from `HandlePush()`, causes that error on gRPC communication, but it's consistently reproducible. Fixed by changing it to `context.Background()`.
- `EndTranscodingSession()` is called again here on stream end. There might be a race condition between receiving from the `stop` channel and locking the mutex to clean up the session on T. We had reports of noop transcoding jobs (empty segments?). Could it be that this caused `EndTranscodingSession()` to be called twice in very quick succession, if the above `context cancelled` issue doesn't happen on all systems? Addressed by removing the session id from the map right in `EndTranscodingSession()`, with the mutex locked.