httptransport: check for err before deferring resp.Body.Close() #1173

Merged: 1 commit, Feb 5, 2021

Conversation

@jan-zmeskal (Contributor) commented on Feb 5, 2021

There is quite a big story behind this PR. First, let me offer a TL;DR: if err isn't checked before referring to resp.Body, we might hit a nil pointer dereference. See this SO post.
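
For illustration, here is a minimal, self-contained sketch (not a quote of the Clair code) of the pattern that triggers the panic. The function and URL are made up; only the ordering of the defer and the error check matters:

```go
package main

import (
	"fmt"
	"net/http"
)

// fetchBroken mirrors the problematic ordering: resp.Body.Close() is
// deferred before the error from Do is checked. When Do fails (for
// example, connection refused), resp is nil and the deferred call
// panics with a nil pointer dereference.
func fetchBroken(url string) error {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	defer resp.Body.Close() // BUG: resp may be nil when err != nil
	if err != nil {
		return err
	}
	fmt.Println("status:", resp.Status)
	return nil
}

func main() {
	// Pointing at a closed port makes Do return an error, which then
	// makes the deferred Close panic as the function returns.
	if err := fetchBroken("http://127.0.0.1:1"); err != nil {
		fmt.Println("error:", err)
	}
}
```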

Here's the whole story. It all started with the Red Hat Product Security team substantially extending the data available in OVAL v2 streams. When this change hit production, we started seeing issues with the notifier deployed in our OpenShift cluster.

The first symptom I found was that the notifier simply could not process an update operation created by a certain stream. In the notifier pod, it looked like this:
[screenshot: notifier pod logs, 2021-02-05 14:32]
Basically, all four processors would try to acquire a lock on the same update operation and none of them could actually acquire it. This caused the whole notifier to be stuck forever on that one update operation.

I connected to our Clair DB and found an advisory lock sitting there.
[screenshot: advisory lock row in the Clair DB]
I couldn't find out how long it had been there, but I started monitoring it and could see that it was still there after 30 minutes. No operation should take that long, so the lock had to be stale.

As far as I understand, these advisory locks are released when either the transaction or the session holding them closes. So I came to the conclusion that some process must have died without gracefully tearing down whatever DB operation it was doing.
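
For anyone hitting something similar, a query against pg_locks makes these stale locks visible. The sketch below is only an illustration; the connection string, driver choice, and column selection are assumptions on my side, not part of Clair:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // assumed Postgres driver; any database/sql driver works
)

func main() {
	// Placeholder connection string for the Clair database.
	db, err := sql.Open("postgres", "postgres://clair:clair@localhost/clair?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Advisory locks show up in pg_locks. A granted row that never
	// disappears while no notifier should be holding it points at a
	// stale, session-level lock left behind by a crashed process.
	rows, err := db.Query(`SELECT pid, objid::bigint, granted
	                       FROM pg_locks WHERE locktype = 'advisory'`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
	for rows.Next() {
		var pid, key int64
		var granted bool
		if err := rows.Scan(&pid, &key, &granted); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("pid=%d key=%d granted=%t\n", pid, key, granted)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```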

And indeed, I found out that our notifier had crashed a couple of hours earlier and OpenShift had just spun up new pods:
[screenshot: notifier pod crash and restart events in OpenShift]

So I guess you're getting the picture now:

  1. A notifier pod is spun up but crashes while processing an update operation.
  2. As a result of the crash, a new notifier pod is created.
  3. None of the notifier processors in the new pod can acquire the lock on that update operation.
  4. The notifier is stuck forever trying to acquire the lock.

So the crash is the cause of it all. To the best of my knowledge, it happens here, when we try to refer to resp.Body. However, resp.Body is guaranteed to be non-nil only when err is nil; see here.

Hence my change. I suggest we first check the error returned by Do. If it's non-nil, we return, as we would anyway. If it's nil, we can safely defer closing the body, since it's guaranteed to be non-nil.
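
The corrected ordering looks roughly like this (again just a sketch with made-up names, not the actual patch):

```go
package main

import (
	"fmt"
	"net/http"
)

// fetchFixed uses the ordering this PR suggests: check the error from
// Do first, and only defer Body.Close() once resp is known to be
// non-nil.
func fetchFixed(url string) error {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err // nothing to close when Do fails
	}
	defer resp.Body.Close() // safe: err is nil, so resp.Body is non-nil
	fmt.Println("status:", resp.Status)
	return nil
}

func main() {
	// The same unreachable address now just returns an error instead of
	// panicking.
	if err := fetchFixed("http://127.0.0.1:1"); err != nil {
		fmt.Println("error:", err)
	}
}
```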

Addendum: The crash seems to have appeared after we sent an HTTP request with a 273MB JSON body here. I still don't know whether that's a problem with Clair per se or something related to our deployment. Be that as it may, this PR won't solve that issue. However, it should make sure the notifier doesn't get stuck in an infinite loop.

Signed-off-by: Jan Zmeskal <jzmeskal@redhat.com>
@jan-zmeskal marked this pull request as ready for review February 5, 2021 14:00
@hdonnay (Member) left a comment


LGTM

I think both Louis and I have misunderstood the Body-and-err interaction a few different times.
