
store: Prevent store from panic after range object storage #3322

Closed
MegaEle wants to merge 2 commits

Conversation

@MegaEle commented Oct 15, 2020

Signed-off-by: MegaEle ajmy789456123@gmail.com

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

When the store uses an object store exposed through the S3 interface, there is a chance that the loadChunks or loadSeries method panics during a query. We found that the S3 interface sometimes does not return the complete requested range; the store should return an error instead of panicking.
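To make the failure mode concrete, the panic comes from slicing the fetched buffer at an offset that lies past its end. A minimal, self-contained sketch of turning that panic into an error (illustrative only; safeSlice is a hypothetical helper, not the actual change in this PR):

```go
package main

import "fmt"

// safeSlice is a hypothetical helper (not part of Thanos): it returns the
// sub-slice of b starting at offset, or an error instead of letting the
// slice expression panic when the offset lies past the end of the fetched bytes.
func safeSlice(b []byte, offset uint64) ([]byte, error) {
	if offset > uint64(len(b)) {
		return nil, fmt.Errorf("offset %d is beyond the %d fetched bytes", offset, len(b))
	}
	return b[offset:], nil
}

func main() {
	b := make([]byte, 48762) // pretend the object store returned fewer bytes than requested
	if _, err := safeSlice(b, 2723780); err != nil {
		fmt.Println("returning an error instead of panicking:", err)
	}
}
```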

Verification

I built the modified code and compared the behaviour under the same circumstances: the store no longer panics, and the query returns an error like:
Mint: 1601021768826 Maxt: 1602734400000: rpc error: code = Aborted desc = fetch series for block 01EMMWANSPNASD41ER54W9B2WQ: preload chunks: read range for 0: can't read chunk for required length, required 2723780, actually 48762

@pracucci (Contributor) left a comment

Hi!

> When the store uses an object store exposed through the S3 interface, there is a chance that the loadChunks or loadSeries method panics during a query. We found that the S3 interface sometimes does not return the complete requested range; the store should return an error instead of panicking.

I would be interested in hearing more about the issue. The protection you propose here is OK (unless it turns out to add too much complexity), but I'm not sure I understand the root cause of this issue, which is the more interesting part to me. Have you opened an issue with details about it?

@@ -1908,7 +1909,11 @@ func (r *bucketIndexReader) loadSeries(ctx context.Context, ids []uint64, refetc
 	r.mtx.Unlock()
 
 	for i, id := range ids {
-		c := b[id-start:]
+		seriesOffset := id - start
+		if int(seriesOffset) > indexLength {
Contributor

This just checks the offset, but the following code could overread anyway. Am I missing anything?
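For illustration, a fuller guard could check not only the offset but also that the varint-encoded series length fits within the bytes that were actually fetched. A simplified sketch under those assumptions (readSeriesAt and its layout are illustrative, not the real loadSeries):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readSeriesAt is an illustrative sketch: it validates both the offset and the
// encoded length before slicing, so a later read cannot run past the end of
// the buffer even when the offset check alone passes.
func readSeriesAt(b []byte, offset uint64) ([]byte, error) {
	if offset >= uint64(len(b)) {
		return nil, fmt.Errorf("offset %d is beyond the %d fetched bytes", offset, len(b))
	}
	c := b[offset:]
	l, n := binary.Uvarint(c)
	if n < 1 {
		return nil, fmt.Errorf("reading series length failed")
	}
	end := uint64(n) + l
	if end > uint64(len(c)) {
		return nil, fmt.Errorf("series needs %d bytes but only %d remain", end, len(c))
	}
	return c[uint64(n):end], nil
}

func main() {
	// Offset 0 passes the offset check, but the encoded length claims 1024
	// bytes while nothing follows the length prefix.
	b := []byte{0x80, 0x08}
	if _, err := readSeriesAt(b, 0); err != nil {
		fmt.Println("guarded over-read:", err)
	}
}
```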

@MegaEle (Author)

Hi, we ran into the same problem that is described in the following issue:
#3057

While debugging, I found that the data returned by readIndexRange and readChunkRange is not guaranteed to be at least the required length, but the callers compute offsets as if the full range had been read, so this situation can occur.

We use the S3 interface with Ceph as the underlying storage. I am still looking for the root cause of the incorrect data length, but have made no progress so far. In the meantime, I would like the store not to panic in this case.

I tried modifying the readChunkRange and readIndexRange methods themselves, but other upstream callers do not check the returned data length, so I only changed the places where the panic occurs. :)
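For illustration, the length check described above could also live at the range-read call site. A minimal sketch with a hypothetical wrapper (getRangeChecked is not a Thanos API), assuming the backend may return fewer bytes than requested:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// getRangeChecked is a hypothetical wrapper: it reads up to length bytes from
// an object-range reader and returns an error when the backend delivers a
// short read, instead of letting callers slice past the end later on.
func getRangeChecked(r io.Reader, length int64) ([]byte, error) {
	buf, err := io.ReadAll(io.LimitReader(r, length))
	if err != nil {
		return nil, err
	}
	if int64(len(buf)) < length {
		return nil, fmt.Errorf("short range read: required %d bytes, actually %d", length, len(buf))
	}
	return buf, nil
}

func main() {
	// The backend "returns" only 6 bytes although 10 were requested.
	short := bytes.NewReader([]byte("abcdef"))
	if _, err := getRangeChecked(short, 10); err != nil {
		fmt.Println(err)
	}
}
```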

@GiedriusS (Member) left a comment

@MegaEle do you use radosgw, perchance? I have seen the same thing a few times but never got to the bottom of it. It's probably some kind of bug in that API where it doesn't follow the spec. Have you ever found a way to consistently reproduce it?

Contributor

Not sure if the issue I just encountered is also relevant: #3339

I use radosgw/Ceph, so I can confirm what @GiedriusS says.

stale bot commented Dec 20, 2020

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Dec 20, 2020
@GiedriusS (Member)

I think #3621 is a more reasonable defensive coding measure than this one because that other change adds a checker that is closer to the actual problem.

@stale stale bot removed the stale label Dec 21, 2020
stale bot commented Feb 21, 2021

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Feb 21, 2021
Base automatically changed from master to main February 26, 2021 16:30
@stale stale bot removed the stale label Feb 26, 2021
@GiedriusS (Member)

This has been fixed by #3795, thank you for your contribution!

@GiedriusS GiedriusS closed this Mar 18, 2021