-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Azure List by Prefix Bug #42713
Fix Azure List by Prefix Bug #42713
Conversation
* This check is necessary to make Azure more resilient. Without the type check listings that contain directories as well as blobs will throw `ClassCastException`s
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the change, but I want to better understand what types of problems that this is addressing. Is this addressing the case where a user is manually adding different types of blobs (append or page blobs) to existing directories of the ES repo?
final String name = blobPath.substring(keyPath.length()); | ||
logger.trace(() -> new ParameterizedMessage("blob url [{}], name [{}], size [{}]", uri, name, properties.getLength())); | ||
blobsBuilder.put(name, new PlainBlobMetaData(name, properties.getLength())); | ||
if (blobItem instanceof CloudBlockBlob) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we use the more generic CloudBlob
here? Do you know under which conditions this is returning CloudBlobDirectory
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we use the more generic CloudBlob here?
Can't do it, we need to get the size off of the object and the generic blob doesn't have that on its interface.
Do you know under which conditions this is returning CloudBlobDirectory?
Sure, simple:
Whenever we do a list by prefix and it catches a directory we error out. E.g. if someone accidentally created a directory in a shard dir (I know it's a little far fetched for someone to manually created a dir somewhere but: 1. still we shouldn't completely shut down if they did. 2. Maybe it's not that far fetched given that people could be messing around with the nested dir setting of their repos).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't do it, we need to get the size off of the object and the generic blob doesn't have that on its interface.
The getProperties()
method is on CloudBlob
, not CloudBlockBlob
.
Note that we should also make the listBlobsByPrefix
method consistent (and have tests for it) across all implementations. AFAICS, for HDFS it also returns directories...
Perhaps better to keep this in #42653 and add the tests there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry my bad, I mixed up the kinds of blobs :D Sure let's keep it in #42653 then (that one already has tests for this scenario :) that's how I found this :P). I'll weaken the cast there then + make the tests a little more explicit for this case -> closing here :)
Not really, those cases are either ignored currently or cleaned up in the shard data dirs. This handles the case where users would manually add directories named/positioned in a way that would have them get picked up by one of our listing calls. |
final String name = blobPath.substring(keyPath.length()); | ||
logger.trace(() -> new ParameterizedMessage("blob url [{}], name [{}], size [{}]", uri, name, properties.getLength())); | ||
blobsBuilder.put(name, new PlainBlobMetaData(name, properties.getLength())); | ||
if (blobItem instanceof CloudBlockBlob) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't do it, we need to get the size off of the object and the generic blob doesn't have that on its interface.
The getProperties()
method is on CloudBlob
, not CloudBlockBlob
.
Note that we should also make the listBlobsByPrefix
method consistent (and have tests for it) across all implementations. AFAICS, for HDFS it also returns directories...
Perhaps better to keep this in #42653 and add the tests there.
ClassCastException
s because they will show up in the listing asCloudBlobDirectory