[Saved Objects] `create`/`bulkCreate`: add validation for custom ids #105039
Comments
Pinging @elastic/kibana-core (Team:Core) |
cc @jportner Do you know what characters should be allowed for custom ids? I just know that it wouldn't be as simple as restricting to alphanum, as if I remember correctly, some special chars such as Also
Whoops! Makes sense 👍
We don't currently implement any restrictions, and if we did it would potentially be a breaking change. I think the only limiting factor is what the I did some rudimentary testing with dev tools, and the only characters that seem to cause problems when creating a single new document are
So, if 512 bytes is our absolute upper limit, and we also potentially have a space ID in the raw ES document (for legacy single-namespace types), perhaps we should limit space IDs and SO IDs each to 224 bytes in UTF-8 encoding (448 total). That gives us plenty of buffer for the other parts of the raw document ID, like the colon delimiters and the SO type.
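A minimal sketch of what the byte-based limit proposed above could look like. The 224-byte figure comes from the comment; the names `MAX_ID_BYTES`, `utf8ByteLength` and `isWithinIdByteLimit` are hypothetical and do not exist in Kibana. The point is that the limit applies to the UTF-8 encoded size, not to the string length.

```typescript
// Hypothetical constant based on the proposal above; not an existing Kibana setting.
const MAX_ID_BYTES = 224;

// Size of a string once encoded as UTF-8 (TextEncoder always encodes to UTF-8).
function utf8ByteLength(value: string): number {
  return new TextEncoder().encode(value).length;
}

// True when the id fits in the proposed per-id budget.
function isWithinIdByteLimit(id: string, maxBytes: number = MAX_ID_BYTES): boolean {
  return utf8ByteLength(id) <= maxBytes;
}
```

With this approach an id made of 75 copies of 鸡 is rejected even though it is only 75 characters long, because each of those characters takes 3 bytes in UTF-8.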
No, I think that's an arbitrary limitation. Users with other native languages may want to use different characters like 鸡 😄 Edit: asked the ES team and confirmed that the only limitation on the ES side is that |
After a sync discussion with @jportner (and even if I just hate the idea of allowing the whole UTF-8 range in our ids), we came to the conclusion that ensuring a pattern was going to be an issue for BTW, as it would cause importing docs with 'now invalid' ids to no longer be possible. The scope of the issue will then only be to forbid empty ids (either |
@pgayvallet Should we extend the scope with |
For empty ids, any read attempt currently leads to an error when converting the raw doc. I would be very much in favor of defining an actual allowed character set for ids (allowing the full UTF-8 range in what is supposed to be a technical ID is an aberration in my opinion), and if we do, starting by logging warnings on read would probably be the way to go. However, AFAIK, these ids are not only technical: e.g. for spaces, this is the actual user-inputted 'name' and path of the space, as @jportner pointed out... |
We can encode |
The storage is not really the issue; ES supports that properly. It's more that, imho, a technical id is plain ASCII, and that we discovered that what we call 'id' is effectively used as a name or functional id, which seems like quite a poor design |
Scope of the issue reduced to only ban empty ids (empty strings). Having stricter rules on the id format would be considered a breaking change |
We're not currently performing any kind of validation when the consumer provides custom ids when invoking the `create` or `bulkCreate` APIs. We're just generating a random id if the value was not provided (`undefined` for `create` and `undefined` or `null` for `bulkCreate`).
kibana/src/core/server/saved_objects/service/lib/repository.ts
Lines 279 to 280 in 1965315
kibana/src/core/server/saved_objects/service/lib/repository.ts
Lines 389 to 391 in 1965315
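A rough sketch of the fallback described above, combined with the empty-id check this issue asks for. This is not the code from `repository.ts`: the `resolveId` helper, the error message, and the use of Node's `randomUUID` as the generator are all assumptions made for illustration.

```typescript
import { randomUUID } from "crypto";

// Hypothetical helper, not part of the Kibana codebase.
// It treats `null` like `undefined`, which the issue says only bulkCreate does.
function resolveId(id: string | null | undefined): string {
  if (id === undefined || id === null) {
    // No custom id provided: fall back to a random one.
    return randomUUID();
  }
  if (id.length === 0) {
    // An empty string is a provided but unusable id, so reject it
    // instead of silently persisting it.
    throw new Error("id cannot be an empty string");
  }
  return id;
}
```

The design choice here is to fail fast at write time, so an object with an empty id never reaches the index.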
The SO serializer does accept empty ids when converting the SO to its raw format, but will fail to do the opposite, during the prefix check, as the id is empty.
kibana/src/core/server/saved_objects/serialization/serializer.ts
Lines 239 to 241 in b2d36b8
This leads to data corruption, as creating an object with an empty id will fail to deserialize the doc, causing errors at runtime when accessing it, and during the SO migration.
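To illustrate the asymmetry, here is a simplified model of the round trip. It assumes a raw id of the form `type:id` (optionally prefixed with a namespace), as the colon delimiters mentioned in the discussion suggest; `toRawId` and `fromRawId` are hypothetical names and the real `serializer.ts` logic differs.

```typescript
// Simplified, hypothetical model of the raw-id round trip; not the actual serializer.
function rawIdPrefix(type: string, namespace?: string): string {
  return namespace ? `${namespace}:${type}:` : `${type}:`;
}

// Writing accepts any id, including an empty one.
function toRawId(type: string, id: string, namespace?: string): string {
  return `${rawIdPrefix(type, namespace)}${id}`;
}

// Reading runs a prefix check and expects something to follow the prefix.
function fromRawId(type: string, rawId: string, namespace?: string): string {
  const prefix = rawIdPrefix(type, namespace);
  if (!rawId.startsWith(prefix) || rawId.length === prefix.length) {
    throw new Error(`Raw id '${rawId}' does not contain an id after prefix '${prefix}'`);
  }
  return rawId.slice(prefix.length);
}
```

In this model `toRawId("dashboard", "")` happily produces `dashboard:`, while `fromRawId("dashboard", "dashboard:")` throws, which mirrors the write-succeeds, read-fails behavior described above.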
So the primary goal is to forbid usage of empty ids. But while we're at it, I think we should add more complete validation against custom ids when provided, to enforce an allowed pattern.
I tried to look at our documentation to see if we already have a pattern for custom ids, but AFAIK we don't. Part of resolving the issue would therefore be to define the pattern we do want to support.