Adds validations for Saved Object types when calling `create` or `bulkCreate` #118969
Conversation
Force-pushed from 1848c92 to 8833c88
self-review
```ts
 * '2.1.0': (data) => {
 *   if (typeof data.bar !== 'string') {
 *     throw new Error(`[bar]: expected value of type [string] but got [${typeof data.bar}]`);
 *   }
 *   if (typeof data.foo !== 'string' && typeof data.foo !== 'boolean') {
 *     throw new Error(`[foo]: expected value of type [string,boolean] but got [${typeof data.foo}]`);
 *   }
 * }
```
I thought it would be a good idea to allow for custom validation functions, similar to what we do for http routes, because it feels like an inevitability that someone will eventually ask to do something that can't be handled by `@kbn/config-schema`.
However, implementing the custom functions just raised more questions that I'd like to get others' thoughts on:

1. IMO a validation function isn't the right place for folks to actually change the contents of the objects, so for now I structured the validation functions differently from how they are set up for http routes: I didn't include a factory for returning `ok(value)` or `badRequest()`. It's just a function that exists to throw errors, and otherwise doesn't return. It doesn't let you modify or do anything with the data. Is it worth trying to follow the same factory pattern for consistency and keeping safeguards to prevent mutation, or do we prefer this simpler approach?
2. Do we think users need access to anything other than the attributes? Currently the SO repository just passes attributes in. But there may be situations for custom functions where it is useful to have more info about the object.
3. A more future-proof alternative might be providing `({ attributes })` to the validation functions, so that we can add more later if it proves to be necessary. WDYT?
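To make the future-proof option concrete, here is a hypothetical sketch of a throw-only validation function that receives a `({ attributes })` wrapper rather than the bare attributes; `ValidationContext` and `validateFooType` are illustrative names, not Kibana APIs.

```typescript
// Hypothetical sketch: wrapping the attributes in a context object means more
// fields (e.g. references) can be added later without breaking consumers.
interface ValidationContext {
  attributes: Record<string, unknown>;
}

// A throw-only validator: it never returns a value and never mutates input.
type ValidationFunction = (ctx: ValidationContext) => void;

const validateFooType: ValidationFunction = ({ attributes }) => {
  if (typeof attributes.bar !== 'string') {
    throw new Error(
      `[bar]: expected value of type [string] but got [${typeof attributes.bar}]`
    );
  }
};

// Valid input passes silently:
validateFooType({ attributes: { bar: 'hello' } });
```

Extending the signature later only means adding keys to `ValidationContext`, so existing validators keep working unchanged.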
I don't have an opinion on (1) yet.

(2) I think we should rather pass in the entire saved object; would there be any downside? Might not be a strong use case, but theoretically you could want to do validation on the references (e.g. a dashboard can't have references to a `config` saved object type) or on the `updated_at` field (e.g. can't set a date in the future).
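The two examples above could be sketched as a single hypothetical check over the whole object; the interface and function below are simplified stand-ins, not Kibana's actual saved object shapes.

```typescript
// Simplified stand-in for a saved object; the real shape has more fields.
interface SavedObjectLike {
  type: string;
  updated_at?: string;
  references: Array<{ type: string; id: string; name: string }>;
}

// Hypothetical "root property" checks mirroring the reviewer's examples.
function validateRootProperties(so: SavedObjectLike): void {
  if (so.type === 'dashboard' && so.references.some((ref) => ref.type === 'config')) {
    throw new Error('dashboards may not reference config saved objects');
  }
  if (so.updated_at && new Date(so.updated_at).getTime() > Date.now()) {
    throw new Error('[updated_at]: cannot be set in the future');
  }
}

// A dashboard with no offending references and no future timestamp passes:
validateRootProperties({ type: 'dashboard', references: [] });
```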
> Is it worth trying to follow the same factory pattern for consistency and keeping safeguards to prevent mutation, or do we prefer this simpler approach?

IMHO let's KISS for now, and just document that custom validation functions should not be used to mutate the data. If at some point we discover that API consumers are doing so, or planning to, we can then add real safeguards.

> (2) I think we should rather pass in the entire saved object, would there be any downside?

I agree that it would probably be helpful to allow owners to validate more than the attributes, BUT this means that custom validation functions would be more powerful than schema validation, which is an idea I hate to be honest, as ideally config-schema validation should always be favored unless there is a very specific reason to use custom validation instead. This is what we did for route validation, for instance.

So if we think that it makes sense to validate more than the attributes (which I think I agree with), we should find a way to do so also when using the by-schema validation? WDYT?
Yeah, I agree. I think the kind of validation I describe is related to the business domain and should ideally live in a layer one higher than saved objects, e.g. a service exposed to other plugins, or in the http controller. In such a layer you could do async validation, like fetching other saved objects before accepting a create.

So an alternative could be to make sure we have schema validation in core for "root properties" like `references` and `updated_at`, and then keep only validation on attributes. And then if someone needs more powerful validation, the answer is "build your own http API", like we already recommend anyway.

I was thinking that when all saved objects are `hidden: true`, saved object validation might become obsolete, but we would always need import/export validation, and this would skip any business logic in other layers. I still feel a bit uneasy about import: you wouldn't always apply the same validation as on a create, since the import includes other related objects, and it becomes quite messy to enforce the same business logic on an HTTP API as on a batch of imported objects. One potential solution could be to "sign" exports with e.g. a hash of their contents, so that we know they weren't tampered with and can safely import them, but that also removes the usefulness of this API to some extent.
> So an alternative could be to make sure we have schema validation in core for "root properties" like references and updated_at and then keep only validation on attributes. And then if someone needs more powerful validation the answer is "build your own http API" like we anyway already recommend.

That would be aligned with our recommendations, so I'd say it makes sense.

> I was thinking when all saved objects are hidden: true saved object validation might become obsolete, but we would always need import/export validation and this would skip any business logic in other layers

When all types have their own validation layer directly implemented in their services / endpoints, I think import/export would be the only remaining place owned by core where we'll want to perform validation, yes. Once this is the case, I imagine we could still allow owners to register a specific validation function / hook for this with their types, to be used specifically when validating during import or export?

> One potential solution could be to "sign" exports with e.g. a hash of its contents so that we know it wasn't tampered with and can safely import it, but it also removes the usefulness of this API to some extent.

I think quite a few consumers are currently tampering with or manually creating export files, so I'm not sure we could introduce such a thing without it being a breaking change. Plus, given that I think we'll want validation for import/export regardless, I'm not sure this would bring significant added value, at least for validation.
> IMHO let's KISS for now, and just document that custom validation functions should not be used to mutate the data.

Sounds good. I hate the idea of someone abusing this as a hook to rewrite attributes, but I also think a wait-and-see approach is reasonable here.

> So an alternative could be to make sure we have schema validation in core for "root properties" like references and updated_at and then keep only validation on attributes

This sounds like a good middle ground, but to keep us future-proof if we change our minds, I'll still update the custom validation functions to take an object with a single `attributes` key.
I went ahead and added a full schema validation for the "root properties" that gets applied as well, so we are now validating the whole object, not just attributes.
```ts
private validateFunction(data: A, validateFn: SavedObjectsValidationFunction) {
  try {
    validateFn({ ...data }); // shallow clone to prevent mutation
```
As mentioned in my other comment, maybe this should be future-proofed?

```diff
- validateFn({ ...data }); // shallow clone to prevent mutation
+ validateFn({ attributes: { ...data } }); // shallow clone to prevent mutation
```
+1 to future-proofing from the get go
A shallow clone does not really prevent mutation, e.g.:

```js
const data = {
  foo: {
    bar: 12,
  },
};

const shallow = { ...data };
shallow.foo.bar = 7;
data.foo.bar; // <== 7
```

(and having attributes with nested properties is quite common)

So as I said in a previous comment, I think we should either KISS and do nothing for now, or do it properly via a proper deep copy or another mutation-prevention pattern.
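The leak described above, and a deep copy that avoids it, can be demonstrated directly; `structuredClone` is available globally in Node.js 17.0+ and modern browsers.

```typescript
// A shallow clone copies only the top level, so nested objects stay shared
// and mutations through the clone leak back into the original.
const data = { foo: { bar: 12 } };
const shallow = { ...data };
shallow.foo.bar = 7;
// data.foo.bar is now 7: the nested object was shared.

// A deep copy duplicates the whole object graph, so the original is safe.
const original = { foo: { bar: 12 } };
const deep = structuredClone(original);
deep.foo.bar = 7;
// original.foo.bar is still 12.
```

This is the trade-off the thread is weighing: `structuredClone` gives real isolation, but at the cost of copying the full attribute tree on every validation call.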
> shallow clone does not really prevent mutation

Yeah, I know. I was trying to hit a middle ground between the perf hit of a deep clone and still providing some protection against mutation. But point taken about nested properties being common.

For now I'll go with KISS as discussed in the thread above. If we spot anyone abusing this, we can come back and deep clone.
> Yeah I know, I was trying to hit a middle ground between the perf hit of a deep clone while still providing some protection against mutation.

The problem with this approach is that it's implicit. `validateFn({ attributes: Object.freeze(data) });` will throw an error on mutation attempts, so Kibana devs can see that mutations are explicitly forbidden.
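Note that `Object.freeze` is shallow, so nested attributes would still be mutable; a recursive freeze makes the failure explicit all the way down. `deepFreeze` below is an illustrative helper, not a Kibana utility, and the throw-on-write behavior requires strict mode (the default in ES modules).

```typescript
// Recursively freeze an object so any mutation attempt, at any depth,
// throws a TypeError in strict mode instead of silently leaking.
function deepFreeze<T extends object>(obj: T): T {
  for (const value of Object.values(obj)) {
    if (value !== null && typeof value === 'object' && !Object.isFrozen(value)) {
      deepFreeze(value);
    }
  }
  return Object.freeze(obj);
}

const attributes = deepFreeze({ foo: { bar: 12 } });
try {
  attributes.foo.bar = 7; // throws TypeError in strict mode
} catch {
  // mutation was rejected; attributes.foo.bar is still 12
}
```

Compared with cloning, freezing avoids copying the attribute tree, at the cost of walking it once to freeze each level.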
Pinging @elastic/kibana-core (Team:Core)
Overall, it's looking good as an initial implementation. I still wish we could figure out how to add validation against `update`s, though.
```ts
/** Validate a migrated doc against the registered saved object type's schema. */
private validateObjectAttributes<T>(type: string, doc: SavedObjectSanitizedDoc<T>) {
  const savedObjectType = this._registry.getType(type);
  if (!savedObjectType?.schemas) {
```
Shouldn't we log a warning if there isn't a validation schema declared? That way, we'll have some way of monitoring migration failures (or any other failures from malformed SO creation actions) over time and address the critical ones that cause issues.
If we do that, we need to enable it in dev mode only, as it's a developer warning and not a production one.
It could be very verbose though, and I'm not sure it would replace the usual meta issue pinging all the type owners anyway?
> It could be very verbose though, and I'm not sure it would replace the usual meta issue pinging all the type owners anyway?

Verbosity is what we want in debug mode, but you're right, we don't know if it will help in the long run or be more "noise" than we can reasonably sift through.
How are we going to track the SO without validation? How soon do we want to get other teams to cover SO with validation?
Considering that with the latest updates to this PR we are now performing validation on the whole SO regardless of whether any validations have been registered for the attributes, IMO it is okay to do the usual pinging of type owners so that they can decide how to prioritize adding their own validations for the actual attributes. I think for some teams this will likely be a much larger concern than for others, so I don't think we necessarily need to put a timetable on it yet...
```ts
const validator = new SavedObjectsTypeValidator({
  type,
  validationMap: savedObjectType.schemas,
});
```
NIT/optional: do we really need to instantiate this on every call to `validateObjectAttributes`? Can't we instantiate a validator per type when the repository is created?
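The per-type caching suggested in this NIT could look roughly like the sketch below; `TypeValidator` and `getValidator` are simplified stand-ins for `SavedObjectsTypeValidator` and the repository's internals.

```typescript
// Simplified stand-in for SavedObjectsTypeValidator.
class TypeValidator {
  constructor(public readonly type: string) {}
  validate(_doc: unknown): void {
    // schema checks would go here
  }
}

// One validator per type, created lazily and reused across calls,
// instead of constructing a new instance on every validation.
const validatorCache = new Map<string, TypeValidator>();

function getValidator(type: string): TypeValidator {
  let validator = validatorCache.get(type);
  if (!validator) {
    validator = new TypeValidator(type);
    validatorCache.set(type, validator);
  }
  return validator;
}

// Repeated lookups for the same type return the same instance:
const a = getValidator('dashboard');
const b = getValidator('dashboard');
```

Building the map eagerly at repository construction would work too; lazy memoization just avoids paying for types that are never validated.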
Okay, the last week has been crazy, but I was finally able to pick this back up.
@elasticmachine merge upstream
Implementation looking good. A bunch of comments and NITs, but overall LGTM
src/core/server/saved_objects/validation/integration_tests/validator.test.ts (outdated; resolved)
```ts
const validationSchema = createSavedObjectSanitizedDocSchema(validationRule);
validationSchema.validate(data);
```
NIT/optional: are we going to use a `SavedObjectsTypeValidator` to validate multiple objects? If so, we could eventually keep the instantiated per-version schemas to avoid re-instantiating them for each validation? Not sure if the perf increase is very significant, though.
Yeah I think at least caching the validator per-type would be useful. Will do a small follow-up with this
💚 Build Succeeded
cc @lukeelmers
Closes #104088
Summary
This adds a `schemas` option to `SavedObjectType` that will allow consumers to provide a validation schema for the attributes of their saved object type. This schema is provided per-version, similar to migration functions.

This is not a full implementation of SO type validation; there are a few caveats:

- Validation is only applied to `create` and `bulkCreate` ATM, because for `update` operations users are not guaranteed to provide all required fields, which makes strict validation much more difficult. More discussion on this in the original issue.
- On `create` and `bulkCreate`, the SO repository does an on-the-fly migration on the object as a convenience. Since objects could be coming from any version, we decided to only run validations after the migrations are complete in this initial implementation.
- `kibanaVersion`.
.A more complete solution would be to run validations from within the migration algorithm itself so that each document could be validated before it is passed to the migration function for the current version... But this is something that would be much longer term as it would change the current behavior of migrations which are permissive and can selectively ignore invalid data: #104088 (comment)
Plugin API Changes
Saved Objects now support optional validation schemas using `@kbn/config-schema`.

When a validation schema is provided during type registration, it will be used to validate subsequent calls to the `SavedObjectClient`'s `create` and `bulkCreate` APIs. Validation schemas are not currently enforced on `update` operations.

We recommend adding a new validation schema any time attributes are added, removed, or changed in your Saved Object type, as they help protect data integrity by preventing users from inadvertently importing malformed objects.
Usage example:
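The example itself did not survive extraction; below is a hypothetical sketch of what registering a type with per-version `schemas` might look like, based on the description above. The type name, mappings, and version key are illustrative assumptions, and `core` refers to the `CoreSetup` contract available in a plugin's `setup` method.

```typescript
import { schema } from '@kbn/config-schema';

// Illustrative registration sketch (not taken from the PR): the keys of the
// `schemas` map are Kibana versions, and each value validates the type's
// attributes for that version, mirroring how migrations are registered.
core.savedObjects.registerType({
  name: 'my-type', // hypothetical type name
  hidden: false,
  namespaceType: 'single',
  mappings: {
    properties: {
      foo: { type: 'keyword' },
    },
  },
  schemas: {
    '8.1.0': schema.object({
      foo: schema.string(),
    }),
  },
});
```

With this registration in place, `create` and `bulkCreate` calls for `my-type` would fail with a validation error when `foo` is missing or not a string.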