anyOf/oneOf with recursive schemas causes out of memory exception #703
Comments
ok here's a JS reproduction:
const schema = {
  "$schema": "http://json-schema.org/draft-06/schema#",
  anyOf: [ { $ref: '#/definitions/generic' } ],
  definitions: {
    generic: {
      type: 'object',
      anyOf: [
        { $ref: '#/definitions/model1' }
      ]
    },
    genericChildren: {
      type: 'array',
      items: { $ref: '#/definitions/generic' },
      additionalItems: false
    },
    model1: {
      type: 'object',
      properties: {
        type: {
          type: 'string',
          enum: [ 'model1']
        },
        model1_a: { type: 'string' },
        model1_b: { type: 'string' },
        children: { $ref: '#/definitions/genericChildren' }
      },
      required: ['type'],
      additionalProperties: false
    }
  }
};
for (let i = 2; i < 20; i++) {
  schema.definitions[`model${i}`] = {
    type: 'object',
    properties: {
      type: {
        type: 'string',
        enum: [`model${i}`]
      },
      children: { $ref: '#/definitions/genericChildren' }
    },
    required: ['type'],
    additionalProperties: false
  };
  schema.definitions.generic.anyOf.push({
    $ref: `#/definitions/model${i}`
  });
}
const json = {
  type: 'model1',
  model1_a: 'foo',
  model1_b: 'bar',
  children: [{
    type: 'model2',
    nonExistent: null
  }]
};
//editor.set(json);
const ajv = new Ajv({
  allErrors: true
});
ajv.addMetaSchema(await getJSON('node_modules/ajv/lib/refs/json-schema-draft-06.json'));
const valid = ajv.validate(schema, json);
If we instead pass the following data:
{
  "type": "model1",
  "model1_a": "foo",
  "model1_b": "bar",
  "children": [{
    "type": "model2",
    "children": [{
      "type": "model2",
      "nonExistent": null
    }]
  }]
}
It is one level deeper, and we now get 14,135 errors. As you can see, in this schema, if a very deep child is invalid we will easily reach Chrome's (or Node's) maximum memory allocation.
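To get a feel for why one extra level inflates the count so much, here is a rough counting model. It is a sketch under my own assumptions, not ajv's actual code: with allErrors on, every branch of a failing anyOf is validated in full; a non-matching branch reports one enum error for `type` plus one additionalProperties error per property it does not declare; every branch re-validates `children`; and each failing anyOf adds one error of its own. The function names and the `ownProps` / `unknownProps` fields are hypothetical.

```javascript
// Back-of-the-envelope model of the error count. NOT ajv's implementation.
// node.ownProps:     properties only the matching model declares (e.g. model1_a)
// node.unknownProps: properties no model declares (e.g. nonExistent)
function countNodeErrors(node, branchCount) {
  const childErrors = (node.children || []).reduce(
    (sum, child) => sum + countNodeErrors(child, branchCount),
    0
  );
  // The branch whose `type` enum matches only fails on unknown props or children.
  const matchingBranch = node.unknownProps + childErrors;
  if (matchingBranch === 0) return 0; // anyOf passes, nothing is reported
  // Every other branch: 1 enum error + additionalProperties errors + children.
  const otherBranch = 1 + node.ownProps + node.unknownProps + childErrors;
  // +1 for the failing anyOf itself.
  return matchingBranch + (branchCount - 1) * otherBranch + 1;
}

function countTotalErrors(root, branchCount) {
  const inner = countNodeErrors(root, branchCount);
  return inner === 0 ? 0 : inner + 1; // +1 for the top-level anyOf
}

// The two-level data from above, described in the model's terms.
const leaf = { ownProps: 0, unknownProps: 1 }; // model2 with nonExistent
const middle = { ownProps: 0, unknownProps: 0, children: [leaf] };
const root = { ownProps: 2, unknownProps: 0, children: [middle] }; // model1_a, model1_b

console.log(countTotalErrors(root, 19)); // 14135
```

With 19 models in the anyOf, this gives 14,135 for the two-level data, the same figure as reported. That agreement is suggestive rather than proof of how ajv counts, but it does show the shape of the problem: each level multiplies the child's error count by roughly the number of branches.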
You specifically instructed ajv to collect all possible errors (allErrors: true). To have a smaller number of more useful errors you need to specifically tell ajv which branch it should and which branch it shouldn't try to validate against. draft-07 introduced conditional if/then/else keywords, so instead of
{
  "anyOf": [
    {"$ref": "#/definitions/model1"},
    {"$ref": "#/definitions/model2"}
  ]
}
you could use something like
{
  "properties": {"type": {"enum": ["model1", "model2"]}},
  "allOf": [
    {
      "if": {"properties": {"type": {"const": "model1"}}},
      "then": {"$ref": "#/definitions/model1"}
    },
    {
      "if": {"properties": {"type": {"const": "model2"}}},
      "then": {"$ref": "#/definitions/model2"}
    }
  ]
}
You obviously can do it without allOf, using only if/then/else, but if you have many possible types the above way seems better to me: it avoids deep nesting of if/then/else.
EDIT: On the other hand, with if/then/else only, without allOf, it can be faster, as the schema above would have to compare the "type" property with every possible type value, while with if/then/else it would stop as soon as the first one matches...
{
  "if": {"properties": {"type": {"const": "model1"}}},
  "then": {"$ref": "#/definitions/model1"},
  "else": {
    "if": {"properties": {"type": {"const": "model2"}}},
    "then": {"$ref": "#/definitions/model2"},
    "else": false
  }
}
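With 19 models, writing either shape by hand gets tedious, so both can be generated from the list of type names. This is a minimal sketch: `buildAllOfDispatch` and `buildIfElseChain` are hypothetical helper names, not part of ajv, and they assume every model lives under `#/definitions/<name>` as in the reproduction.

```javascript
// Hypothetical helpers (not part of ajv) that build the two schema shapes
// discussed above from a list of model names.

// Flat form: one if/then pair per type, combined with allOf.
function buildAllOfDispatch(typeNames) {
  return {
    properties: { type: { enum: typeNames } },
    allOf: typeNames.map((name) => ({
      if: { properties: { type: { const: name } } },
      then: { $ref: `#/definitions/${name}` }
    }))
  };
}

// Nested form: an if/then/else chain whose last `else` is `false`.
function buildIfElseChain(typeNames) {
  return typeNames.reduceRight(
    (elseSchema, name) => ({
      if: { properties: { type: { const: name } } },
      then: { $ref: `#/definitions/${name}` },
      else: elseSchema
    }),
    false
  );
}

const names = ['model1', 'model2'];
console.log(JSON.stringify(buildAllOfDispatch(names), null, 2));
console.log(JSON.stringify(buildIfElseChain(names), null, 2));
```

The result of either helper would replace the `anyOf` array inside `generic`. Both rely on if/then/else, so they need a validator with draft-07 support.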
yup I figured I understand why it does it (because as you said, it can't know which of the I'll give that a go.
If the problem is indeed stack size because of deeply nested recursive data (it should be really deep for it to be a problem though), you can consider using async schemas - it should help with the stack size (negatively affecting performance though). But as I understood, your problem is the large size of the error array. You can also try not using
There is a proposal for select that makes the above even more concise. It depends on the $data proposal, so it's unlikely that
{
  "select": {"$data": "0/type"},
  "selectCases": {
    "model1": { "$ref": "#/definitions/model1" },
    "model2": { "$ref": "#/definitions/model2" }
  },
  "selectDefault": false
}
But bear in mind that it has some limitations (they aren't critical though - see issues there).
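To make the intent of select concrete, here is a toy lookup that mimics the dispatch: read the value the `$data` pointer refers to, pick the matching entry in `selectCases`, otherwise fall back to `selectDefault`. It is only an illustration of the idea as I read it from the schema above, not the proposal's or any library's implementation. `pickSelectedSchema` is a hypothetical name, and the sketch only handles pointers of the form `0/<property>`.

```javascript
// Toy illustration of the select / selectCases / selectDefault dispatch.
// NOT a validator and NOT how any library implements the keyword.
function pickSelectedSchema(schema, data) {
  // "0/type" is a relative JSON pointer: the `type` property of the current data.
  const match = /^0\/(.+)$/.exec(schema.select.$data);
  if (!match) {
    throw new Error('this sketch only handles "0/<property>" pointers');
  }
  const key = String(data[match[1]]);
  const cases = schema.selectCases;
  return Object.prototype.hasOwnProperty.call(cases, key)
    ? cases[key]
    : schema.selectDefault;
}

const selectSchema = {
  select: { $data: '0/type' },
  selectCases: {
    model1: { $ref: '#/definitions/model1' },
    model2: { $ref: '#/definitions/model2' }
  },
  selectDefault: false
};

console.log(pickSelectedSchema(selectSchema, { type: 'model2' }));
console.log(pickSelectedSchema(selectSchema, { type: 'somethingElse' }));
```

The point is that exactly one sub-schema is chosen per object, so a failing child is validated against one model instead of all of them.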
no worries, i'll try with what spec is also, just to confirm i understand this. i do get quite a few errors still but far, far less than with so lets say i have 4 entries in my
yes
yes, if this error means failure. That's the default.
each branch of anyOf will stop validating on the first error in each branch, but all branches must be validated as some branch may pass. All errors that were collected will be reported. "anyOf" always stops validating on the first passing branch. "allOf" stops validating on the first failing branch, but in allErrors mode it will always validate against all branches. "oneOf" stops validating on the second passing branch, otherwise it will always validate against all branches (as only one branch is allowed to be valid) - that's why anyOf is generally better than oneOf.
In all cases without allErrors it will stop as early as possible and with allErrors collect as many errors as possible.
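The stopping rules above can be sketched with a toy model in which each branch is a function returning an array of errors (empty means the branch passes). This is a simplified illustration with made-up function names, not ajv's generated code; it only counts how many branches get evaluated under each rule.

```javascript
// Toy model of the short-circuit rules described above. NOT ajv's code.
// A "branch" is a function that returns an array of error strings (empty = pass).

function anyOf(branches, data) {
  const errors = [];
  let evaluated = 0;
  for (const branch of branches) {
    evaluated++;
    const branchErrors = branch(data);
    if (branchErrors.length === 0) {
      // First passing branch: stop, and drop errors from earlier branches.
      return { valid: true, errors: [], evaluated };
    }
    errors.push(...branchErrors);
  }
  return { valid: false, errors, evaluated };
}

function allOf(branches, data, allErrors) {
  const errors = [];
  let evaluated = 0;
  for (const branch of branches) {
    evaluated++;
    errors.push(...branch(data));
    // Without allErrors, stop at the first failing branch.
    if (errors.length > 0 && !allErrors) break;
  }
  return { valid: errors.length === 0, errors, evaluated };
}

function oneOf(branches, data) {
  const errors = [];
  let evaluated = 0;
  let passing = 0;
  for (const branch of branches) {
    evaluated++;
    const branchErrors = branch(data);
    if (branchErrors.length > 0) {
      errors.push(...branchErrors);
    } else if (++passing === 2) {
      break; // a second passing branch already means failure
    }
  }
  return { valid: passing === 1, errors: passing === 1 ? [] : errors, evaluated };
}

const isModel = (name) => (data) =>
  data.type === name ? [] : [`type should be ${name}`];
const branches = ['model1', 'model2', 'model3', 'model4'].map(isModel);

console.log(anyOf(branches, { type: 'model2' }).evaluated); // 2
console.log(oneOf(branches, { type: 'model2' }).evaluated); // 4
console.log(allOf(branches, { type: 'model2' }, false).evaluated); // 1
console.log(allOf(branches, { type: 'model2' }, true).evaluated); // 4
```

For valid data, anyOf stops after two branches here while oneOf has to visit all four, which is the reason anyOf is the cheaper choice when the branches are mutually exclusive anyway.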
yup sounds good to me. i did switch from I totally didn't realise if/then was in one of the drafts already so i'll give that a go. Seems most of my problems are solved by Thanks for the quick responses, i figured it would be schema related or config rather than a bug but wasn't sure. The whole idea of branch selection is really what we need. To say "based on the type, use this associated schema".
Try select too - I'm sure it'll get into the spec some day :) See above if you missed it. Actually, the more people use it and support it in JSON-Schema-org, the more likely it is to become the standard.
I'm using the latest version of ajv:
Essentially it seems there's a huge inefficiency somewhere when dealing with recursive schemas.
I have a fairly large schema so I'm unable to extract an example of this happening (I'll keep trying to narrow it down). But essentially I have something along the lines of:
If you have data like:
As you can see, model2 will be invalid, as expected, because additionalProperties is false.
This does validate correctly (assuming what I wrote above off the top of my head is syntactically correct).
However, it produces a huge amount of errors internally. In my case, I get 2,900+ errors. This seems directly proportional to how many schemas exist in the anyOf of generic.
In cases where I have a large object, deeply nested, I often get 1 million or more errors from ajv. Most of these errors look like duplicates or at least very similar to each other.
I've stepped through ajv's generated code for some time now and didn't really get anywhere of use. All I found out was:
anyOf entries (which is already a lot, but not the thousands I saw)
anyOf (due to its children not being matched either), so it too generates the same amount of errors, if not more
oneOf causes performance to decrease massively because, unlike anyOf, it won't quit early once it matches
anyOf entries, it seems it will produce errors for all other entries (even if they may not be used in the end)
This causes node to throw an out of memory exception due to the max heap size being reached. In the browser, it will crash the page (along with dev tools). Essentially the vErrors array grows to such a large size that it can use 2GB+ of memory in my case.