Question: Are the unused parts of a json schema automatically removed ? #243
Comments
The entire schema is parsed, because beforehand there is no way of knowing which parts of the schema will and will not be used: a `$ref` can point to any part of it, so if you tried to cull an "unused" definition you'd be out of luck as soon as something still references it.

A different issue that will lead to increased memory usage is the fact that you use a GoLoader. First the Go struct is converted to JSON, and then that JSON is parsed back into a generic document, so for a while both representations live in memory.

There are probably a couple of areas that could be more memory-efficient with general Go optimizations, which anyone with Go knowledge should be able to improve. A smart optimizer that dynamically removes parts of the schema, like you suggested, is nearly impossible because of the way JSON Schema works.
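The GoLoader round trip described above can be sketched with just the standard library. `goLoaderRoundTrip` is a hypothetical helper (not part of gojsonschema) that shows why the Go value and the parsed generic document briefly coexist in memory:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// goLoaderRoundTrip mimics, in simplified form, what a GoLoader does:
// the Go value is first serialized to JSON bytes, then those bytes are
// parsed back into a generic document. Both representations exist in
// memory at the same time, which is the extra cost described above.
func goLoaderRoundTrip(v interface{}) (map[string]interface{}, error) {
	raw, err := json.Marshal(v) // step 1: Go value -> JSON bytes
	if err != nil {
		return nil, err
	}
	var doc map[string]interface{}
	// step 2: JSON bytes -> generic document
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	return doc, nil
}

func main() {
	schema := map[string]interface{}{
		"type": "object",
		"properties": map[string]interface{}{
			"name": map[string]interface{}{"type": "string"},
		},
	}
	doc, err := goLoaderRoundTrip(schema)
	if err != nil {
		panic(err)
	}
	fmt.Println(doc["type"]) // the document survives, but it was built twice
}
```

Loading pre-serialized JSON bytes directly would skip step 1 entirely, which is why the GoLoader path costs more than loading from a file or string.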
Thanks for the explanation. I underestimated the difficulty of solving the generic problem. After reading about how it works, I will try to simplify my model before parsing it. It contains about 150 subschemas in its `definitions`.

About the memory usage of the GoLoader, I have the following code:

```go
goLoader := gojsonschema.NewGoLoader(data)
schemaLoader := gojsonschema.NewSchemaLoader()
schema, err := schemaLoader.Compile(goLoader)
if err != nil {
	return err
}
someMap[name] = schema
return nil
```

Do you think the […]?
Actually, the 10,000 identical references are only parsed once: the reference pool caches each definition the first time it is encountered, and every later reference reuses the cached result.

The biggest memory hog probably is the parsed JSON document itself.

I just ran a quick test to see how much memory a program uses that does nothing but load in your 3mb schema. It uses ~35mb of memory before running the garbage collector.

Is that in the same order of magnitude you are experiencing?
Good that the definitions are not parsed each time they are encountered. I realised that it makes sense to parse them only once, since we can have circular references.

I indeed have this order of magnitude when I load the schema once. However, since I compile one schema per type, the cost multiplies: ~30mb of memory usage is actually fine for me, but the more types I support, the higher the memory usage gets. Perhaps I could use something different than […]?
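The parse-once behaviour discussed above can be sketched as a tiny reference cache. `referencePool` and `subSchemaNode` here are hypothetical stand-ins for the library's internal `schemaReferencePool` and `subSchema`, not its real API:

```go
package main

import "fmt"

// subSchemaNode is a simplified stand-in for a parsed sub-schema.
type subSchemaNode struct {
	ref string
}

// referencePool caches parsed sub-schemas by their $ref string, so a
// definition referenced 10,000 times is parsed exactly once and every
// reference shares the same pointer.
type referencePool struct {
	documents map[string]*subSchemaNode
}

func (p *referencePool) get(ref string) *subSchemaNode {
	if node, ok := p.documents[ref]; ok {
		return node // cache hit: reuse the already-parsed node
	}
	node := &subSchemaNode{ref: ref}
	p.documents[ref] = node
	return node
}

func main() {
	pool := &referencePool{documents: map[string]*subSchemaNode{}}
	a := pool.get("#/definitions/Patient")
	b := pool.get("#/definitions/Patient")
	fmt.Println(a == b) // same pointer: parsed only once
}
```

Because every lookup for the same `$ref` returns the same pointer, circular references become cycles between already-cached nodes instead of infinite re-parsing.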
If you really want to go down that road, the correct approach would be to load the schema into a schema loader once and then compile the 150 schemas from that one schema loader. Currently schemas are not cached between different compilations, but if you modified the `schemaReferencePool` to be shared, they would be.

Then your final code would look something like:

```go
jsonLoader := gojsonschema.NewReferenceLoader("file:///path/to/schema/or/w/e")
schemaLoader := gojsonschema.NewSchemaLoader()
schemaLoader.AddSchemas(jsonLoader)

// range over all the properties in definitions
for i := range .... {
	schema, err := schemaLoader.Compile(gojsonschema.NewReferenceLoader("http://hl7.org/fhir/json-schema/4.0#/definitions/" + i))
}
```
Yes, I decided to create the many schemas because of #214. I simply didn't expect the memory usage to be that high 😄 Thank you very much for your long comment and the suggestion! I will try to implement that soon.
So I implemented the changes and it's great! In short, it went from 1.5 seconds and 680mb of memory allocations to 0.1s and 40mb of memory allocations (before garbage collection).

I decided to make the schemaReferencePool a singleton, because that keeps the patch quite small to maintain. Here is the patched function in schemaReferencePool.go:

```go
var schemaReferencePoolSingleton *schemaReferencePool

func newSchemaReferencePool() *schemaReferencePool {
	if schemaReferencePoolSingleton == nil {
		schemaReferencePoolSingleton = &schemaReferencePool{}
		schemaReferencePoolSingleton.documents = make(map[string]*subSchema)
	}
	return schemaReferencePoolSingleton
}
```
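As a side note, a nil-check singleton like the one above is not safe if schemas are ever compiled from multiple goroutines. A `sync.Once` variant avoids that race; this is a standalone sketch with a simplified `documents` element type, since `subSchema` is internal to the library:

```go
package main

import (
	"fmt"
	"sync"
)

// schemaReferencePool is a simplified model of the library's pool;
// the real element type (*subSchema) is internal, so a placeholder
// struct is used here.
type schemaReferencePool struct {
	documents map[string]*struct{}
}

var (
	poolOnce      sync.Once
	poolSingleton *schemaReferencePool
)

// newSchemaReferencePool returns the shared pool. sync.Once guarantees
// the lazy initialization runs exactly once even under concurrent
// callers, which a plain nil check does not.
func newSchemaReferencePool() *schemaReferencePool {
	poolOnce.Do(func() {
		poolSingleton = &schemaReferencePool{
			documents: make(map[string]*struct{}),
		}
	})
	return poolSingleton
}

func main() {
	// Both calls return the same pool instance.
	fmt.Println(newSchemaReferencePool() == newSchemaReferencePool())
}
```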
Hi,
I have a fairly large JSON schema (~3mb). For various reasons, I modify the JSON schema at runtime before loading it with gojsonschema.NewGoLoader. My changes are not very precise, and large parts of the JSON schema end up unused.

On another topic, the memory usage of the validators created by gojsonschema is very high, and I would like to reduce it.

I could try to edit my JSON schema more carefully before loading it, to reduce the memory usage. But perhaps gojsonschema already prunes unused parts, so I don't have to. Is that the case? If not, do you think it would be easy and realistic for a third-party contributor to update this library to automatically remove/free the objects created from parts of a schema that are not relevant?