Stream large off heap models #118

Open
jon-morra-zefr opened this issue Nov 26, 2016 · 0 comments

My current understanding of how Aloha handles off-heap models (currently VW and H2O models) when the model is embedded in the JSON is that it reads the entire JSON into memory, then extracts the model portion and either writes it to disk or compiles it. This requires a JVM heap roughly twice as large as necessary. I think this can be remedied by the following procedure.

  1. Pass through the model JSON and note the start and end byte offsets of the model bytes.
  2. Replace the model field in the JSON with an empty value.
  3. Instantiate the model as usual.
  4. Stream the model bytes either to disk (in the off-heap case) or to the compiler (in the H2O case).

By doing this we should need a much smaller heap when reading very large VW models, which means we can use smaller machines.
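A minimal sketch of the four steps, in Python for illustration (Aloha itself is Scala/JVM). It assumes a hypothetical JSON layout with a top-level `"model"` key holding a single base64 string with no escaped quotes; the helper names and the `"modelType"` field are made up for the example, not Aloha's actual API:

```python
import base64, json, os, tempfile

def chunked_find(f, needle, start, chunk=4096):
    """Absolute offset of `needle` at or after `start`, scanning the file
    in fixed-size chunks so it is never fully resident in memory."""
    f.seek(start)
    pos, buf = start, b""          # pos = absolute offset of buf[0]
    while True:
        data = f.read(chunk)
        if not data:
            return -1
        buf += data
        i = buf.find(needle)
        if i >= 0:
            return pos + i
        keep = len(needle) - 1     # overlap: needle may straddle chunks
        pos += len(buf) - keep
        buf = buf[len(buf) - keep:]

def find_model_span(path):
    """Step 1: note the start/end byte offsets of the base64 model string."""
    with open(path, "rb") as f:
        k = chunked_find(f, b'"model"', 0)
        if k < 0:
            raise ValueError('no "model" field')
        q1 = chunked_find(f, b'"', k + len(b'"model"'))  # value's opening quote
        q2 = chunked_find(f, b'"', q1 + 1)               # value's closing quote
        return q1 + 1, q2

def load_json_without_model(path, start, end):
    """Steps 2-3: parse the JSON with the model value blanked out; only the
    (small) head and tail of the file are held in memory, never the model."""
    with open(path, "rb") as f:
        head = f.read(start)
        f.seek(end)
        tail = f.read()
    return json.loads(head + tail)

def stream_model_to_disk(src, dst, start, end, chunk=4092):
    """Step 4: decode the base64 bytes in [start, end) straight to disk.
    chunk is a multiple of 4 so every slice decodes independently."""
    with open(src, "rb") as f, open(dst, "wb") as out:
        f.seek(start)
        remaining = end - start
        while remaining:
            piece = f.read(min(chunk, remaining))
            remaining -= len(piece)
            out.write(base64.b64decode(piece))

# demo: a JSON spec with an embedded 10 KB "model", streamed back out
payload = os.urandom(10000)
with tempfile.TemporaryDirectory() as d:
    jpath, mpath = os.path.join(d, "model.json"), os.path.join(d, "model.vw")
    with open(jpath, "w") as f:
        json.dump({"modelType": "VwJNI",
                   "model": base64.b64encode(payload).decode()}, f)
    start, end = find_model_span(jpath)
    spec = load_json_without_model(jpath, start, end)
    stream_model_to_disk(jpath, mpath, start, end)
    with open(mpath, "rb") as f:
        decoded = f.read()
```

Peak memory here is one chunk plus the JSON head/tail, independent of model size, which is the point of the proposal.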
