Stream large off heap models #118

Open
jon-morra-zefr opened this issue Nov 26, 2016 · 0 comments

My current understanding of how Aloha handles off-heap models (currently VW and H2O models) when the model is embedded in the JSON is that it reads the entire JSON into memory, then extracts the model portion and either writes it to disk or compiles it. This requires a JVM heap roughly twice as large as necessary. I think this can be remedied by the following procedure.

  1. Pass through the model JSON and note the start and end byte offsets of the model bytes.
  2. Replace the model field in the JSON with an empty value.
  3. Instantiate the model as usual.
  4. Stream the model bytes either to disk (in the off-heap case) or to the compiler (in the H2O case).

By doing this we should need a much smaller heap when reading very large VW models, which means we can use smaller machines.
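A minimal sketch of the four steps, in Python for illustration (Aloha itself is Scala/JVM). It assumes a hypothetical JSON layout with a top-level `"model"` key holding a single base64 string with no escaped quotes; the helper names and the `"modelType"` field are made up for the example, not Aloha's actual API:

```python
import base64, json, os, tempfile

def chunked_find(f, needle, start, chunk=4096):
    """Absolute offset of `needle` at or after `start`, scanning the file
    in fixed-size chunks so it is never fully resident in memory."""
    f.seek(start)
    pos, buf = start, b""          # pos = absolute offset of buf[0]
    while True:
        data = f.read(chunk)
        if not data:
            return -1
        buf += data
        i = buf.find(needle)
        if i >= 0:
            return pos + i
        keep = len(needle) - 1     # overlap: needle may straddle chunks
        pos += len(buf) - keep
        buf = buf[len(buf) - keep:]

def find_model_span(path):
    """Step 1: note the start/end byte offsets of the base64 model string."""
    with open(path, "rb") as f:
        k = chunked_find(f, b'"model"', 0)
        if k < 0:
            raise ValueError('no "model" field')
        q1 = chunked_find(f, b'"', k + len(b'"model"'))  # value's opening quote
        q2 = chunked_find(f, b'"', q1 + 1)               # value's closing quote
        return q1 + 1, q2

def load_json_without_model(path, start, end):
    """Steps 2-3: parse the JSON with the model value blanked out; only the
    (small) head and tail of the file are held in memory, never the model."""
    with open(path, "rb") as f:
        head = f.read(start)
        f.seek(end)
        tail = f.read()
    return json.loads(head + tail)

def stream_model_to_disk(src, dst, start, end, chunk=4092):
    """Step 4: decode the base64 bytes in [start, end) straight to disk.
    chunk is a multiple of 4 so every slice decodes independently."""
    with open(src, "rb") as f, open(dst, "wb") as out:
        f.seek(start)
        remaining = end - start
        while remaining:
            piece = f.read(min(chunk, remaining))
            remaining -= len(piece)
            out.write(base64.b64decode(piece))

# demo: a JSON spec with an embedded 10 KB "model", streamed back out
payload = os.urandom(10000)
with tempfile.TemporaryDirectory() as d:
    jpath, mpath = os.path.join(d, "model.json"), os.path.join(d, "model.vw")
    with open(jpath, "w") as f:
        json.dump({"modelType": "VwJNI",
                   "model": base64.b64encode(payload).decode()}, f)
    start, end = find_model_span(jpath)
    spec = load_json_without_model(jpath, start, end)
    stream_model_to_disk(jpath, mpath, start, end)
    with open(mpath, "rb") as f:
        decoded = f.read()
```

Peak memory here is one chunk plus the JSON head/tail, independent of model size, which is the point of the proposal.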
