Fix memory issues in multitask generator #57
Merged
When working with a large number of samples (≳1000), the multitask generator would typically crash with out-of-memory errors, especially when running on a GPU. The crash happens while generating the next batch of configurations to evaluate, at the point where the acquisition function is optimized. This optimization begins by evaluating the acquisition function at 1000 points in a single batch (by default), which can produce a matrix larger than the available memory.
This PR adds the option of evaluating these points in several smaller batches, so that each batch actually fits in memory. This is done by passing the `init_batch_limit` option to `m.gen`. Initially, the generator tries to optimize the acquisition function with `init_batch_limit=1000`. If this triggers an out-of-memory error, `init_batch_limit` is divided by 2 and a new attempt is made. This process is repeated until the generator runs successfully.

Related issues: pytorch/botorch#366, cornellius-gp/gpytorch#647, cornellius-gp/gpytorch#1978.