llama : fix memory leak in llama_batch_free (ggml-org#5252)
`llama_batch_init` allocates memory for a fixed number of tokens, but `llama_batch_free` only frees memory for the number of tokens that were actually added to the batch. This commit uses a null-terminated array for the batch `seq_id` and frees all elements until the null pointer is reached.

This commit also renames the first parameter from `n_tokens` to `n_tokens_alloc` to more clearly indicate that this value is the number of tokens allocated to the batch, not the number of tokens in the batch.
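A minimal sketch of the pattern this commit describes, using a simplified struct: `llama_batch_sketch`, `batch_init`, and `batch_free` are illustrative names, and the real `llama_batch` in llama.cpp carries additional fields beyond `seq_id`.

```cpp
#include <cstdint>
#include <cstdlib>

typedef int32_t llama_seq_id;

// Simplified stand-in for llama_batch, reduced to the field relevant here.
struct llama_batch_sketch {
    llama_seq_id ** seq_id; // null-terminated array of per-token seq_id arrays
};

llama_batch_sketch batch_init(int32_t n_tokens_alloc, int32_t n_seq_max) {
    llama_batch_sketch batch = {};
    // Allocate one extra slot and null-terminate it, so the free routine
    // can stop without knowing n_tokens_alloc.
    batch.seq_id = (llama_seq_id **) malloc(sizeof(llama_seq_id *) * (n_tokens_alloc + 1));
    for (int32_t i = 0; i < n_tokens_alloc; ++i) {
        batch.seq_id[i] = (llama_seq_id *) malloc(sizeof(llama_seq_id) * n_seq_max);
    }
    batch.seq_id[n_tokens_alloc] = nullptr;
    return batch;
}

void batch_free(llama_batch_sketch & batch) {
    if (batch.seq_id) {
        // Walk until the null terminator, freeing every allocated element
        // regardless of how many tokens were actually added to the batch.
        for (int32_t i = 0; batch.seq_id[i] != nullptr; ++i) {
            free(batch.seq_id[i]);
        }
        free(batch.seq_id);
    }
}
```

The extra sentinel slot is the key design choice: it makes the free path independent of how many tokens the caller filled in, which is what closed the leak.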