Handle the model unload requests in model adapter controller #42

Closed
Jeffwan opened this issue Jul 25, 2024 · 3 comments · Fixed by #152
Labels: area/lora, kind/feature, priority/critical-urgent

Jeffwan (Collaborator) commented Jul 25, 2024

🚀 Feature Description and Motivation

After a model adapter is deleted, we should unload it from the pod. Otherwise, the LoRA adapter stays loaded until the GC evicts it. We definitely prefer the eager approach over the lazy one here.

We need to respond to the deletion event and send an unload request to the corresponding pod. There's no way to update the model adapter at that point.

BTW, we could also leverage a finalizer for this. .metadata.deletionTimestamp is set first, and the finalizer can then unload the model before the object is actually removed.
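
A rough sketch of the finalizer flow described above, written as plain Go with no Kubernetes dependencies so it stays self-contained. Everything named here is an assumption for illustration: the `ModelAdapter` struct, its `Pods` field, the `Unloader` interface, the finalizer name, and `requestDeletion` (which only imitates the API server stamping the object). A real controller would work against the actual CRD types and client instead.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// unloadFinalizer is a made-up finalizer name used only for this sketch.
const unloadFinalizer = "example.com/unload-lora"

// ErrUnloadIncomplete signals that some pods still hold the adapter.
var ErrUnloadIncomplete = errors.New("adapter unload incomplete")

// ModelAdapter is a simplified stand-in for the real custom resource.
type ModelAdapter struct {
	Name              string
	DeletionTimestamp *time.Time // mirrors .metadata.deletionTimestamp
	Finalizers        []string   // mirrors .metadata.finalizers
	Pods              []string   // pods the adapter is loaded on (hypothetical field)
}

// Unloader asks a pod to drop an adapter; the transport is left open.
type Unloader interface {
	Unload(pod, adapter string) error
}

// UnloadFunc lets a plain function satisfy Unloader.
type UnloadFunc func(pod, adapter string) error

// Unload calls the wrapped function.
func (f UnloadFunc) Unload(pod, adapter string) error { return f(pod, adapter) }

// hasFinalizer reports whether the sketch's finalizer is present.
func hasFinalizer(a *ModelAdapter) bool {
	for _, f := range a.Finalizers {
		if f == unloadFinalizer {
			return true
		}
	}
	return false
}

// requestDeletion imitates what the API server does on delete: it only
// stamps the object, which stays around while finalizers remain.
func requestDeletion(a *ModelAdapter) {
	now := time.Now()
	a.DeletionTimestamp = &now
}

// Reconcile registers the finalizer on live objects and, once deletion is
// requested, unloads the adapter eagerly before releasing the finalizer.
func Reconcile(a *ModelAdapter, u Unloader) error {
	if a.DeletionTimestamp == nil {
		if !hasFinalizer(a) {
			a.Finalizers = append(a.Finalizers, unloadFinalizer)
		}
		return nil
	}
	if !hasFinalizer(a) {
		return nil // cleanup already happened
	}
	var remaining []string
	for _, pod := range a.Pods {
		if err := u.Unload(pod, a.Name); err != nil {
			remaining = append(remaining, pod) // retry on the next reconcile
		}
	}
	a.Pods = remaining
	if len(remaining) > 0 {
		return fmt.Errorf("%w: %d pod(s) still hold %q",
			ErrUnloadIncomplete, len(remaining), a.Name)
	}
	// All pods are clean: drop the finalizer so the object can be removed.
	kept := a.Finalizers[:0]
	for _, f := range a.Finalizers {
		if f != unloadFinalizer {
			kept = append(kept, f)
		}
	}
	a.Finalizers = kept
	return nil
}
```

The finalizer is only released after every pod has confirmed the unload, so a failed request keeps the object around and gets retried instead of silently falling back to GC eviction.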

Use Case

No response

Proposed Solution

No response

Alternatives Considered

No response

Additional Context

No response

Jeffwan added the area/lora and kind/feature labels Jul 25, 2024
Jeffwan modified the milestones: v0.1.0-rc.0, v0.1.0-rc.1 Jul 25, 2024
Jeffwan self-assigned this Aug 29, 2024
Jeffwan (Collaborator, Author) commented Sep 5, 2024

Some edge cases related to unload:

  1. Pod deletion -> we should update the LoRA CRD status
  2. CRD deletion -> finalizer -> clean up the LoRA adapter on the pods
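
For the pod deletion case, a minimal sketch of the status update could look like the following. The `AdapterStatus` shape, its `Instances` field, and the phase strings are hypothetical and not taken from the actual CRD; the point is only that a deleted pod gets pruned from the status and the phase reflects whether any pod still serves the adapter.

```go
package main

// AdapterStatus is a simplified, hypothetical status for a LoRA adapter resource.
type AdapterStatus struct {
	Instances []string // pods that currently have the adapter loaded
	Phase     string   // "Running" while any pod serves it, otherwise "Pending"
}

// OnPodDeleted prunes a deleted pod from the status and reports whether
// the status changed, so the caller knows if an update must be written.
func OnPodDeleted(s *AdapterStatus, pod string) bool {
	kept := make([]string, 0, len(s.Instances))
	for _, p := range s.Instances {
		if p != pod {
			kept = append(kept, p)
		}
	}
	changed := len(kept) != len(s.Instances)
	s.Instances = kept
	if len(kept) == 0 {
		// No pod serves the adapter any more; it needs to be rescheduled.
		s.Phase = "Pending"
	}
	return changed
}
```

Returning whether anything changed lets the controller skip the status write for pods that never hosted the adapter.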

Jeffwan added the priority/critical-urgent label Sep 5, 2024
Jeffwan changed the title from "Handle the model unload requests" to "Handle the model unload requests in model adapter controller" Sep 5, 2024
varungup90 (Collaborator) commented

I can take this task if no one is actively working on it.

Jeffwan (Collaborator, Author) commented Sep 5, 2024

@varungup90 I am working on some improvements for the model adapter. Please start this one later to avoid merge conflicts.
