SiglipVisionModel does not support `device_map="auto"`: no `_no_split_modules` attribute #31400
Comments
Hi @lucasjinreal, thanks for opening a feature request! Could you share a code snippet of how the model is being created with `device_map="auto"`?
I checked with the code in the model doc for SigLIP and also got an error. Supporting `device_map` in VLMs is indeed important.
I have gotten past this error by simply adding a `_no_split_modules = []` attribute. But it would be better to add it inside transformers; it's just a single line. I could submit a PR for this. As for flash-attn, it's really needed: it can make VLM training much faster.
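The workaround above is a one-line class attribute. A minimal sketch with stand-in classes, so it runs without transformers installed (in practice the parent class would be `SiglipVisionModel`, and both class names here are invented):

```python
class VisionModelBase:
    # Stand-in for transformers' SiglipVisionModel: on PreTrainedModel
    # the default is _no_split_modules = None, and the device-map
    # machinery raises when it is left as None.
    _no_split_modules = None


class VisionModelSplit(VisionModelBase):
    # Overriding with an empty list declares that every submodule may be
    # placed independently when the device map is computed.
    _no_split_modules = []


print(VisionModelSplit._no_split_modules)
```

With the real classes, the subclass would then be loaded via `from_pretrained(..., device_map="auto")` as usual.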
@lucasjinreal cool, a PR would be nice, but you need to test in a multi-GPU setting that everything is being split correctly. I don't think an empty `_no_split_modules` will work, since the most similar model, CLIP, doesn't split at some modules. If you don't have multiple GPUs, I can run some tests after the PR is open :) Flash-Attn noted, thanks, will add it to my todo list!
I have been using an empty list and it makes `device_map="auto"` work on multiple GPUs; inference is currently normal. I still don't know why CLIPVisionModel keeps CLIPEncoderLayer from being auto-mapped, though.
@lucasjinreal I just noticed that SigLip already has it. LMK if you're up to opening a PR :)
Hi, in my case I just used SiglipVisionModel as a parent class, with a `SiglipVisionModelSplit(SiglipVisionModel)` in my MLLM. So I think that isn't applicable inside transformers. Let me think of a better way to do this.
I believe the best solution is to copy the `_no_split_modules` that are already defined in the text and vision components, and add them to SiglipModel's `_no_split_modules`.
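That suggestion can be sketched in plain Python. The class names and module lists below are stand-ins for illustration, not the actual values in transformers:

```python
class TextComponent:
    # Stand-in for the text tower's no-split list (values invented).
    _no_split_modules = ["SiglipTextEmbeddings", "SiglipEncoderLayer"]


class VisionComponent:
    # Stand-in for the vision tower's no-split list (values invented).
    _no_split_modules = ["SiglipVisionEmbeddings", "SiglipEncoderLayer"]


class SiglipModelSketch:
    # The combined model takes the union of its components' lists,
    # deduplicated, so both towers split at the same boundaries.
    _no_split_modules = sorted(
        set(TextComponent._no_split_modules)
        | set(VisionComponent._no_split_modules)
    )


print(SiglipModelSketch._no_split_modules)
```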
Feature request
Make SiglipVisionModel support `device_map="auto"` so it can be mapped across multiple GPUs.
Motivation
Currently, when using an MLLM with SigLIP, the whole model may need `device_map="auto"`. Since the vision encoder part is SiglipVisionModel, it would be a big problem if that class doesn't support it.
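For intuition, `device_map="auto"` amounts to walking the model's modules in order and assigning each one wholly to a device, never splitting inside a class listed in `_no_split_modules`. A toy sketch of that balancing step (all names invented; the real logic lives in accelerate):

```python
def infer_device_map(modules, num_devices):
    """Toy version of device-map inference: spread modules across
    devices by size, assigning each module wholly to one device.

    `modules` is a list of (qualified_name, size) pairs; a module that
    must not be split would appear here as a single entry rather than
    as its children.
    """
    total = sum(size for _, size in modules)
    per_device = total / num_devices
    device_map, dev, used = {}, 0, 0
    for name, size in modules:
        # Move to the next device once the current one is "full".
        if used + size > per_device and dev < num_devices - 1:
            dev, used = dev + 1, 0
        device_map[name] = dev
        used += size
    return device_map


# Four equally sized encoder layers spread over two GPUs:
print(infer_device_map([(f"encoder.layer.{i}", 1) for i in range(4)], 2))
```

An empty `_no_split_modules` means every leaf module is a candidate entry here, which is why it can unblock `device_map="auto"` but may cut a layer in the wrong place.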
Also, would you consider adding FlashAttention support for SigLIP?
Your contribution
None for now