
SiglipVisionModel does not support `device_map="auto"`: no `_no_split_modules` attribute #31400

Closed
lucasjinreal opened this issue Jun 13, 2024 · 8 comments · Fixed by #31566
Labels: Feature request (Request for a new feature)

Comments

@lucasjinreal

Feature request

Make SiglipVisionModel support `device_map="auto"` so it can be mapped across multiple GPUs.

Motivation

Currently, when using an MLLM with Siglip, the whole model may need `device_map="auto"`. Since the vision encoder part is SiglipVisionModel, it would be a big problem if it doesn't support this.

Also, would you consider adding FlashAttention support for Siglip?

Your contribution

None for now

@amyeroberts
Collaborator

Hi @lucasjinreal, thanks for opening a feature request!

Could you share a code snippet of how the model is being created with `auto_map` and the running environment (run `transformers-cli env` in the terminal and copy-paste the output)? SigLip should support `device_map="auto"`.
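
For reference, a minimal reproduction of the reported error would look something like this (the checkpoint and setup are illustrative, not taken from the original report):

```python
# Illustrative reproduction, assuming the google/siglip-base-patch16-224 checkpoint
# and a multi-GPU machine; not the reporter's exact script.
from transformers import SiglipVisionModel

model = SiglipVisionModel.from_pretrained(
    "google/siglip-base-patch16-224",
    device_map="auto",  # fails if the class does not define `_no_split_modules`
)
```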

@zucchini-nlp
Member

I checked with the code in the model doc for SigLip and also got an error. Supporting `device_map` in VLMs is indeed important. I believe `_no_split_modules` should be the same as in CLIPModel.

For FlashAttention, afaik current VLMs in transformers use optimized attention implementations only for the LLM backbone (e.g. LLaVa supports Flash-Attn and SDPA even though CLIP doesn't). There's an issue for adding SDPA attention (#30565) to all VLMs; I can open another tracker issue for Flash-Attn but won't be able to work on it right now. Open to community contributions.
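
For context, `_no_split_modules` is the class attribute Accelerate consults when building an automatic device map: every module class listed there is treated as an atomic block that must live on a single device. A conceptual sketch (class and module names below are made up for illustration, not the actual transformers source):

```python
# Conceptual sketch: how a model class advertises which submodules must not be
# split across devices. Names are hypothetical, not the real Siglip/CLIP code.
from transformers import PreTrainedModel

class MyVisionModel(PreTrainedModel):
    # Each listed class is kept whole when the device map is inferred, so e.g. an
    # entire encoder layer (attention + MLP + residual) stays on one GPU.
    _no_split_modules = ["MyEncoderLayer"]
```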

@lucasjinreal
Author

lucasjinreal commented Jun 13, 2024

I have worked around this error by simply adding a `_no_split_modules = []` attribute.

But it would be better to add it inside transformers; it's just a single line. I could submit a PR for this.

As for FlashAttention, it's really needed; it can make VLM training much faster.
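
Concretely, the workaround described above amounts to something like this (a sketch; the checkpoint name is illustrative):

```python
# Sketch of the workaround: patch the attribute onto the class before loading.
# An empty list means Accelerate may split the model at any module boundary.
from transformers import SiglipVisionModel

SiglipVisionModel._no_split_modules = []
model = SiglipVisionModel.from_pretrained(
    "google/siglip-base-patch16-224",  # illustrative checkpoint
    device_map="auto",
)
```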

@zucchini-nlp
Member

@lucasjinreal cool, a PR would be nice, but you need to test in a multi-GPU setting that everything is being split correctly. I don't think an empty `_no_split_modules` will work, as the most similar model, CLIP, doesn't split at some modules. If you don't have multiple GPUs, I can run some tests after the PR is open :)
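
A quick way to verify the split on a multi-GPU machine is to inspect `hf_device_map` and run a forward pass (a sketch; the checkpoint and input shape are illustrative):

```python
# Sketch of a multi-GPU sanity check for device_map="auto".
import torch
from transformers import SiglipVisionModel

model = SiglipVisionModel.from_pretrained(
    "google/siglip-base-patch16-224",  # illustrative checkpoint
    device_map="auto",
)
print(model.hf_device_map)  # which device each submodule was placed on

# A forward pass should complete without device-mismatch errors if the split is valid.
pixel_values = torch.randn(1, 3, 224, 224).to("cuda:0")
outputs = model(pixel_values=pixel_values)
print(outputs.last_hidden_state.shape)
```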

Flash-Attn noted, thanks, will add to my todo list!

@lucasjinreal
Author

I have been using an empty list and it makes `device_map="auto"` work on multiple GPUs; inference is currently normal. I still don't know why CLIPVisionModel keeps CLIPEncoderLayer from being auto-mapped, though.

@zucchini-nlp
Member

zucchini-nlp commented Jun 17, 2024

@lucasjinreal I just noticed that SigLip already has `_no_split_modules` in the TextModel and in the VisionModel, yet not in SiglipModel. If I set `_no_split_modules = []` as you tried, a device mismatch error is raised, so we have to add the text and vision models' `_no_split_modules` to enable it.

LMK if you're up to opening a PR :)

@lucasjinreal
Author

Hi, in my case I just use SiglipVisionModel as a parent class and use a SiglipVisionModelSplit(SiglipVisionModel) in my MLLM.

So I think that isn't applicable inside transformers. Let me think of a better way to do this.
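
The subclassing approach mentioned above would look roughly like this (a sketch; the subclass name comes from the comment, while the `_no_split_modules` value is an assumption):

```python
# Sketch of the subclass workaround; the module list is an assumption,
# mirroring what encoder-based vision models typically declare.
from transformers import SiglipVisionModel

class SiglipVisionModelSplit(SiglipVisionModel):
    # Keep each encoder layer on a single device when Accelerate builds the map.
    _no_split_modules = ["SiglipEncoderLayer"]
```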

@zucchini-nlp
Member

I believe the best solution is to copy the `_no_split_modules` that are already defined in the text and vision components and add them to SiglipModel's `_no_split_modules`.
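
In code, that proposal might look something like the following (a sketch of the idea, not the merged fix in #31566; the module names are assumptions based on the discussion above):

```python
# Sketch of the proposed fix: SiglipModel declares the union of the no-split
# modules already listed on its text and vision submodels. Names are assumptions.
class SiglipModel(SiglipPreTrainedModel):
    _no_split_modules = [
        "SiglipTextEmbeddings",    # from the text model
        "SiglipEncoderLayer",      # shared encoder layer
        "SiglipVisionEmbeddings",  # from the vision model
    ]
```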
