
Load safetensors directly to cuda #2445

Closed
Daniel-Kelvich opened this issue Feb 21, 2023 · 9 comments

Comments

@Daniel-Kelvich

As far as I know, there is no way right now to load a model from a safetensors file directly to CUDA; you always have to load it to CPU first. The safetensors library supports loading directly to CUDA, so it shouldn't be hard to add this functionality to diffusers pipelines.

The interface could look like this (just specify the device in the init function):
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16, device='cuda:0')

@sayakpaul
Member

Cc: @pcuenca

@patrickvonplaten
Contributor

Hey @Daniel-Kelvich,

It should be possible since this PR: huggingface/accelerate#1028

Can you make sure to upgrade accelerate:

pip install --upgrade accelerate

and then you can load the model directly on GPU with safetensors:

pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

Can you check if this works?

@Daniel-Kelvich
Author

Hi @patrickvonplaten,

For me it either loads to CPU or uses torch.load().

Here are the library versions and a minimal snippet to reproduce:

accelerate==0.16.0
diffusers==0.13.1
torch==1.13.1

import torch
from diffusers import StableDiffusionPipeline
import time

def _raise():
    raise RuntimeError("I don't want to use pickle")
torch.load = lambda *args, **kwargs: _raise()

t1=time.time()
model_id = "dreamlike-art/dreamlike-diffusion-1.0"
pipe = StableDiffusionPipeline.from_pretrained(model_id, device_map='auto', torch_dtype=torch.float16)
print(f'{time.time()-t1} sec')

print(pipe.device)

@Daniel-Kelvich
Author

Can you please provide a minimal working example of loading safetensors to GPU?

@patrickvonplaten
Contributor

Hey @Daniel-Kelvich,

Can you make sure to have safetensors installed as well? Note that if safetensors is not installed, the weights will automatically be loaded with torch.load.

Once the weights are downloaded, the following code snippet:

import torch
from diffusers import StableDiffusionPipeline
import time

def _raise():
    raise RuntimeError("I don't want to use pickle")
torch.load = lambda *args, **kwargs: _raise()

t1=time.time()
model_id = "dreamlike-art/dreamlike-diffusion-1.0"
pipe = StableDiffusionPipeline.from_pretrained(model_id, device_map='auto', torch_dtype=torch.float16)
print(f'{time.time()-t1} sec')

print(pipe.device)

currently sadly yields an error:

│ /home/patrick_huggingface_co/python_bin/accelerate/utils/modeling.py:670 in load_state_dict      │
│                                                                                                  │
│   667 │   │   with safe_open(checkpoint_file, framework="pt") as f:                              │
│   668 │   │   │   metadata = f.metadata()                                                        │
│   669 │   │   │   weight_names = f.keys()                                                        │
│ ❱ 670 │   │   if metadata.get("format") not in ["pt", "tf", "flax"]:                             │
│   671 │   │   │   raise OSError(                                                                 │
│   672 │   │   │   │   f"The safetensors archive passed at {checkpoint_file} does not contain t   │
│   673 │   │   │   │   "you save your model with the `save_pretrained` method."                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'NoneType' object has no attribute 'get'

However, this is fixed in: huggingface/accelerate#1151

Using this PR the above code runs as expected.

4.564812898635864 sec
cuda:0
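The `AttributeError` above comes from the safetensors file layout itself: a file starts with an 8-byte little-endian header length followed by a JSON header, and `f.metadata()` returns the optional `__metadata__` entry of that header, which is `None` when a file was saved without it. A stdlib-only sketch of that layout (the tensor name and values here are invented):

```python
import json
import struct

# Build a minimal safetensors-style file by hand: one float32 tensor of shape [2].
data = struct.pack("<2f", 1.0, 2.0)  # 8 bytes of raw tensor data
header = {
    # Files written without "__metadata__" are the ones that make
    # metadata.get("format") raise AttributeError on None.
    "__metadata__": {"format": "pt"},
    "w": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]},
}
header_bytes = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + data

# Parse it back the way a loader would.
(n,) = struct.unpack("<Q", blob[:8])
parsed = json.loads(blob[8:8 + n])
metadata = parsed.get("__metadata__")  # None if the key is absent
print(metadata.get("format") if metadata else None)
```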

@Daniel-Kelvich
Author

Daniel-Kelvich commented Mar 6, 2023

@patrickvonplaten Hi! I still get this error, even though I have updated the deps.

│   667 │   │   with safe_open(checkpoint_file, framework="pt") as f:                              │
│   668 │   │   │   metadata = f.metadata()                                                        │
│   669 │   │   │   weight_names = f.keys()                                                        │
│ ❱ 670 │   │   if metadata.get("format") not in ["pt", "tf", "flax"]:                             │
│   671 │   │   │   raise OSError(                                                                 │
│   672 │   │   │   │   f"The safetensors archive passed at {checkpoint_file} does not contain t   │
│   673 │   │   │   │   "you save your model with the `save_pretrained` method."  
diffusers==0.15.0.dev0
accelerate==0.17.0.dev0
safetensors==0.3.0

@patrickvonplaten
Contributor

Hey @Daniel-Kelvich,

Can you try again after running:

pip uninstall accelerate
pip install git+https://github.com/huggingface/accelerate.git

?

@Daniel-Kelvich
Author

It seems to work now, but it is pretty slow, so there's no point in loading a model directly to GPU.

@patrickvonplaten
Contributor

Yeah, I'm also not 100% sure in which use cases it improves performance.
