Saving external data for large ONNX models #255
Conversation
The documentation is not available anymore as the PR was closed or merged.
With the latest commit, we're now able to do:

```python
model = ORTModelForCausalLM.from_pretrained(
    model_ckpt,
    use_auth_token=True,
    from_transformers=True,
    cache_dir="model_cache",
    onnx_cache_dir="./onnx_cache",  # saves the ONNX model (with external data if it is a large model) to "./onnx_cache"
)
```

and:

```python
model = ORTModelForCausalLM.from_pretrained(
    model_ckpt,
    use_auth_token=True,
    from_transformers=True,
    cache_dir="model_cache",  # like the previous behaviour, where `onnx_cache_dir` = `cache_dir`
)
```
The following should be working now:

```python
# load a small ONNX model
model = ORTModelForCausalLM.from_pretrained("nouamanetazi/bloom-small-testing-onnx", use_auth_token=True)

# load a large ONNX model (>2GB) by specifying the folder containing the model's weights
model = ORTModelForCausalLM.from_pretrained("nouamanetazi/bloom-350m-onnx-folder", use_auth_token=True, onnx_folder="onnx")
```

Example of uploading a large ONNX model (>2GB) to the Hub:

```python
from pathlib import Path
import shutil

from transformers import AutoTokenizer
from huggingface_hub import HfApi
from optimum.onnxruntime import ORTModelForCausalLM

model_ckpt = "bigscience/bloom-350m"
save_path = Path(f"saved_model/{model_ckpt}")
save_path.mkdir(parents=True, exist_ok=True)

tokenizer = AutoTokenizer.from_pretrained(model_ckpt, use_auth_token=True)
model = ORTModelForCausalLM.from_pretrained(
    model_ckpt,
    use_auth_token=True,
    from_transformers=True,
    onnx_cache_dir="./onnx_cache",  # saves the ONNX model to "./onnx_cache"
)

# save to a local folder
model.save_pretrained(save_path / "onnx")
shutil.move(save_path / "onnx" / "config.json", save_path / "config.json")
tokenizer.save_pretrained(save_path)

# push to the Hub
repo_id = "nouamanetazi/bloom-350m-onnx-folder-test"
api = HfApi()
api.create_repo(repo_id=repo_id, exist_ok=True)
api.upload_folder(folder_path=save_path, repo_id=repo_id, path_in_repo=".", repo_type="model")
```
Can you make sure tests are passing and style is correct?
We can now save/load large ONNX models.
Awesome! I think it would be great to add tests, essentially that saving/reloading works well, in the encoder-only and encoder-decoder cases.
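For illustration, a minimal round-trip test for the seq2seq case might look like the sketch below (the tiny checkpoint name and the pytest `tmp_path` fixture are illustrative, not taken from this PR):

```python
from optimum.onnxruntime import ORTModelForSeq2SeqLM

def test_save_and_reload_seq2seq(tmp_path):
    # export a tiny seq2seq model to ONNX, save it, then reload from disk
    model = ORTModelForSeq2SeqLM.from_pretrained(
        "hf-internal-testing/tiny-random-t5", from_transformers=True
    )
    model.save_pretrained(tmp_path)
    reloaded = ORTModelForSeq2SeqLM.from_pretrained(tmp_path)
    assert reloaded is not None
```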
optimum/onnxruntime/utils.py (outdated)

```python
import onnx
from onnx.external_data_helper import ExternalDataInfo, _get_initializer_tensors

model_paths = src_file_names.copy()
```
It's a list of `Path`s, so I don't think it copies anything here; we might as well start from a new empty list and fill it.
I checked without the `copy()`, and the extending does modify the list `src_file_names` unfortunately :/
Yes, my point is that you can create an empty list, `model_paths = []`, and fill it as you go. My point here is that `src_file_names[0]` will be the same instance as `model_paths[0]`.
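For illustration, here is what a shallow `list.copy()` actually does (a standalone snippet, not code from the PR):

```python
from pathlib import Path

src_file_names = [Path("model.onnx")]
model_paths = src_file_names.copy()  # shallow copy: a new list, same elements

print(model_paths[0] is src_file_names[0])  # True: both lists share the same Path instance
model_paths.append(Path("extra.onnx"))      # appending to the copy...
print(len(src_file_names))                  # 1: ...leaves the original list untouched
```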
I'm sorry, I don't quite get it. I only use `model_paths` to iterate over the inputted `src_file_names`, and then I keep filling `src_file_names`:

```python
model_paths = []
for model_path in model_paths:
    # load model graph
    model = onnx.load(str(model_path), load_external_data=False)
    # filter out tensors that are not external data
    model_tensors = _get_initializer_tensors(model)
    model_tensors_ext = [
        ExternalDataInfo(tensor).location
        for tensor in model_tensors
        if tensor.HasField("data_location") and tensor.data_location == onnx.TensorProto.EXTERNAL
    ]
    src_paths.extend([model_path.parent / tensor_name for tensor_name in model_tensors_ext])
    dst_file_names.extend(model_tensors_ext)
return src_paths, dst_file_names
```

So this shouldn't work.
For the tests, it would be cool if we could enforce saving a small model in the external data format. I looked quickly for a way, but there doesn't seem to be an easy one to bypass the 2GB protobuf file limit.
You could use the following API to convert a small model to external data: converting-an-onnx-model-to-external-data. The size threshold has to be set low so that it actually creates the files.
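A sketch of that conversion, following the pattern in the linked ONNX documentation (the file names are placeholders):

```python
import onnx
from onnx.external_data_helper import convert_model_to_external_data

onnx_model = onnx.load("model.onnx")
# A size_threshold of 0 forces even tiny tensors out of the protobuf
# and into the external data file, so a small model works for testing.
convert_model_to_external_data(
    onnx_model,
    all_tensors_to_one_file=True,
    location="model.onnx_data",
    size_threshold=0,
)
onnx.save_model(onnx_model, "model_with_external_data.onnx")
```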
Hi @NouamaneTazi, thanks for the PR. It would require a small change to handle one more use case for the […]. Taking an example of […], a possible fix is to save them in folders like […].
We should probably do the same in exporters, actually.
I'm trying to write tests for saving/loading with external data, but it's not as trivial as it seems.

```python
model = ORTModelForSeq2SeqLM.from_pretrained(self.ONNX_SEQ2SEQ_MODEL_ID, use_cache=True)
model.save_pretrained(tmpdirname)

# load model proto
onnx_model = onnx.load(str(model.model_path))

# save external data
os.makedirs(str(model.model_path.parent / "external_data"), exist_ok=True)
onnx.save_model(
    onnx_model,
    str(model.model_path.parent / "external_data" / "model.onnx"),
    save_as_external_data=True,
    all_tensors_to_one_file=False,
    size_threshold=8,
    convert_attribute=False,
)

# need to do this for encoder/decoder/decoder_with_past
```

But again, this wouldn't test our […]. I'm open to suggestions, or else we can merge this for now.
LGTM!
@NouamaneTazi Why not use actual >2GB models, randomly initialized and saved from transformers (so no download time)? Then there would be no need for custom logic.
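A sketch of what that could look like: instantiate a model from a config alone, which gives random weights with no weight download (the config tweaks below are illustrative, not from this thread):

```python
from transformers import AutoConfig, AutoModelForCausalLM

# from_config() initializes random weights; only the small config is fetched.
config = AutoConfig.from_pretrained("gpt2")
config.n_embd = 2048   # illustrative: inflate dimensions until the export exceeds 2GB
config.n_layer = 24
config.n_head = 16     # must divide n_embd
model = AutoModelForCausalLM.from_config(config)
model.save_pretrained("big_random_gpt2")  # roughly 1.2B params, ~5GB in fp32
```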
@fxmarty Yes, definitely! I can use a randomly initialized model, but it seems there's no exposed API to load, for example, […]
You can do […]
@NouamaneTazi

```python
model_ckpt = "facebook/mbart-large-en-ro"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
```

Log: […]

Environment: […]
I have a similar issue, mentioned here: #589 (comment)
Migrated this PR to #586
What does this PR do?
Fixes #254 and #377
We can now load and save ORT models that have external data 🚀