
Can't export GPT2-XL to ONNX: ModelProto exceeds maximum protobuf size of 2GB #4707

Closed
klimentij opened this issue Aug 4, 2020 · 5 comments

klimentij commented Aug 4, 2020

Describe the bug
I use the onnxruntime/onnxruntime/python/tools/transformers/benchmark_gpt2.py script to benchmark GPT2-XL (1.5B parameters), export it to ONNX, and apply optimizations:

python benchmark_gpt2.py \
--model_name "gpt2-xl" \
--cache_dir "./cache_models" \
--onnx_dir="./gpt2_xl_onnx_past" \
--test_times 10 \
--precision fp16 \
--optimize_onnx \
--use_gpu \
--batch_sizes "1" \
--past_sequence_lengths 1000 \
--result_csv gpt2_results.csv

When I use it for gpt2-large, it works without a problem. When I switch model_name to gpt2-xl, it shows that optimizations are being applied, but it fails to save the optimized model to disk:

...
Output model to ./gpt2_xl_onnx_past/_past_fp16.onnx
Traceback (most recent call last):
  File "benchmark_gpt2.py", line 258, in <module>
    main()
  File "benchmark_gpt2.py", line 152, in main
    model.config.num_attention_heads, model.config.hidden_size)
  File "/home/jupyter/onnxruntime/onnxruntime/python/tools/transformers/gpt2_helper.py", line 252, in optimize_onnx
    m.save_model_to_file(optimized_model_path)
  File "/home/jupyter/onnxruntime/onnxruntime/python/tools/transformers/onnx_model.py", line 668, in save_model_to_file
    save_model(self.model, output_path, format=None)
  File "/opt/conda/lib/python3.7/site-packages/onnx/__init__.py", line 186, in save_model
    s = _serialize(proto)
  File "/opt/conda/lib/python3.7/site-packages/onnx/__init__.py", line 67, in _serialize
    result = proto.SerializeToString()
ValueError: Message ONNX_REL_1_7.ModelProto exceeds maximum protobuf size of 2GB: 3276141735 

I also added use_external_data_format=True to the torch.onnx.export() call in gpt2_helper.py and expected that to help, but it did not. I can't use the other scripts (benchmark.py) because I need GPT2 to be exported with past state support.
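
Concretely, a minimal toy sketch of that flag (the real call in gpt2_helper.py exports the GPT-2 model with its past-state inputs; the tiny model here is only for illustration):

    import torch

    # Toy stand-in for GPT-2. use_external_data_format=True tells PyTorch to
    # write the weights to external files next to the .onnx file instead of
    # embedding them in the protobuf message.
    model = torch.nn.Linear(4, 4)
    dummy_input = torch.randn(1, 4)

    torch.onnx.export(model, dummy_input, "model.onnx",
                      use_external_data_format=True)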

Urgency
I'm blocked on my current GPT2 deployment project because of this issue. Without ONNX optimizations, the model is approximately 4x slower and more expensive to run in our production.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Debian 9
  • ONNX Runtime installed from (source or binary): Binary
  • ONNX Runtime version: 1.4.0
  • Python version: 3.7.6
  • CUDA/cuDNN version: 10.1
  • GPU model and memory: Nvidia T4 (16 GB) + 51 GB RAM
  • PyTorch version: 1.5.0+cu101 (must support the use_external_data_format flag in torch.onnx.export())

To Reproduce

  1. Add use_external_data_format=True to the torch.onnx.export() call in gpt2_helper.py
  2. Add gpt2-xl to the PRETRAINED_MODELS list in benchmark_gpt2.py
  3. Run:
python benchmark_gpt2.py \
--model_name "gpt2-xl" \
--cache_dir "./cache_models" \
--onnx_dir="./gpt2_xl_onnx_past" \
--test_times 10 \
--precision fp16 \
--optimize_onnx \
--use_gpu \
--batch_sizes "1" \
--past_sequence_lengths 1000 \
--result_csv gpt2_results.csv

Expected behavior
gpt2-xl is benchmarked and exported to ONNX without errors.

liqunfu (Contributor) commented Aug 5, 2020

According to onnx.save_model, if you pass a path for 'f', it will save external tensors, which avoids the 2 GB limit. Could you give onnx_model_path a folder name and see what happens?

klimentij (Author) commented

@liqunfu unfortunately it didn't seem to help. I edited the onnx_model.py file so that the save_model_to_file function passes a folder name instead of a file path to onnx.save_model:

def save_model_to_file(self, output_path):
    output_folder = os.path.dirname(output_path)

    logger.info(f"Output model to {output_path}")
    logger.info(f"Output model to folder {output_folder}")

    if output_path.endswith(".json"):
        assert isinstance(self.model, ModelProto)
        with open(output_path, "w") as out:
            out.write(str(self.model))
    else:
        # Pass the folder instead of the file path, per the suggestion above.
        save_model(self.model, output_folder, format=None)
        # Also tried converting the tensors to external data before serializing:
        # external_data_helper.convert_model_to_external_data(self.model, all_tensors_to_one_file=True, location=output_path + ".data")
        # with open(output_path, "wb") as out:
        #     out.write(self.model.SerializeToString())

I also made sure f is the second parameter in save_model:

save_model(proto, f, format=None)

According to the log, this method is indeed called and output_folder is indeed a folder, but it didn't help; the traceback shows that save_model still serializes the whole ModelProto to a string:

Output model to ./gpt2_xl_onnx_past/_past_fp16.onnx
Output model to folder ./gpt2_xl_onnx_past
Traceback (most recent call last):
  File "benchmark_gpt2.py", line 258, in <module>
    main()
  File "benchmark_gpt2.py", line 152, in main
    model.config.num_attention_heads, model.config.hidden_size)
  File "/home/jupyter/onnxruntime/onnxruntime/python/tools/transformers/gpt2_helper.py", line 252, in optimize_onnx
    m.save_model_to_file(optimized_model_path)
  File "/home/jupyter/onnxruntime/onnxruntime/python/tools/transformers/onnx_model.py", line 674, in save_model_to_file
    save_model(self.model, output_folder, format=None)
  File "/opt/conda/lib/python3.7/site-packages/onnx/__init__.py", line 186, in save_model
    s = _serialize(proto)
  File "/opt/conda/lib/python3.7/site-packages/onnx/__init__.py", line 67, in _serialize
    result = proto.SerializeToString()
ValueError: Message ONNX_REL_1_7.ModelProto exceeds maximum protobuf size of 2GB: 3276141735

tianleiwu (Contributor) commented

@klimentij, I can reproduce the problem. I will try modifying the save_model_to_file function and let you know when there is progress.

tianleiwu (Contributor) commented Aug 6, 2020

@klimentij, it seems that the following change could help export large models to ONNX:

def save_model_to_file(self, output_path):
    from pathlib import Path
    # Convert all tensors to a single external data file stored next to the model.
    external_data_helper.convert_model_to_external_data(self.model,
                                                        all_tensors_to_one_file=True,
                                                        location=Path(output_path).name + ".data")
    save_model(self.model, output_path)

The output model will consist of two files, like name.onnx and name.onnx.data. I'll send a pull request after more testing.

The model is very large, so the benchmark gets an out-of-memory exception when both the PyTorch model and the ONNX model are loaded on a V100 GPU (16 GB memory). After the ONNX model is exported, using only the ONNX model might avoid the problem.
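
For reference, once the two files exist, the exported model can be consumed without loading the PyTorch model at all (the path below is just an example; keep name.onnx and name.onnx.data in the same folder, since the external data file is resolved relative to the .onnx file):

    import onnxruntime

    # ONNX Runtime picks up the .onnx.data file next to the .onnx file automatically.
    session = onnxruntime.InferenceSession("./gpt2_xl_onnx_past/gpt2_xl_past_fp16.onnx")
    print([i.name for i in session.get_inputs()])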

klimentij (Author) commented

Thank you @tianleiwu! I managed to export it to .onnx and .onnx.data files using the edit you suggested.
