
Bug using revision param in AutoModelForCausalLM.from_pretrained #23745

Closed
Forbu opened this issue May 24, 2023 · 7 comments · Fixed by #23799

Comments

@Forbu

Forbu commented May 24, 2023

System Info

2023-05-24 23:09:53.575434: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
WARNING:tensorflow:From /usr/local/lib/python3.10/dist-packages/transformers/commands/env.py:63: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.config.list_physical_devices('GPU') instead.
2023-05-24 23:10:05.261610: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

  • transformers version: 4.29.2
  • Platform: Linux-5.15.107+-x86_64-with-glibc2.31
  • Python version: 3.10.11
  • Huggingface_hub version: 0.14.1
  • Safetensors version: not installed
  • PyTorch version (GPU?): 2.0.1+cu118 (True)
  • Tensorflow version (GPU?): 2.12.0 (True)
  • Flax version (CPU?/GPU?/TPU?): 0.6.9 (gpu)
  • Jax version: 0.4.10
  • JaxLib version: 0.4.10
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@ArthurZucker
@younesbelkada

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I was trying to use the shiny new MPT model from the Hugging Face Hub from a specific revision:

from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
import torch
import transformers
import accelerate

model_name = 'mosaicml/mpt-7b'

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    revision="refs/pr/23",
    device_map="auto",
)

But I stumbled on this error after running the above code:
ValueError: MPTForCausalLM does not support device_map='auto' yet.

The "auto" was indeed not supported in the main branch but we add a correction in the PR branch (so the argument revision="refs/pr/23")
I did some investigation and the model was indeed loading the main .py files :

Downloading (…)main/modeling_mpt.py: 100% 17.4k/17.4k [00:00<00:00, 1.12MB/s]
Downloading (…)in/param_init_fns.py: 100% 12.6k/12.6k [00:00<00:00, 971kB/s]
Downloading (…)resolve/main/norm.py: 100% 2.56k/2.56k [00:00<00:00, 131kB/s]
A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- norm.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
A new version of the following files was downloaded from https://huggingface.co/mosaicml/mpt-7b:
- param_init_fns.py
- norm.py

You can see the main/ files here. I manually checked the modeling_mpt.py file; it didn't have the PR changes.

So I tried to find where the bug was inside the transformers package (my first time looking at the code), and I am a bit surprised!
Basically, the code rewrites the config values after reading them: it adds the repo id information to the auto_map (in add_model_info_to_auto_map, in utils/generic.py in the transformers package), which seems normal:

"auto_map": {
    "AutoConfig": "mosaicml/mpt-7b--configuration_mpt.MPTConfig",
    "AutoModelForCausalLM": "mosaicml/mpt-7b--modeling_mpt.MPTForCausalLM"
  }

Notably, it adds the "--" separator string.
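
For illustration, here is a minimal sketch of what that rewriting amounts to (a simplified reconstruction of add_model_info_to_auto_map, not the exact transformers source):

def add_model_info_to_auto_map(auto_map, repo_id):
    # Prefix each class reference with the repo id, separated by "--",
    # so e.g. "modeling_mpt.MPTForCausalLM" becomes
    # "mosaicml/mpt-7b--modeling_mpt.MPTForCausalLM".
    for key, value in auto_map.items():
        if value is not None and "--" not in value:
            auto_map[key] = f"{repo_id}--{value}"
    return auto_map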
Then, in get_class_from_dynamic_module (in dynamic_module_utils.py), there is:

if "--" in class_reference:
        repo_id, class_reference = class_reference.split("--")
        # Invalidate revision since it's not relevant for this repo
        revision = "main"

So the revision becomes "main", and from there we are done.
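
To make the consequence concrete, here is a tiny illustrative trace of that branch with the values from this issue:

class_reference = "mosaicml/mpt-7b--modeling_mpt.MPTForCausalLM"
revision = "refs/pr/23"  # what was passed to from_pretrained

if "--" in class_reference:
    repo_id, class_reference = class_reference.split("--")
    revision = "main"  # the requested revision is silently discarded

print(repo_id, class_reference, revision)
# mosaicml/mpt-7b modeling_mpt.MPTForCausalLM main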
I suppose that if I open a PR removing the revision override, some people will not be happy?

Expected behavior

The expected behavior is to load the files from the PR branch, not from main/.

@ArthurZucker
Collaborator

cc @sgugger, who is more familiar with this; I won't have bandwidth to dive into this now.

@sgugger
Collaborator

sgugger commented May 25, 2023

The revision argument is supported for weights but not for the code at the moment. Support will be added soon, but in the meantime you can download the revision for this repo and then use from_pretrained with a local folder and it will work.
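
For example, here is a sketch of this workaround using huggingface_hub's snapshot_download (one way to download the revision locally; any equivalent method works):

from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM

# Fetch the PR revision (weights and code files) into a local folder.
local_dir = snapshot_download("mosaicml/mpt-7b", revision="refs/pr/23")

# Load from the local folder, so no remote code resolution happens
# and the PR's modeling files are the ones that run.
model = AutoModelForCausalLM.from_pretrained(
    local_dir,
    trust_remote_code=True,
    device_map="auto",
)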

@Forbu
Author

Forbu commented May 25, 2023

Nice, thank you @sgugger!

@samhavens

@sgugger isn't this a security issue? When using trust_remote_code=True, there is a warning to explicitly pass a revision to make sure that you are running code you have looked at. But IIUC, if you pass revision="commit SHA I have verified", it will actually load whatever code is on the main branch?
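
For context, a sketch of the guarantee that revision pinning is supposed to give (the SHA below is a placeholder, not a real commit):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    trust_remote_code=True,
    revision="0123abcd",  # placeholder for a commit SHA the user has audited
)
# With the bug described above, the modeling code is still fetched from main,
# so auditing the pinned revision does not cover the code that actually runs.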

@sgugger
Collaborator

sgugger commented May 25, 2023

@samhavens This comes from the recent change we made to avoid duplicating the code files in all repos (now there is one source of truth). As I said, we're working on a fix; it should come tomorrow or early next week.

@sgugger
Collaborator

sgugger commented May 26, 2023

If you want to give it a try, the PR linked above should fix your issue.

@samhavens

Thanks @sgugger!
