Fix test fetcher (doctest) + Idefics2's doc example (#30274)
In the Idefics2 modeling file (`modeling_idefics2.py`), the doc example in `forward` is fixed: the multi-line statements previously closed with `>>>` instead of `...` (invalid doctest syntax), the model is now loaded with `device_map="auto"` instead of a hand-rolled `DEVICE` constant, `max_new_tokens` is lowered from 500 to 20 so the doctest runs quickly, and an expected output is added so the doctest actually verifies the generation.

````diff
@@ -1786,17 +1786,13 @@ def forward(
         >>> from transformers import AutoProcessor, AutoModelForVision2Seq
         >>> from transformers.image_utils import load_image

-        >>> DEVICE = "cuda:0"
-
         >>> # Note that passing the image urls (instead of the actual pil images) to the processor is also possible
         >>> image1 = load_image("https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg")
         >>> image2 = load_image("https://cdn.britannica.com/59/94459-050-DBA42467/Skyline-Chicago.jpg")
         >>> image3 = load_image("https://cdn.britannica.com/68/170868-050-8DDE8263/Golden-Gate-Bridge-San-Francisco.jpg")

         >>> processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b-base")
-        >>> model = AutoModelForVision2Seq.from_pretrained(
-        ...     "HuggingFaceM4/idefics2-8b-base",
-        >>> ).to(DEVICE)
+        >>> model = AutoModelForVision2Seq.from_pretrained("HuggingFaceM4/idefics2-8b-base", device_map="auto")

         >>> BAD_WORDS_IDS = processor.tokenizer(["<image>", "<fake_token_around_image>"], add_special_tokens=False).input_ids
         >>> EOS_WORDS_IDS = [processor.tokenizer.eos_token_id]
@@ -1805,15 +1801,16 @@ def forward(
         >>> prompts = [
         ...     "<image>In this image, we can see the city of New York, and more specifically the Statue of Liberty.<image>In this image,",
         ...     "In which city is that bridge located?<image>",
-        >>> ]
+        ... ]
         >>> images = [[image1, image2], [image3]]
-        >>> inputs = processor(text=prompts, padding=True, return_tensors="pt").to(DEVICE)
+        >>> inputs = processor(text=prompts, padding=True, return_tensors="pt").to("cuda")

         >>> # Generate
-        >>> generated_ids = model.generate(**inputs, bad_words_ids=BAD_WORDS_IDS, max_new_tokens=500)
+        >>> generated_ids = model.generate(**inputs, bad_words_ids=BAD_WORDS_IDS, max_new_tokens=20)
         >>> generated_texts = processor.batch_decode(generated_ids, skip_special_tokens=True)

         >>> print(generated_texts)
+        ['In this image, we can see the city of New York, and more specifically the Statue of Liberty. In this image, we can see the city of New York, and more specifically the Statue of Liberty.\n\n', 'In which city is that bridge located?\n\nThe bridge is located in the city of Pittsburgh, Pennsylvania.\n\n\nThe bridge is']
         ```"""

         output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
````

Inline review thread on the new expected output:

- "The results are not very good. But we didn't have this output check previously, so I don't know if we consider it bad."
- "I'm surprised by these generations - they also don't look like the outputs from when I was integrating the model 🤔"
- "Maybe you were using a different type of GPU. CI is using a T4."
- "Quite possibly - I was using"
- "I can check it tomorrow. But I will merge first 🙏"
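The `>>> ]` and `>>> ).to(DEVICE)` lines were the actual doctest breakage: a `>>>` prompt always starts a new statement, so continuation lines of a multi-line call must be prefixed with `...`, and the old closers were executed as standalone, syntactically invalid statements. A minimal standalone sketch (toy docstrings, not from the PR) showing how the stdlib parser splits the two forms:

```python
import doctest

# In doctest syntax a ">>> " line always starts a new example; continuation
# lines of a multi-line statement must start with "... ".
parser = doctest.DocTestParser()

good = """
>>> xs = [
...     1,
... ]
>>> len(xs)
1
"""

bad = """
>>> xs = [
...     1,
>>> ]
"""

# The well-formed snippet parses into two examples: the complete list
# literal and the len() call.
print([ex.source for ex in parser.get_examples(good)])
# ['xs = [\n    1,\n]\n', 'len(xs)\n']

# The malformed snippet splits into two broken examples: an unterminated
# "xs = [" statement and a bare "]", both of which raise SyntaxError when
# the doctest is actually executed.
print([ex.source for ex in parser.get_examples(bad)])
# ['xs = [\n    1,\n', ']\n']
```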
In `utils/tests_fetcher.py`, the doctest file filter in `get_all_doctest_files` is fixed:

```diff
@@ -507,7 +507,7 @@ def get_all_doctest_files() -> List[str]:
     # change to use "/" as path separator
     test_files_to_run = ["/".join(Path(x).parts) for x in test_files_to_run]
     # don't run doctest for files in `src/transformers/models/deprecated`
-    test_files_to_run = [x for x in test_files_to_run if "models/deprecated" not in test_files_to_run]
+    test_files_to_run = [x for x in test_files_to_run if "models/deprecated" not in x]

     # only include files in `src` or `docs/source/en/`
     test_files_to_run = [x for x in test_files_to_run if x.startswith(("src/", "docs/source/en/"))]
```

Inline review comment on the fixed line: "my bad"
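The bug here is a classic membership mix-up: the comprehension tested `"models/deprecated" not in test_files_to_run` (membership in the *list*) instead of `not in x` (substring of each path). Since that literal string is never an element of the list, the condition was always true and no deprecated file was ever filtered out. A standalone reproduction with hypothetical file paths (not the actual tests_fetcher code):

```python
# Hypothetical file paths for illustration only.
files = [
    "src/transformers/models/bert/modeling_bert.py",
    "src/transformers/models/deprecated/mctct/modeling_mctct.py",
    "docs/source/en/task_summary.md",
]

# Buggy: "models/deprecated" is compared against the whole list, and it is
# never an *element* of that list, so the test is True for every x and
# nothing gets filtered out.
buggy = [x for x in files if "models/deprecated" not in files]
assert buggy == files

# Fixed: substring test against each path keeps only non-deprecated files.
fixed = [x for x in files if "models/deprecated" not in x]
assert fixed == [
    "src/transformers/models/bert/modeling_bert.py",
    "docs/source/en/task_summary.md",
]
```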
Inline review comment on the `max_new_tokens` change: "it's very slow, so let's just use 20."
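For context on that comment: decoding is autoregressive, so each extra token costs one more forward pass and doctest wall time grows roughly linearly with `max_new_tokens` (500 tokens is about 25x the work of 20). A standalone sketch of how one might measure that, using a hypothetical tiny checkpoint rather than the Idefics2 model:

```python
import time
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any small causal LM works for the illustration; "sshleifer/tiny-gpt2" is a
# tiny test checkpoint. The point is the scaling, not the output quality.
tok = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")
model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")

inputs = tok("Hello", return_tensors="pt")
for n in (20, 500):
    start = time.perf_counter()
    model.generate(**inputs, max_new_tokens=n, do_sample=False)
    print(f"max_new_tokens={n}: {time.perf_counter() - start:.2f}s")
# Wall time grows roughly linearly with the number of generated tokens.
```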