
Missing preprocessor_config.json file after training segformer model #29790

Closed
1 of 4 tasks
evanwong1020 opened this issue Mar 21, 2024 · 3 comments · Fixed by #29896

Comments


evanwong1020 commented Mar 21, 2024

System Info

  • transformers version: 4.37.2
  • Platform: Windows-10-10.0.19045-SP0
  • Python version: 3.11.5
  • Huggingface_hub version: 0.20.3
  • Safetensors version: 0.4.2
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): not installed (NA)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Follow the Image Classification example in the HuggingFace docs: https://huggingface.co/docs/transformers/tasks/semantic_segmentation

  2. Upon reaching the code segment shown in the tutorial (screenshot not reproduced here), you should end up with an error that says:

     OSError: segformer-b0-scene-parse-150 does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/segformer-b0-scene-parse-150/None' for available files.
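Based on the error message, the failing step is presumably the inference cell that reloads the trained model with pipeline(). The exact code is only in the screenshot above, so the following is a hypothetical reconstruction; the folder name is taken from the error message:

```python
from transformers import pipeline

# Hypothetical reconstruction of the failing step; the folder name comes from
# the error message above. The call fails because the training output folder
# lacks a preprocessor_config.json.
try:
    segmenter = pipeline("image-segmentation", model="segformer-b0-scene-parse-150")
except Exception as err:
    print(type(err).__name__, err)
```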

Expected behavior

I'd assume a preprocessor_config.json file should be generated in the training process.

@NielsRogge
Contributor

Hi, thanks for your interest in SegFormer.

This happens because the image processor only gets saved if you provide it to the Trainer class, so that it is written alongside the model weights to the Trainer's output_dir. At the moment, one needs to do Trainer(..., tokenizer=image_processor) when instantiating the trainer; we'll update this, as the Trainer was initially designed only for text-only models.

The pipeline expects a path to a folder containing both the modeling files (model weights, a config.json) and the image preprocessor file (preprocessor_config.json).
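For context, a minimal sketch of where preprocessor_config.json comes from: it is written by the image processor's save_pretrained, so saving the processor into the training output folder by hand also makes the folder loadable once the model files are there. The folder name below is taken from the error above; in the tutorial the processor would come from from_pretrained, a default-constructed one is used here only to keep the sketch self-contained:

```python
from transformers import SegformerImageProcessor

# Sketch: save_pretrained is what writes preprocessor_config.json. A
# default-constructed processor is used for self-containedness; in the
# tutorial it would be SegformerImageProcessor.from_pretrained(...).
image_processor = SegformerImageProcessor()
image_processor.save_pretrained("segformer-b0-scene-parse-150")
# The folder now contains preprocessor_config.json next to the model files,
# which is what pipeline() looks for.
```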

@evanwong1020
Author

When passing the tokenizer=image_processor parameter into the trainer, I get an attribute error:

AttributeError: 'SegformerImageProcessor' object has no attribute 'pad'

@NielsRogge
Contributor

Ok, thanks for reporting. I noticed that currently, when you pass tokenizer=image_processor, DataCollatorWithPadding is used, which calls pad — a method image processors don't have. Hence a current workaround is to do the following:

from transformers import default_data_collator

trainer = Trainer(
    ...,
    tokenizer=image_processor,
    data_collator=default_data_collator,
)

However, this should be fixed properly:

  • the "tokenizer" argument would need to be extended, or an "image_processor" argument would need to be added to the Trainer (to allow passing something other than a tokenizer);
  • the default data collator must be used in case an image processor is passed.
