Inference error on trained checkpoints #17
Comments
OK, some improvement: I changed line 29 of utils/saving_utils.py to: And got the network to run. However, I am now getting only entirely green boxes, with no correct masks. Before itr_00055000_u2net.pth, the outputs largely look like Canny edge detection drawn in red lines, but by itr_00060000_u2net.pth they are all green boxes. Any suggestions based on your training process? |
hello |
Hi @richard-schwab @yaomilwaukee, for a pretrained model download with an inference script for alpha image generation and cloth segmentation, please check this. I have cleaned up and modified the scripts to run on the latest PyTorch and other libraries, and it is very simple to use as well. You can also check the live demo on Hugging Face: https://huggingface.co/spaces/wildoctopus/cloth-segmentation If you like this, please drop a star on my repo. Thanks :) |
Thank you very much! You saved my life!) |
I have exactly the same issue.
Model can be loaded only with strict=False: What could be the reason for it? |
Hi @Antonytm, the reason is that the number of layers/parameters in the cloth segmentation model differs from that of the pretrained model used to train it. So, to initialize weights for the matching layers only, the models are loaded with "strict=False". |
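To see which layers "strict=False" is actually skipping, it can help to compare the key names on both sides. The snippet below is only a minimal sketch in plain Python: the helper name `diff_state_dict_keys` and the example keys are hypothetical, and in practice the two lists would come from `model.state_dict().keys()` and the loaded checkpoint's keys. PyTorch's `load_state_dict(..., strict=False)` also returns an object with `missing_keys` and `unexpected_keys`, which carries the same information.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key names, both sorted.

    missing    -> keys the model expects but the checkpoint lacks
    unexpected -> keys the checkpoint has but the model does not
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, for illustration only.
model_keys = ["stage1.conv.weight", "side1.weight"]
checkpoint_keys = ["stage1.conv.weight", "outconv.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If both lists are empty, the key names match and a strict load should not fail on names; if nearly every key shows up in both lists, the names themselves were probably altered on the way in (for example by a prefix being added or stripped).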
Sorry, I am a newbie in ML. I took the following steps:
Is anything missing? Is anything wrong? P.S. Your pre-trained model from Hugging Face works fine. However, I want to try this approach for segmenting different objects (not clothes). First of all, though, I want to repeat all the steps so that I can build the original model. |
I printed my model and the pre-trained one; they are exactly the same:
|
Hi @Antonytm, you are following the correct steps. Can you please check your base options settings? Run the model surgery and then run train. It should work. No other settings need to be modified to run successfully in the default case. Let me know if you are still facing any issues. When you run it on a custom dataset and problem, changes will be needed based on the output channels, so make those changes in the Model_surgery file as well. |
@wildoctopus |
hello Alok , |
|
Thank you, I will try it! (in a week or so) |
@Kaustubh-cpu |
Hi,
I ran the training script following your instructions, which worked very well, thank you. However, I'm getting an error when attempting to use my newly trained weights.
I changed this line in infer.py from:
checkpoint_path = os.path.join("trained_checkpoint", "cloth_segm_u2net_latest.pth")
to
checkpoint_path = "results/training_cloth_segm_u2net_exp1/checkpoints/itr_00100000_u2net.pth"
And I can see the file sizes of the checkpoints aren't the same:
original:
$ ls -al trained_checkpoint/cloth_segm_u2net_latest.pth
-rw-r--r-- 1 user user 176625341 Mar 12 21:23 trained_checkpoint/cloth_segm_u2net_latest.pth
newly trained:
$ ls -al results/training_cloth_segm_u2net_exp1/checkpoints/itr_00100000_u2net.pth
-rw-r--r-- 1 user user 176607205 Mar 14 09:09 results/training_cloth_segm_u2net_exp1/checkpoints/itr_00100000_u2net.pth
The error I'm getting seems to drop the stageX names from the layers in the state dict. Any ideas?
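One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the loading code removes a fixed seven-character prefix from every key (the usual way to drop the `module.` prefix that multi-GPU wrappers add), then a checkpoint saved without that prefix would lose `stage1.`, `stage2.`, and so on instead, since those are also seven characters long. Below is a minimal sketch of stripping the prefix only when it is present. The helper name `strip_module_prefix` and the example keys are illustrative, and plain integers stand in for tensors.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed only where it occurs."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Illustrative keys; the values would be tensors in a real checkpoint.
single_gpu = {"stage1.rebnconvin.conv_s1.weight": 0}
multi_gpu = {"module.stage1.rebnconvin.conv_s1.weight": 0}

print(list(strip_module_prefix(single_gpu)))  # key is left unchanged
print(list(strip_module_prefix(multi_gpu)))   # "module." is removed

# Unconditional slicing, by contrast, eats the stage name:
print("stage1.rebnconvin.conv_s1.weight"[7:])  # rebnconvin.conv_s1.weight
```

The cleaned dictionary could then be passed to `model.load_state_dict(...)` as usual.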