Download the Mipha-3B pre-trained weights (Mipha-3B) and set --model_name_or_path in the training scripts to their location.
Download the SigLIP-SO pre-trained weights (SigLIP-SO) and set --vision_tower in the training scripts to their location.
Download the Mask2Former Swin-B weights (Mask2Former) and set --vision_tower_mask in the training scripts to their location.
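Before launching training, it can help to confirm that the three weight locations actually exist. The snippet below is a minimal sketch, not part of the repository: the `check_weights` helper, the variable names, and the `./pretrained/...` paths are all hypothetical placeholders, so substitute the locations where you saved the downloads.

```shell
#!/bin/sh
# Hypothetical local paths (assumptions, not from the repository);
# override them via environment variables or edit them here.
MODEL_PATH="${MODEL_PATH:-./pretrained/Mipha-3B}"
VISION_TOWER="${VISION_TOWER:-./pretrained/siglip-so}"
VISION_TOWER_MASK="${VISION_TOWER_MASK:-./pretrained/mask2former_swin_b}"

# Hypothetical helper: returns 0 if every given path exists,
# otherwise prints each missing path to stderr and returns 1.
check_weights() {
    status=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p" >&2
            status=1
        fi
    done
    return "$status"
}

if check_weights "$MODEL_PATH" "$VISION_TOWER" "$VISION_TOWER_MASK"; then
    echo "all weights found; use these paths for --model_name_or_path, --vision_tower, --vision_tower_mask"
else
    echo "fix the paths above before running scripts/seg/train.sh" >&2
fi
```

The check only tests that each path exists; it does not validate the contents of the weights.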
sh scripts/seg/train.sh
After training, merge the LoRA weights from the checkpoint at output/model/checkpoint-100000 and save the final InstructSeg model weights.
sh scripts/seg/merge_lora_weights.sh