Training InstructSeg

Prepare pre-trained model weights

MLLM weights

Download the Mipha-3B pre-trained weights, then set --model_name_or_path in the training scripts to the location of those weights.

CLIP Encoder weights

Download the SigLIP-SO pre-trained weights, then set --vision_tower in the training scripts to the location of those weights.

Visual Encoder and Segmentation Decoder weights

Download the Mask2Former Swin-B weights, then set --vision_tower_mask in the training scripts to the location of those weights.
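Before launching training, it can help to confirm that all three weight locations exist. The following is a minimal sketch, not part of the repository: the helper count_missing, the variable names, and the default paths under pretrained/ are hypothetical placeholders, so substitute the paths you actually pass to --model_name_or_path, --vision_tower, and --vision_tower_mask.

```shell
# Hypothetical local paths (assumptions); replace with your own download locations.
MLLM_PATH="${MLLM_PATH:-pretrained/Mipha-3B}"
VISION_TOWER="${VISION_TOWER:-pretrained/siglip-so}"
VISION_TOWER_MASK="${VISION_TOWER_MASK:-pretrained/mask2former-swin-b}"

# Print how many of the given paths do not exist; report each one on stderr.
count_missing() {
  missing=0
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      echo "missing: $p" >&2
      missing=$((missing + 1))
    fi
  done
  echo "$missing"
}

echo "missing weight paths: $(count_missing "$MLLM_PATH" "$VISION_TOWER" "$VISION_TOWER_MASK")"
```

A count of 0 means every path was found; any other value lists the paths to fix before editing the training scripts.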

Now train!

sh scripts/seg/train.sh

Merge LoRA weights

After training finishes, merge the LoRA weights saved in output/model/checkpoint-100000 into the base model and save the final InstructSeg model weights.

sh scripts/seg/merge_lora_weights.sh
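If training stopped at a different step, the checkpoint directory will not be checkpoint-100000. The sketch below finds the highest-numbered checkpoint under an output directory, assuming checkpoints are saved as checkpoint-<step> subdirectories, as the path above suggests. The helper latest_checkpoint is hypothetical, and how the checkpoint path is passed to merge_lora_weights.sh is not shown here, so check that script before relying on this.

```shell
# Print the checkpoint-<step> subdirectory of $1 with the largest numeric step.
# Prints an empty line if no such subdirectory exists.
latest_checkpoint() {
  best=""
  best_step=-1
  for d in "$1"/checkpoint-*; do
    [ -d "$d" ] || continue            # skips the unexpanded glob when nothing matches
    step=${d##*checkpoint-}            # keep only the text after the last "checkpoint-"
    case "$step" in
      ''|*[!0-9]*) continue ;;         # ignore names whose suffix is not a plain number
    esac
    if [ "$step" -gt "$best_step" ]; then
      best_step=$step
      best=$d
    fi
  done
  echo "$best"
}

# output/model is the directory named in the merge step above.
latest_checkpoint output/model
```

Comparing steps numerically rather than alphabetically matters here, since a plain sort would rank checkpoint-20000 above checkpoint-100000.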