Skip to content

Multiple Object Tracking framework based on transformers for the MOT Challenge

License

Notifications You must be signed in to change notification settings

amitgalor18/STC_Tracker

Repository files navigation

Strong-TransCenter: Improved Multi-Object Tracking based on Transformers with Dense Representations

PWC

[Paper]

Results on MOT17 and MOT20 compared to transformer-based trackers:

(as of October 2022)

Algorithm flowchart:

Bibtex

If you find this code useful, please star the project and consider citing:

@misc{https://doi.org/10.48550/arxiv.2210.13570,
  doi = {10.48550/ARXIV.2210.13570},
  
  url = {https://arxiv.org/abs/2210.13570},
  
  author = {Galor, Amit and Orfaig, Roy and Bobrovsky, Ben-Zion},
  
  keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, FOS: Computer and information sciences},
  
  title = {Strong-TransCenter: Improved Multi-Object Tracking based on Transformers with Dense Representations},
  
  publisher = {arXiv},
  
  year = {2022},
  
  copyright = {arXiv.org perpetual, non-exclusive license}
}

Results examples:

MOT17-03-SDP-trimmed.mp4
MOT20-06-trimmed.mp4

Environment Preparation

  1. we use anaconda to simplify the package installations, you can download anaconda (4.10.3) here: https://www.anaconda.com/products/individual
  2. you can create your conda env by doing
conda env create -n <env_name> -f eTransCenter.yml

Alternatively, you can use the added 'requirements.txt':

pip install -r requirements.txt

*Make sure to install the correct torch and torchvision versions matching your CUDA version from the pytorch website: [https://pytorch.org/get-started/previous-versions/]

  1. STC uses Deformable transformer from Deformable DETR. Therefore, we need to install deformable attention modules:
cd ./to_install/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py

4.for the up-scale and merge module in TransCenter, we use deformable convolution module, you can install it with:

cd ./to_install/DCNv2
./make.sh         # build
python testcpu.py    # run examples and gradient check on cpu
python testcuda.py   # run examples and gradient check on gpu

See also known issues from https://github.com/CharlesShang/DCNv2. If you have issues related to cuda of the third-party modules, please try to recompile them in the GPU that you use for training and testing. The dependencies are compatible with Pytorch 1.6, cuda 10.1.

If you install the DCNv2 and Deformable Transformer packages from other implementations, please replace the corresponding files with dcn_v2.py and ms_deform_attn.py in ./toinstall for allowing half-precision operations with the customized packages.

Data Preparation

ms coco: we use only the person category for pretraining TransCenter. The code for filtering is provided in ./data/coco_person.py.

@inproceedings{lin2014microsoft,
  title={Microsoft coco: Common objects in context},
  author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
  booktitle={European conference on computer vision},
  pages={740--755},
  year={2014},
  organization={Springer}
}

CrowdHuman: CrowdHuman labels are converted to coco format, the conversion can be done through ./data/convert_crowdhuman_to_coco.py.

@article{shao2018crowdhuman,
    title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
    author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
    journal={arXiv preprint arXiv:1805.00123},
    year={2018}
  }

MOT17: MOT17 labels are converted to coco format, the conversion can be done through ./data/convert_mot_to_coco.py.

@article{milan2016mot16,
  title={MOT16: A benchmark for multi-object tracking},
  author={Milan, Anton and Leal-Taix{\'e}, Laura and Reid, Ian and Roth, Stefan and Schindler, Konrad},
  journal={arXiv preprint arXiv:1603.00831},
  year={2016}
}

MOT20: MOT20 labels are converted to coco format, the conversion can be done through ./data/convert_mot20_to_coco.py.

@article{dendorfer2020mot20,
  title={Mot20: A benchmark for multi object tracking in crowded scenes},
  author={Dendorfer, Patrick and Rezatofighi, Hamid and Milan, Anton and Shi, Javen and Cremers, Daniel and Reid, Ian and Roth, Stefan and Schindler, Konrad and Leal-Taix{\'e}, Laura},
  journal={arXiv preprint arXiv:2003.09003},
  year={2020}
}

We also provide the filtered/converted labels:

MOT17 coco format labels: please put the annotations and annotations_onlySDP folders inside MOT17 to your MOT17 dataset root folder.

MOT20 coco format labels: please put the annotations folder inside MOT20 to your MOT20 dataset root folder.

Model Zoo

For main TransCenterV2 transformer model:

PVTv2 pretrained: pretrained model from deformable-DETR.

MOT17_trained_with_CH: model trained on CrowdHuman and MOT17 trainset.

MOT20_trained_with_CH: model trained on CrowdHuman and MOT20 trainset.

For embedding network fastReID model:

MOT17_SBS_S50: model trained on mot17 train set

MOT20_SBS_S50: model trained on mot20 train set

Please put all the pretrained models to ./model_zoo, and the PVTv2 model in .model_zoo/pvtv2_backbone

For Training, see the original TransCenterV2 for instructions: TransCenter

Tracking

###Using Private detections:

  • MOT17:
cd STC_Tracker
python ./tracking/mot17_private_test.py --data_dir=YourPathTo/MOT17/
  • MOT20:
cd STC_Tracker
python ./tracking/mot20_private_test.py --data_dir=YourPathTo/MOT20/

###Using Public detections:

  • MOT17:
cd STC_Tracker
python ./tracking/mot17_pub_test.py --data_dir=YourPathTo/MOT17/
  • MOT20:
cd STC_Tracker
python ./tracking/mot20_pub_test.py --data_dir=YourPathTo/MOT20/

You may also run the inference on a single file from the dataset, e.g.:

python ./tracking/mot17_private.py --data_dir YourPathTo/MOT17/ --output_dir mot17_results_dir_name --custom MOT17-02-SDP

MOTChallenge Results

MOT17 public detections:

Tracker HOTA MOTA MOTP IDF1 FP FN IDSW
TransCenterV2 56.7% 75.9% 81.2% 66.0% 30,220 100,995 4,622
STC_Tracker 59.5% 75.8% 81.3% 70.8% 33,833 99,074 3,787

MOT20 public detections:

Tracker HOTA MOTA MOTP IDF1 FP FN IDSW
TransCenterV2 50.1% 72.8% 81.0% 57.6% 28,012 110,274 2,620
STC_Tracker 56.1% 73.0% 80.9% 67.6% 30,880 106,876 2,172

MOT17 private detections:

Tracker HOTA MOTA MOTP IDF1 FP FN IDSW
TransCenterV2 56.7% 76.2% 81.1% 65.5% 40,107 88,827 5,397
STC_Tracker 59.8% 75.8% 81.1% 70.9% 44,952 87,039 4,533

MOT20 private detections:

Tracker HOTA MOTA MOTP IDF1 FP FN IDSW
TransCenterV2 50.2% 72.9% 81.0% 57.8% 28,588 108,950 2,620
STC_Tracker 56.3% 73.0% 81.0% 67.5% 30,215 107,701 2,011

Note:

  • Results from the original TransCenterV2 code were submitted independently for comparison on the same environment, using the same model version, pretrained on CrowdHuman
  • The results can be slightly different depending on the running environment.
  • We might keep updating the results in the near future.

Acknowledgement

The code for STC Tracker is modified and network pre-trained weights are obtained from the following repositories:

  1. The main framework code is derived from TransCenter
  2. The PVTv2 backbone pretrained models from PVTv2.
  3. The data format conversion code is modified from CenterTrack.
  4. The Kalman filter implementation is modified from StrongSORT
  5. The embedding network code is modified from fastReID
  6. The embedding network trained network and association implementation is from BoT-SORT

TransCenter, CenterTrack, Deformable-DETR, Tracktor, BoT-SORT, FastReID, StrongSORT.

@article{xu2021transcenter,
      title={TransCenter: Transformers with Dense Representations for Multiple-Object Tracking}, 
      author={Yihong Xu and Yutong Ban and Guillaume Delorme and Chuang Gan and Daniela Rus and Xavier Alameda-Pineda},
      year={2021},
      eprint={2103.15145},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@article{Aharon2022,
   author = {Nir Aharon and Roy Orfaig and Ben-Zion Bobrovsky},
   journal = {arXiv},
   month = {6},
   title = {BoT-SORT: Robust Associations Multi-Pedestrian Tracking},
   year = {2022},
}
@article{FastReID,
   author = {Lingxiao He and Xingyu Liao and Wu Liu and Xinchen Liu and Peng Cheng and Tao Mei},
   journal = {arXiv},
   month = {6},
   title = {FastReID: A Pytorch Toolbox for General Instance Re-identification},
   year = {2020},
}
@article{StrongSORT,
   author = {Yunhao Du and Yang Song and Bo Yang and Yanyun Zhao},
   journal = {arXiv},
   month = {2},
   title = {StrongSORT: Make DeepSORT Great Again},
   year = {2022},
}

@article{zhou2020tracking,
  title={Tracking Objects as Points},
  author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  journal={ECCV},
  year={2020}
}

@InProceedings{tracktor_2019_ICCV,
author = {Bergmann, Philipp and Meinhardt, Tim and Leal{-}Taix{\'{e}}, Laura},
title = {Tracking Without Bells and Whistles},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}}

@article{zhu2020deformable,
  title={Deformable DETR: Deformable Transformers for End-to-End Object Detection},
  author={Zhu, Xizhou and Su, Weijie and Lu, Lewei and Li, Bin and Wang, Xiaogang and Dai, Jifeng},
  journal={arXiv preprint arXiv:2010.04159},
  year={2020}
}

@article{zhang2021bytetrack,
  title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
  author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
  journal={arXiv preprint arXiv:2110.06864},
  year={2021}
}

@article{wang2021pvtv2,
  title={Pvtv2: Improved baselines with pyramid vision transformer},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Fan, Deng-Ping and Song, Kaitao and Liang, Ding and Lu, Tong and Luo, Ping and Shao, Ling},
  journal={Computational Visual Media},
  volume={8},
  number={3},
  pages={1--10},
  year={2022},
  publisher={Springer}
}

Several modules are from:

MOT Metrics in Python: py-motmetrics

Soft-NMS: Soft-NMS

DETR: DETR

DCNv2: DCNv2

PVTv2: PVTv2

ByteTrack: ByteTrack

About

Multiple Object Tracking framework based on transformers for the MOT Challenge

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published