The paper will appear in CVPR 2018. An arXiv pre-print version is available.
The updated version has been accepted at IEEE Transactions on Pattern Analysis and Machine Intelligence. An arXiv pre-print version is available.
Please cite our paper if you are inspired by the idea.
@inproceedings{xialei2018crowd,
title={Leveraging Unlabeled Data for Crowd Counting by Learning to Rank},
author={Liu, Xialei and van de Weijer, Joost and Bagdanov, Andrew D},
booktitle={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2018},
url = {https://github.com/xialeiliu/CrowdCountingCVPR18}
}
and
@ARTICLE{8642842,
author={X. {Liu} and J. {Van De Weijer} and A. D. {Bagdanov}},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Exploiting Unlabeled Data in CNNs by Self-supervised Learning to Rank},
year={2019},
pages={1-1},
doi={10.1109/TPAMI.2019.2899857},
ISSN={0162-8828}
}
Xialei Liu, Joost van de Weijer and Andrew D. Bagdanov
Computer Vision Center, Barcelona, Spain
Media Integration and Communication Center, University of Florence, Florence, Italy
We propose a novel crowd counting approach that leverages abundantly available unlabeled crowd imagery in a learning-to-rank framework. To induce a ranking of cropped images, we use the observation that any sub-image of a crowded scene image is guaranteed to contain the same number or fewer persons than the super-image. This allows us to address the problem of limited size of existing datasets for crowd counting. We collect two crowd scene datasets from Google using keyword searches and query-by-example image retrieval, respectively. We demonstrate how to efficiently learn from these unlabeled datasets by incorporating learning-to-rank in a multi-task network which simultaneously ranks images and estimates crowd density maps. Experiments on two of the most challenging crowd counting datasets show that our approach obtains state-of-the-art results.
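The sub-image observation can be illustrated with a minimal sketch that produces a sequence of nested crop boxes, largest first. This is only an illustration, not necessarily the cropping procedure used in this repository: the function names, the choice of centre region, and the parameters `k` and `scale` are assumptions.

```python
import random


def nested_crops(width, height, k=5, scale=0.75, rng=None):
    """Return k square boxes (left, top, right, bottom), largest first.

    All boxes share one centre and shrink by `scale`, so each box lies inside
    the previous one and cannot contain more people than it.
    """
    rng = rng or random.Random()
    # Pick a centre in the middle half of the image (illustrative assumption).
    cx = rng.randint(width // 4, width - width // 4)
    cy = rng.randint(height // 4, height - height // 4)
    # Largest half-size of a square centred at (cx, cy) that fits in the image.
    half = min(cx, cy, width - cx, height - cy)
    if half < 1:
        raise ValueError("image is too small to crop")
    boxes = []
    for _ in range(k):
        boxes.append((cx - half, cy - half, cx + half, cy + half))
        # Shrink, but never below one pixel; the new half never exceeds the old.
        half = max(1, int(half * scale))
    return boxes


def contains(outer, inner):
    """True if box `inner` lies entirely within box `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])
```

Each consecutive pair of boxes then gives a ranked pair of images without any manual annotation.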
The main idea of our approach is to leverage abundantly available unlabeled crowd imagery in a learning-to-rank framework, which lets us address the problem of limited crowd counting dataset size.
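As a rough sketch of how a ranking constraint on unlabeled crops can be turned into a training signal, the snippet below computes a standard pairwise ranking hinge loss on predicted counts. The function names and the `margin` parameter are illustrative assumptions; the actual loss layer used in this repository may differ.

```python
def ranking_hinge_loss(count_outer, count_inner, margin=0.0):
    """Loss for one ranked pair.

    Zero when the inner crop's predicted count is at most the outer crop's
    (minus the margin); positive when the ranking is violated.
    """
    return max(0.0, count_inner - count_outer + margin)


def sequence_ranking_loss(counts, margin=0.0):
    """Sum the hinge loss over all pairs of a nested-crop sequence.

    `counts` holds predicted counts ordered from the largest crop to the
    smallest, so counts[i] should be >= counts[j] whenever i < j.
    """
    total = 0.0
    for i in range(len(counts)):
        for j in range(i + 1, len(counts)):
            total += ranking_hinge_loss(counts[i], counts[j], margin)
    return total
```

For example, predictions of `[10.0, 12.0, 5.0]` are penalised only for the first pair, where the smaller crop is predicted to hold more people than the crop that contains it.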
All training and testing are done in the Caffe framework.
- Requirements for caffe and pycaffe (see: Caffe installation instructions). Caffe must be built with support for Python layers!
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
- Download the pre-trained VGG-16 ImageNet model for finetuning.
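One possible way to confirm that the Caffe build described above supports Python layers is to look for the `Python` layer type among the registered layers. This sketch assumes your pycaffe version exposes `caffe.layer_type_list()`; the helper names are hypothetical.

```python
def has_python_layer(layer_types):
    """True if the registered layer type names include 'Python'."""
    return "Python" in layer_types


def check_caffe_build():
    """Return True/False for Python-layer support, or None without pycaffe."""
    try:
        import caffe  # pycaffe must be on PYTHONPATH
    except ImportError:
        return None
    # Assumption: this pycaffe build provides layer_type_list().
    return has_python_layer(list(caffe.layer_type_list()))
```

If `check_caffe_build()` returns `False`, rebuilding Caffe with `WITH_PYTHON_LAYER := 1` uncommented in `Makefile.config` is the likely fix.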
The pre-trained models are available to download.
We use the code from here to download and prepare the datasets, generate the density maps and evaluate the models.
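For orientation, the sketch below shows how a count is read off a density map (by summing it) and how the two error measures commonly reported in crowd counting, mean absolute error and root mean squared error, can be computed. It is a plain-Python illustration with hypothetical function names, not the evaluation code referenced above.

```python
import math


def count_from_density(density_map):
    """Estimated count is the sum over all pixels of a 2-D density map."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(predicted, ground_truth):
    """Average absolute difference between predicted and true counts."""
    return sum(abs(p - g) for p, g in zip(predicted, ground_truth)) / len(ground_truth)


def root_mean_squared_error(predicted, ground_truth):
    """Square root of the average squared count difference."""
    squared = sum((p - g) ** 2 for p, g in zip(predicted, ground_truth))
    return math.sqrt(squared / len(ground_truth))
```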