PCB-RandNet: Rethinking Random Sampling for LiDAR Semantic Segmentation in Autonomous Driving Scene

Huixian Cheng, Xianfeng Han, Hang Jiang, Dehong He, Guoqiang Xiao
College of Computer and Information Science, Southwest University
[arXiv]


Updates

  • 2024-02-12 [😋] Our paper was accepted to ICRA 2024.

Abstract

The sparsity of LiDAR point clouds and their distance-dependent long-tailed distribution make Random Sampling (RS) poorly suited to this scenario. To alleviate this problem, we propose Polar Cylinder Balanced Random Sampling (PCB-RS) and a Sampling Consistency Loss (SCL), which optimize the point cloud distribution after down-sampling and improve segmentation performance across distance ranges, especially at long range.
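
To make the idea of balanced sampling over polar cells concrete, here is a minimal pure-Python sketch. It is not the implementation in this repository: the function name polar_balanced_sample, the grid and max_range parameters, the equal per-cell quota, and the rule for topping up from leftover points are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def polar_balanced_sample(points, num_samples, grid=(8, 8), max_range=50.0, seed=0):
    """Return indices of sampled points, balanced across polar (rho, theta) cells.

    Hypothetical helper for illustration; `points` is a list of (x, y, z) tuples.
    """
    if not points or num_samples <= 0:
        return []
    rng = random.Random(seed)
    n_rho, n_theta = grid

    # Assign every point to a cell of the polar grid in the bird's-eye view.
    cells = defaultdict(list)
    for idx, (x, y, _z) in enumerate(points):
        rho = math.hypot(x, y)
        theta = math.atan2(y, x) + math.pi  # shift to [0, 2*pi]
        r_bin = min(int(rho / max_range * n_rho), n_rho - 1)
        t_bin = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
        cells[(r_bin, t_bin)].append(idx)

    num_samples = min(num_samples, len(points))
    # Assumption: every non-empty cell gets the same sampling budget.
    quota = max(1, num_samples // len(cells))

    chosen, leftover = [], []
    for members in cells.values():
        rng.shuffle(members)
        chosen.extend(members[:quota])
        leftover.extend(members[quota:])

    if len(chosen) > num_samples:
        # More non-empty cells than samples: keep a random subset.
        rng.shuffle(chosen)
        chosen = chosen[:num_samples]
    else:
        # Sparse cells could not fill their quota: top up from the remaining points.
        rng.shuffle(leftover)
        chosen.extend(leftover[: num_samples - len(chosen)])
    return chosen
```

In this sketch a dense near-range cell contributes at most its quota in the first pass, while a sparse far-range cell is kept in full, which is the kind of rebalancing plain random sampling does not provide.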


Environment Setup

Install python packages

conda env create -f my_env.yaml
conda activate randla

Note:

  • Not all packages are necessary; use this environment file as a reference when installing.
  • Since we use the operator torch.take_along_dim, which is only available in PyTorch ≥ 1.9.0 (used here), please make sure the installed version meets this requirement.
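
The version requirement above can be checked before training. The helper below is a small sketch using only the standard library; the name meets_min_version is hypothetical, and it only compares the leading numeric part of a version string.

```python
import re


def meets_min_version(version, minimum=(1, 9, 0)):
    """Compare a version string such as '1.13.1+cu117' against a minimum tuple."""
    # Keep only the leading numeric components; local suffixes like '+cu117' are ignored.
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parsed = tuple(int(part) if part else 0 for part in match.groups())
    # Tuple comparison avoids the string-comparison trap where '1.10' < '1.9'.
    return parsed >= minimum
```

With PyTorch installed, calling meets_min_version(torch.__version__) or simply checking hasattr(torch, "take_along_dim") gives the same answer for this requirement.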

Compile C++ Wrappers

bash compile_op.sh

Prepare Data

Download SemanticKITTI and SemanticPOSS from their official websites, then preprocess the data:

python data_prepare_semantickitti.py
python data_prepare_semanticposs.py

Note:

  • Please change the data path to your own path.
  • Change the corresponding self.dataset_path in every .py file in dataset (such as here).

Training

  1. Training baseline with RS or PCB-RS
python train_SemanticKITTI.py <args> 

python train_SemanticPOSS.py <args>

Options:
--backbone           backbone to use: choices=['randla', 'baflac', 'baaf']
--checkpoint_path    path to a pretrained model (if any); otherwise training starts from scratch
--log_dir            name of the log directory
--max_epoch          maximum number of training epochs
--batch_size         training batch size; adjust to fully utilize the GPU(s)
--val_batch_size     batch size for validation
--num_workers        number of workers for I/O
--sampling           sampling method, RS or PCB-RS: choices=['random', 'polar']
--seed               random seed
--step               dataset subsampling: 0 means use all data, 4 means use 1/4 of the dataset (SemanticKITTI only)
--grid               resolution of the polar cylinder
  2. Training model with PCB-RS and SCL
python train_both_SemanticKITTI.py <args>

python train_both_SemanticPOSS.py <args>

Options: similar to the options listed above.

Test

python test_SemanticKITTI.py <args>

python test_SemanticPOSS.py <args>

Options:
--infer_type         all: infer all points in the specified sequence; sub: infer subsampled points in the specified sequence
--sampling           sampling method, RS or PCB-RS: choices=['random', 'polar']
--backbone           backbone to use: choices=['randla', 'baflac', 'baaf']
--checkpoint_path    required; path to the model to test
--test_id            id of the sequence to test
--result_dir         directory for the predictions
--step               dataset subsampling: 0 means use all data, 4 means use 1/4 of the dataset (SemanticKITTI only)
--grid               resolution of the polar cylinder

Note:

  • For the SemanticKITTI dataset, if you want to infer all data of sequence 08, please make sure you have more than 32 GB of free RAM. If your hardware does not meet this requirement, please refer to this issue.

  • Because we use torch_data.IterableDataset, num_workers must be set to 0 and multi-worker data loading cannot be used, so testing is very slow. PRs that offer a feasible speed-up are welcome. (This reference may be useful)
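
One commonly documented way to use several workers with an IterableDataset is to let each worker iterate over its own slice of the data, using the worker id and worker count that torch.utils.data.get_worker_info() exposes inside __iter__. The sketch below shows only the slicing arithmetic in plain Python; shard_bounds is a hypothetical helper, and whether the test dataset in this repository can be split into contiguous index ranges like this is an assumption.

```python
import math


def shard_bounds(start, end, worker_id, num_workers):
    """Return the [lo, hi) slice of range(start, end) owned by one worker."""
    per_worker = math.ceil((end - start) / num_workers)
    lo = min(start + worker_id * per_worker, end)
    hi = min(lo + per_worker, end)
    return lo, hi


# Inside a hypothetical IterableDataset.__iter__ this could be used as:
#     info = torch.utils.data.get_worker_info()
#     if info is None:            # single-process loading
#         lo, hi = 0, total
#     else:                       # one slice per worker
#         lo, hi = shard_bounds(0, total, info.id, info.num_workers)
```

The slices are disjoint and cover the whole range, so no sample is processed twice; predictions may arrive in a different order than with num_workers set to 0, which any result-writing code would need to tolerate.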

Other Utils

  1. Evaluation with Distances
python evaluate_DIS_SemanticKITTI.py <args>

python evaluate_DIS_SemanticPOSS.py <args>
  2. Visualization
python visualize_SemanticKITTI.py <args>

This script was not used in our experiments, so it is untested and may contain bugs. If you want to visualize SemanticKITTI and SemanticPOSS, please refer to this repo and this repo.

  3. Other scripts in tool.
caculate_time.py            Compares the time consumption of RS, PCB-RS, and FPS (farthest point sampling).
draw.py / draw_poss.py      Draws performance plots at different distances.
draw_vis_compare.py         Generates qualitative visualizations and quantitative statistical analysis images comparing RS and PCB-RS, similar to those in the paper.
eval_KITTI_gap.py           Calculates the performance difference of the model under different sampling methods.
main_poss.py                Counts the distance distribution.
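
The distance-based evaluation listed under Other Utils can be pictured as computing mIoU separately for points in each distance range. The following is a rough sketch of that idea, not the code in evaluate_DIS_SemanticKITTI.py: the function name miou_by_distance, the range edges, the use of horizontal distance from the sensor, and ignoring label 0 are all assumptions.

```python
import math
from collections import defaultdict


def miou_by_distance(points, preds, labels, edges=(0, 10, 20, 30, 40, 50), ignore_label=0):
    """Return {(lo, hi): mIoU or None} for each distance range (hypothetical helper)."""
    # One table per range, mapping class -> [intersection, union].
    stats = [defaultdict(lambda: [0, 0]) for _ in range(len(edges) - 1)]
    for (x, y, _z), pred, label in zip(points, preds, labels):
        if label == ignore_label:
            continue
        dist = math.hypot(x, y)  # assumed: horizontal distance from the sensor
        for b in range(len(edges) - 1):
            if edges[b] <= dist < edges[b + 1]:
                break
        else:
            continue  # point lies outside every range
        if pred == label:
            stats[b][label][0] += 1
            stats[b][label][1] += 1
        else:
            stats[b][label][1] += 1  # missed point of the true class
            if pred != ignore_label:
                stats[b][pred][1] += 1  # false positive of the predicted class

    result = {}
    for b, per_class in enumerate(stats):
        ious = [inter / union for inter, union in per_class.values() if union > 0]
        result[(edges[b], edges[b + 1])] = sum(ious) / len(ious) if ious else None
    return result
```

Ranges that contain no labelled points map to None rather than 0, so empty ranges are not mistaken for poor performance.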

Others to Note and Clarify

  • All our models were trained on a single NVIDIA RTX 3090 GPU using mixed precision.
  • We set torch.backends.cudnn.enabled = False and use with torch.cuda.amp.autocast(): to reduce training time and save GPU memory. However, it is unclear whether this affects performance.
  • In train_SemanticKITTI.py, we use num_classes = 20 instead of num_classes = 19. The reason is that the ops used to calculate the loss in the original code are quite complex, and their main purpose is to ignore the loss where label == 0. Therefore, I simply changed num_classes to 20 and set the weight of label == 0 to 0 to achieve the same effect.
  • In fact, the criterion corresponding to compute_loss should be self.criterion = nn.CrossEntropyLoss(weight=class_weights, reduction='none') instead of self.criterion = nn.CrossEntropyLoss(weight=class_weights). This may affect the scale of the computed loss, but should not affect performance.
  • There are two ways to calculate SCL: the L1 loss used by default, and the method here from CPGNET.
  • In terms of loss magnitude, the default L1 loss works better with reduction='mean', and consistency_loss_l1 works better with reduction='none'. Some of our experiments show that models trained under the two settings perform comparably.
  • The two losses, L_ce and L_scl, may be correlated, so the uncertainty weighting method may not converge with random initialization. In our experiments, we set ce_sigma==0.27 and l1_sigma==0.62 because they are close to the values from our initial experiments with seed==1024. Some of our experiments show that the model converges effectively with ce_sigma in the range [0.2, 0.3] and l1_sigma in the range [0.6, 0.8].
  • For the BAF-LAC backbone, our implementation is as consistent as possible with the original Repo.
  • For the BAAF backbone, our implementation differs slightly from the original Repo, which will result in some performance differences. The original Repo used FPS sampling; I changed it to RS to keep the framework structure consistent. The original Repo also used an auxiliary loss, which we omitted for simplicity.
  • Since I'm not good at coding and optimization, the PCB-RS code is quite rough. If you are interested in optimizing and accelerating this part, PRs are welcome!
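
Two of the notes above can be illustrated with a short pure-Python sketch: giving label 0 a class weight of 0 so that it contributes no loss, and combining L_ce and L_scl through fixed sigmas. The first function mirrors the weighted-mean behaviour of nn.CrossEntropyLoss without requiring PyTorch. The second uses a common form of uncertainty weighting, L / (2σ²) + log σ per term; that the repository uses exactly this form is an assumption, and both function names are hypothetical.

```python
import math


def weighted_cross_entropy(logits, labels, class_weights, reduction="mean"):
    """Weighted cross-entropy over per-point logits (lists of floats)."""
    per_point = []
    for row, label in zip(logits, labels):
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in row))
        per_point.append(class_weights[label] * (log_sum_exp - row[label]))
    if reduction == "none":
        return per_point
    # 'mean' divides by the summed weights of the targets, so zero-weight points
    # (label == 0 here) neither add loss nor count in the average.
    total_weight = sum(class_weights[label] for label in labels)
    # This sketch returns 0.0 when every point has zero weight.
    return sum(per_point) / total_weight if total_weight > 0 else 0.0


def uncertainty_weighted_loss(l_ce, l_scl, ce_sigma=0.27, l1_sigma=0.62):
    """Assumed form: sum over both terms of L / (2 * sigma^2) + log(sigma)."""
    return (l_ce / (2 * ce_sigma ** 2) + math.log(ce_sigma)
            + l_scl / (2 * l1_sigma ** 2) + math.log(l1_sigma))
```

With class_weights[0] set to 0, a point labelled 0 yields a per-point loss of exactly 0 under reduction="none" and drops out of the average under reduction="mean", which is why the two reductions differ only in scale.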

Pretrained Models and Logs:

KITTI Result      POSS Result       Ablation Study
Google Drive      Google Drive      Google Drive

Acknowledgement

Citation

If you find this work helpful, please consider citing our paper:

@article{cheng2022pcb,
  title={PCB-RandNet: Rethinking Random Sampling for LIDAR Semantic Segmentation in Autonomous Driving Scene},
  author={Cheng, Huixian and Han, XianFeng and Jiang, Hang and He, Dehong and Xiao, Guoqiang},
  journal={arXiv preprint arXiv:2209.13797},
  year={2022}
}