Constraint-based-Clustering

This repository provides the code described in our paper "Constraint-Based Hierarchical Cluster Selection in Automotive Radar Data". It integrates cluster-level constraints into the hierarchical clustering algorithm HDBSCAN.

The application of a distance threshold (see our paper A Hybrid Approach To Hierarchical Density-based Cluster Selection) is already integrated into the existing Python implementation by McInnes et al., see the documentation.

The code in this repository is based on the same HDBSCAN implementation. The most important modifications were made in _hdbscan_tree.pyx. In addition, we modified hdbscan_.py in order to allow different parameters and return values.

Please note that the uploaded code includes hard-coded values for the radar data constraints described in our paper. Further below, you will find some tips on how to create a customized constraint-based HDBSCAN version.

The file hdbscan_constraint_radar.py runs our HDBSCAN version on four labelled nuScenes traffic scenes (nuscenes_data_labeled) as described in the paper. To run a different scene or change the cluster selection method, adjust the parameters in __main__ accordingly. In addition, the CONFIG object lets you adjust some settings, such as whether results are plotted.

Prerequisites

pip install -r requirements.txt

Aside from packages for the actual HDBSCAN installation, requirements.txt lists some packages needed for running hdbscan_constraint_radar.py. If you want to run hdbscan_constraint_radar.py with visualization of condensed hierarchy trees (see CONFIG options), please note that you need to install PyGraphviz separately.

Installation

The HDBSCAN setup file from the original repository is used with modified directory paths.

python setup.py build_ext --inplace

In case you want to create your own constraint-based version and make changes to _hdbscan_tree.pyx in the hdbscan_constraint folder, make sure to re-run this command afterwards.

Customization

In our constraint-based version, HDBSCAN is used as follows:

import hdbscan_constraint
[...]
clusterer = hdbscan_constraint.HDBSCAN(
    min_cluster_size=minPts, cluster_selection_epsilon=epsilon,
    cluster_selection_method='constraint+e', allow_single_cluster=allow_single_cluster,
    prelabels=prelabels, velocities=velocities, xy=xy, directions=directions)
cluster_labels, alternatives = clusterer.fit_predict(data)

Compared to the original HDBSCAN, we pass additional parameters (prelabels, velocities, xy, directions) and use custom selection method names (in this case, constraint+e). All of this can be modified in the hdbscan_.py file according to your needs. For example, instead of an array of velocity values, you might want to pass an array of values of some other type, which can then be assigned to the cluster candidates within the condensed tree and later used to decide which clusters to select.

In the current implementation, we pass our parameters directly to the condense_tree function from _hdbscan_tree (see _hdbscan_tree.pyx). We modified this function so that it returns a tuple containing both the tree structure and the constraints as a separate list.

Several variations of this approach are possible. For example, instead of computing constraints directly while the condensed cluster tree is created, you could compute them in a separate step, similar to how the functions compute_stability and compute_b3f_measure work.
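As a rough illustration of such a separate step, the following sketch walks an already built condensed tree and derives one value per cluster candidate from a per-point array. It is not the code used in this repository: the helper name compute_value_spread and the toy data are hypothetical, and the record layout (parent, child, lambda_val, child_size, with point ids below n_points) is assumed to match the condensed tree of the underlying hdbscan library.

```python
import numpy as np

# Assumed condensed-tree record layout; check it against _hdbscan_tree.pyx.
TREE_DTYPE = np.dtype([
    ("parent", np.intp), ("child", np.intp),
    ("lambda_val", np.float64), ("child_size", np.intp),
])


def compute_value_spread(tree, n_points, values):
    """Return {cluster_id: max - min of `values` over all points in the cluster}.

    Hypothetical helper, shown only to illustrate a separate constraint pass.
    """
    point_children, cluster_children = {}, {}
    for parent, child in zip(tree["parent"].tolist(), tree["child"].tolist()):
        # Ids below n_points are data points, all other ids are clusters.
        target = point_children if child < n_points else cluster_children
        target.setdefault(parent, []).append(child)

    def members(cluster):
        # Points attached directly to the cluster plus those of all sub-clusters.
        points = list(point_children.get(cluster, []))
        for sub in cluster_children.get(cluster, []):
            points.extend(members(sub))
        return points

    clusters = set(tree["parent"].tolist())
    clusters.update(c for c in tree["child"].tolist() if c >= n_points)

    spread = {}
    for cluster in sorted(clusters):
        cluster_values = values[members(cluster)]
        spread[cluster] = float(cluster_values.max() - cluster_values.min())
    return spread


# Toy tree: root cluster 6 splits into clusters 7 and 8 with three points each.
tree = np.array([
    (6, 7, 0.5, 3), (6, 8, 0.5, 3),
    (7, 0, 1.0, 1), (7, 1, 1.0, 1), (7, 2, 1.0, 1),
    (8, 3, 1.0, 1), (8, 4, 1.0, 1), (8, 5, 1.0, 1),
], dtype=TREE_DTYPE)
velocities = np.array([1.0, 1.5, 0.5, 5.0, 5.5, 4.5])
spread = compute_value_spread(tree, n_points=6, values=velocities)
```

In this toy case the two child clusters each have a small velocity spread while the root mixes both groups, which is the kind of per-candidate information a selection step could then use.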

You might also want to introduce additional parameters to pass threshold values, instead of hard-coding them into _hdbscan_tree. Have a look at _tree_to_labels in hdbscan_.py to see how values are passed to and from _hdbscan_tree. In our case, we pass the list of constraints we obtained from condense_tree to get_clusters, where the different cluster selection methods are applied. We modified the return values of this function such that alternative labels can later be returned by fit_predict in hdbscan_.py.
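One possible way to avoid hard-coded thresholds is to bundle them in a small object and pass that object down to wherever the constraints are evaluated. The sketch below only illustrates the idea in plain Python. The class ConstraintThresholds, its fields and default values, and the function violated_constraints are hypothetical and do not correspond to the constraints or values used in _hdbscan_tree.pyx or in the paper.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class ConstraintThresholds:
    # Hypothetical constraint names and defaults, chosen for illustration only.
    max_velocity_spread: float = 2.0
    max_extent: float = 10.0


def violated_constraints(velocities, xy, thresholds):
    """Return the names of the constraints that one cluster candidate violates."""
    violations = []
    # Difference between the fastest and slowest point of the candidate.
    if velocities.max() - velocities.min() > thresholds.max_velocity_spread:
        violations.append("velocity_spread")
    # Largest side length of the axis-aligned bounding box of the candidate.
    extent = (xy.max(axis=0) - xy.min(axis=0)).max()
    if extent > thresholds.max_extent:
        violations.append("extent")
    return violations


# One candidate with three points, checked against two threshold settings.
candidate_velocities = np.array([1.0, 1.5, 4.0])
candidate_xy = np.array([[0.0, 0.0], [3.0, 1.0], [4.0, 2.0]])

strict = violated_constraints(candidate_velocities, candidate_xy, ConstraintThresholds())
relaxed = violated_constraints(
    candidate_velocities, candidate_xy, ConstraintThresholds(max_velocity_spread=5.0)
)
```

With a design like this, the thresholds could be exposed as an additional constructor argument in hdbscan_.py and forwarded through _tree_to_labels, so that changing a value would not require editing and rebuilding the Cython file.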

Fusion-Goettingen/Sensors_2021_Malzer_Constraint-based-Clustering
