
Releases: JimHaughwout/GADM_DBSCAN

Sample Results documentation

30 Nov 15:12

Add sample results documentation to illustrate distance formula differences

Initial Release

25 Nov 21:12

Fun with DBSCAN algorithm and GADM-geocoded points of interest.

This reads in a data set of coordinates (latitude and longitude) along with
their geocoded Global Administrative Areas (GADM) features, then performs
unsupervised learning to cluster the points into zones of interest based on
geographic features. Clustering uses the Density-Based Spatial Clustering of
Applications with Noise (DBSCAN) algorithm with a customized distance function.
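
To make the role of the distance function concrete, here is a minimal pure-Python sketch of DBSCAN with a pluggable metric. It is not the project's code and not a library implementation: the names `dbscan` and `haversine_km` are illustrative, and the great-circle (haversine) distance is only a stand-in for whichever custom metric is selected.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lng) pairs in degrees."""
    lat1, lng1, lat2, lng2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def dbscan(points, eps, min_samples, metric):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    n = len(points)
    # Each neighborhood includes the point itself.
    neighbors = [[j for j in range(n) if metric(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels
```

Calling `dbscan(points, eps=5.0, min_samples=2, metric=haversine_km)` treats `eps` as a radius in km; swapping in a different `metric` changes what "near" means without touching the clustering loop.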

Custom Distance Metric Modes

You can use one of three modes to calculate the distance between points for
DBSCAN clustering:

vicenty-basic Mode

Custom distance metric using Vincenty's formula.
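
For reference, a minimal sketch of Vincenty's inverse formula in plain Python follows. It assumes the WGS-84 ellipsoid (the notes above do not name one), and the function name `vincenty_km` and its parameters are illustrative rather than taken from the project.

```python
import math

# WGS-84 ellipsoid parameters (an assumption; not stated in these notes).
WGS84_A = 6378137.0            # semi-major axis, metres
WGS84_F = 1 / 298.257223563    # flattening
WGS84_B = (1 - WGS84_F) * WGS84_A  # semi-minor axis, metres

def vincenty_km(p1, p2, max_iter=200, tol=1e-12):
    """Ellipsoidal distance in km between two (lat, lng) pairs in degrees."""
    lat1, lng1 = p1
    lat2, lng2 = p2
    if lat1 == lat2 and lng1 == lng2:
        return 0.0
    # Reduced latitudes and longitude difference
    U1 = math.atan((1 - WGS84_F) * math.tan(math.radians(lat1)))
    U2 = math.atan((1 - WGS84_F) * math.tan(math.radians(lat2)))
    L = math.radians(lng2 - lng1)
    sinU1, cosU1 = math.sin(U1), math.cos(U1)
    sinU2, cosU2 = math.sin(U2), math.cos(U2)
    lam = L
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cosU2 * sin_lam,
                               cosU1 * sinU2 - sinU1 * cosU2 * cos_lam)
        if sin_sigma == 0:
            return 0.0  # coincident points
        cos_sigma = sinU1 * sinU2 + cosU1 * cosU2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cosU1 * cosU2 * sin_lam / sin_sigma
        cos2_alpha = 1 - sin_alpha ** 2
        # cos2_alpha is zero only along the equator, where this term vanishes
        cos_2sm = ((cos_sigma - 2 * sinU1 * sinU2 / cos2_alpha)
                   if cos2_alpha else 0.0)
        C = WGS84_F / 16 * cos2_alpha * (4 + WGS84_F * (4 - 3 * cos2_alpha))
        lam_prev = lam
        lam = L + (1 - C) * WGS84_F * sin_alpha * (
            sigma + C * sin_sigma * (
                cos_2sm + C * cos_sigma * (-1 + 2 * cos_2sm ** 2)))
        if abs(lam - lam_prev) < tol:
            break
    else:
        # Can happen for nearly antipodal points
        raise ValueError("Vincenty's formula failed to converge")
    u2 = cos2_alpha * (WGS84_A ** 2 - WGS84_B ** 2) / WGS84_B ** 2
    A = 1 + u2 / 16384 * (4096 + u2 * (-768 + u2 * (320 - 175 * u2)))
    B = u2 / 1024 * (256 + u2 * (-128 + u2 * (74 - 47 * u2)))
    d_sigma = B * sin_sigma * (cos_2sm + B / 4 * (
        cos_sigma * (-1 + 2 * cos_2sm ** 2)
        - B / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2) * (-3 + 4 * cos_2sm ** 2)))
    return WGS84_B * A * (sigma - d_sigma) / 1000.0
```

Because the function takes two (lat, lng) pairs and returns a single number, it has the shape a DBSCAN implementation expects from a custom metric.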

vicenty-gadm Mode

Custom distance metric that combines Vincenty's formula with GADM features
to calculate a scored distance (in km). The metric starts with a base
Vincenty's formula distance calculation, then modifies this based on
whether the two points are in the same city and/or city neighborhood.

This is just one (illustrative) method of using GADM features to modify
distance. It is "magic numbery" for simplicity. In real life, one would
derive values for GADM feature weights -- or use the full proxy method.
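
The scoring idea can be sketched as below. Everything specific here is an assumption for illustration: the dictionary keys `"city"` and `"neighborhood"`, the function name, the two factors, and the choice to shrink (rather than grow) the distance when features match. None of these values come from the project.

```python
def gadm_scored_km(base_km, gadm_a, gadm_b,
                   same_city_factor=0.75, same_neighborhood_factor=0.5):
    """Scale a base distance (km) using GADM features shared by two points.

    gadm_a and gadm_b are dicts with hypothetical keys "city" and
    "neighborhood". The factors are illustrative magic numbers.
    """
    city_a, city_b = gadm_a.get("city"), gadm_b.get("city")
    if city_a is None or city_a != city_b:
        return base_km  # different (or unknown) cities: keep the base distance
    hood_a, hood_b = gadm_a.get("neighborhood"), gadm_b.get("neighborhood")
    if hood_a is not None and hood_a == hood_b:
        return base_km * same_neighborhood_factor  # same city and neighborhood
    return base_km * same_city_factor  # same city only
```

With these example factors, two points 10 km apart score 5 km if they share a neighborhood, 7.5 km if they share only a city, and 10 km otherwise.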

proxy Mode

Custom distance metric that uses a simple proxy ID to fetch attributes
from an external data set (for illustrative simplicity in this case,
the passed POI dataset).

While this proxy approach replicates the same Vincenty-plus-GADM distance
formula, it could be modified to support ANY distance formula.
For example, rather than using GADM features, one could instead extract
a key or GUID used to look up a whole array of features for a custom
distance calculation (or even make a REST call to a route planning system
to get true driving times between each X and Y).
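
A minimal sketch of the proxy pattern, assuming each "point" handed to the metric is a one-element row holding only an ID. The names `make_proxy_metric`, `records`, and `planar_km`, and the `"x_km"`/`"y_km"` attributes, are hypothetical; the placeholder distance could be replaced by any lookup-driven calculation.

```python
import math

def make_proxy_metric(records, distance_fn):
    """Build a metric that receives proxy IDs instead of raw coordinates.

    records: dict mapping an ID to whatever attributes distance_fn needs.
    distance_fn: any callable taking two records and returning a distance.
    """
    def metric(x, y):
        # Each argument is a length-1 sequence holding only the proxy ID.
        # int() tolerates IDs that arrive as floats from a numeric array.
        return distance_fn(records[int(x[0])], records[int(y[0])])
    return metric

def planar_km(a, b):
    """Placeholder distance over hypothetical "x_km"/"y_km" attributes."""
    return math.hypot(a["x_km"] - b["x_km"], a["y_km"] - b["y_km"])
```

The clustering step then only ever sees IDs, so the attributes behind them (GADM features, GUID lookups, cached routing results) can change without altering how the points are passed in.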

Vincenty MVP

22 Nov 22:33
Pre-release

What's New

This version uses a custom distance metric function that employs true
ellipsoid distance calculations (using Vincenty's formula).

What's Next

The goal is to modify the metric calculation to employ GADM features to
change the distance calculation (i.e., leverage urban vs. rural features).

Known Issues

Because (Lat, Lng) is actually (Y, X), passing the pairs straight to matplotlib swaps the axes, so plots appear with a 90-degree rotation.
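
One possible workaround, sketched here as an assumption rather than the project's fix, is to reorder each pair so longitude goes on the x-axis and latitude on the y-axis before plotting. The helper name `to_plot_xy` is illustrative.

```python
def to_plot_xy(lat_lng_points):
    """Split (lat, lng) pairs into x (longitude) and y (latitude) lists."""
    lats = [lat for lat, _ in lat_lng_points]
    lngs = [lng for _, lng in lat_lng_points]
    return lngs, lats  # x values first, then y values

# Usage with matplotlib (import matplotlib.pyplot as plt):
#   xs, ys = to_plot_xy(points)
#   plt.scatter(xs, ys)
```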

Basic release

22 Nov 21:16
Pre-release

This release uses a conformal mapping approach to map ellipsoid
(latitude, longitude) coordinates to a flat Cartesian (X, Y) plane.
This allows use of sklearn's out-of-the-box distance calculation functions.
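
The notes do not say which conformal mapping is used. As one example of the idea, the sketch below uses the spherical Mercator projection, which is conformal; the function name and the mean Earth radius are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation (assumed)

def mercator_xy_km(lat, lng):
    """Project (lat, lng) in degrees to spherical Mercator (x, y) in km."""
    x = EARTH_RADIUS_KM * math.radians(lng)
    y = EARTH_RADIUS_KM * math.log(
        math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y
```

Once points are in (x, y) form, a standard Euclidean metric can be applied directly. Note that Mercator stretches distances by roughly 1/cos(latitude), so planar distances overstate true distances away from the equator, which is one motivation for an ellipsoidal metric.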