pairwise_distance

Usage

Given a set of 2D data points X, this code computes the sum and the mean of the pairwise Euclidean distances between the points, in parallel. Call it as follows (weights and n_jobs are optional):

    parallel_sum, parallel_mean = mean_pairwise_distance(X,
                                                         weights=how_to_weight_each_X,
                                                         n_jobs=how_many_cores_to_use)
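One way the inputs might be prepared is sketched below. It assumes the weights are per-point counts, so repeated points can be collapsed into unique rows plus a multiplicity for each row. The `raw` array is made-up illustration data, and the final call is left as a comment because the module's import path is not stated here.

```python
import numpy as np

# Hypothetical raw data: 2D points, some of them repeated.
raw = np.array([[0.0, 0.0],
                [3.0, 4.0],
                [0.0, 0.0],
                [6.0, 8.0],
                [3.0, 4.0],
                [0.0, 0.0]])

# Collapse duplicates: X holds the unique rows (sorted lexicographically),
# counts holds how many times each unique row occurs in raw.
X, counts = np.unique(raw, axis=0, return_counts=True)

print(X.shape)  # (3, 2)
print(counts)   # [3 2 1]

# Assumed call, with the counts used as weights:
# parallel_sum, parallel_mean = mean_pairwise_distance(X, weights=counts, n_jobs=2)
```

Collapsing duplicates first keeps N small, which matters because the number of pairs grows quadratically with N.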

In theory, the result is equivalent to the following serial computation, where N = X.shape[0] and counts is an array of length N giving the count for each point in X:

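    import numpy as np
    import scipy.spatial.distance
    # Condensed vector of distances for every pair (i, j) with i < j.
    Y = scipy.spatial.distance.pdist(X, 'euclidean')
    # Weight of each pair, in the same (i, j) order that pdist uses.
    weights = np.array([counts[i] * counts[j]
                        for i in range(N - 1) for j in range(i + 1, N)])
    serial_sum = np.sum(weights * Y)
    serial_mean = serial_sum / (((N - 1)**2 + (N + 1)) / 2 + N)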
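The weighted sum in the serial version can be checked on a tiny input. The sketch below is not part of the module: the helper names `serial_weighted_sum` and `brute_force_weighted_sum` and the three sample points are hypothetical, chosen so the distances come out as whole numbers. It compares the pdist-based sum against an explicit loop over pairs.

```python
import itertools

import numpy as np
from scipy.spatial.distance import pdist


def serial_weighted_sum(X, counts):
    """Weighted sum of pairwise Euclidean distances, using pdist."""
    # pdist returns distances in the order (0,1), (0,2), ..., (N-2,N-1),
    # which matches itertools.combinations over range(N).
    Y = pdist(X, 'euclidean')
    pair_weights = np.array([counts[i] * counts[j]
                             for i, j in itertools.combinations(range(len(X)), 2)])
    return float(np.sum(pair_weights * Y))


def brute_force_weighted_sum(X, counts):
    """Same quantity, computed one pair at a time for comparison."""
    total = 0.0
    for i, j in itertools.combinations(range(len(X)), 2):
        total += counts[i] * counts[j] * np.linalg.norm(X[i] - X[j])
    return float(total)


# Hypothetical sample: three collinear points with counts 3, 2 and 1.
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
counts = np.array([3, 2, 1])

result = serial_weighted_sum(X, counts)
reference = brute_force_weighted_sum(X, counts)
print(result)  # 70.0 (distances 5, 10, 5 with pair weights 6, 3, 2)
```

The same comparison could be made against parallel_sum to confirm that the parallel and serial results agree to within floating-point tolerance.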

Tests

The module is tested with pytest and profiled with memory_profiler; both are run through make.

  • make test will run the basic assertion tests.
  • make memory will run the memory profiler and display a plot of the processes.
  • make save_plot will save a plot of the most recent memory profiling data as a png.
  • make clean will delete all the data files created by the memory profiler.

Authors