Fast implementation of maximum cardinality search ordering algorithm. #945

Open · wants to merge 2 commits into master
Conversation

@samuelsonric commented Feb 24, 2025

I have a library, CliqueTrees.jl, with a fast implementation of the maximum cardinality search ordering algorithm. Here are some benchmarks.

using BenchmarkTools, MatrixMarket, PowerModels, SuiteSparseMatrixCollection

# look up "bcsstk17" in the SuiteSparse Matrix Collection index, download it in
# Matrix Market format, and read it into a sparse matrix
ssmc = ssmc_db()
name = "bcsstk17"
matrix = mmread(joinpath(fetch_ssmc(ssmc[ssmc.name .== name, :], format="MM")[1], "$(name).mtx"))

current implementation

julia> @benchmark PowerModels._mcs($matrix)
BenchmarkTools.Trial: 7 samples with 1 evaluation per sample.
 Range (min … max):  778.347 ms … 849.834 ms  ┊ GC (min … max): 4.17% … 6.36%
 Time  (median):     801.612 ms               ┊ GC (median):    5.25%
 Time  (mean ± σ):   803.112 ms ±  23.522 ms  ┊ GC (mean ± σ):  5.17% ± 0.72%

  █   █       █      █ █     █                                █  
  █▁▁▁█▁▁▁▁▁▁▁█▁▁▁▁▁▁█▁█▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  778 ms           Histogram: frequency by time          850 ms <

 Memory estimate: 1.50 GiB, allocs estimate: 238040.

new implementation

julia> @benchmark PowerModels._mcs($matrix)
BenchmarkTools.Trial: 4301 samples with 1 evaluation per sample.
 Range (min … max):  1.083 ms …   5.601 ms  ┊ GC (min … max): 0.00% … 75.50%
 Time  (median):     1.128 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.160 ms ± 153.791 μs  ┊ GC (mean ± σ):  2.05% ±  6.05%

  ▆▇▆█▇▅▅▄▃▃▂▂      ▁▁▁▁                                      ▂
  ████████████████▇▇█████▇█▇▆▇▅▆▇▅▄▅▇▃▆▃▃▃▃▆▃▁▁▃▅▅▃▄▃▄▁▃▃▅▁▁▃ █
  1.08 ms      Histogram: log(frequency) by time      1.78 ms <

 Memory estimate: 514.88 KiB, allocs estimate: 18.

The new implementation runs in linear time, while the current one is quadratic (possibly cubic), so the gap will only grow with larger matrices. You can take a look at the new implementation here.
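
For reference, the linear-time bound comes from the bucket-queue formulation of maximum cardinality search (Tarjan and Yannakakis, 1984). The sketch below is a minimal, self-contained illustration of that technique in Julia; it works on a plain adjacency list, the name mcs_sketch is made up for this example, and it is not the actual CliqueTrees.jl or PowerModels code.

function mcs_sketch(neighbors::Vector{Vector{Int}})
    n = length(neighbors)
    weight = zeros(Int, n)               # number of already-numbered neighbors
    numbered = falses(n)
    buckets = [Set{Int}() for _ in 1:n]  # buckets[w + 1] = unnumbered vertices of weight w
    union!(buckets[1], 1:n)
    order = Vector{Int}(undef, n)        # order[i] = vertex numbered i
    maxweight = 0                        # upper bound on the largest nonempty bucket

    for i in n:-1:1
        # the maximum weight rises by at most one per step, so scanning down
        # for the next nonempty bucket stays linear over the whole run
        while isempty(buckets[maxweight + 1])
            maxweight -= 1
        end

        v = pop!(buckets[maxweight + 1])  # any unnumbered vertex of maximum weight
        numbered[v] = true
        order[i] = v

        # move each unnumbered neighbor of v up one bucket
        for w in neighbors[v]
            if !numbered[w]
                delete!(buckets[weight[w] + 1], w)
                weight[w] += 1
                push!(buckets[weight[w] + 1], w)
            end
        end

        maxweight += 1
    end

    return order
end

# small usage example: a 4-cycle 1-2-3-4 with the chord 1-3 (a chordal graph);
# for a chordal input, the returned order is a perfect elimination ordering
neighbors = [[2, 3, 4], [1, 3], [1, 2, 4], [1, 3]]
order = mcs_sketch(neighbors)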

@odow (Collaborator) commented Feb 24, 2025

How long does a very large example take? I think we'd be hesitant to adopt a new direct dependency just to save a second, even though I'll admit it looks like you have done a nice job 😄

@samuelsonric (Author) commented Feb 24, 2025

This matrix:

name = "finan512"

current implementation

julia> @benchmark PowerModels._mcs($matrix)
BenchmarkTools.Trial: 1 sample with 1 evaluation per sample.
 Single result which took 43.177 s (8.29% GC) to evaluate,
 with a memory estimate of 71.03 GiB, over 1563700 allocations.

new implementation

julia> @benchmark PowerModels._mcs($matrix)
BenchmarkTools.Trial: 1964 samples with 1 evaluation per sample.
 Range (min … max):  2.163 ms …  22.294 ms  ┊ GC (min … max): 0.00% … 84.34%
 Time  (median):     2.381 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.541 ms ± 624.763 μs  ┊ GC (mean ± σ):  6.08% ±  9.25%

      ▂█▅▂                                                     
  ▂▃▃▄████▇▄▄▃▂▂▂▂▂▂▂▂▂▃▃▄▄▄▃▃▃▂▂▂▂▂▂▁▂▂▁▁▂▂▂▁▂▂▂▁▁▂▁▂▁▁▂▂▂▂▂ ▃
  2.16 ms         Histogram: frequency by time        4.07 ms <

 Memory estimate: 3.42 MiB, allocs estimate: 18.

@odow (Collaborator) commented Feb 24, 2025

Sure, that's a nice example, but what about one that occurs in the types of models that PowerModels solves?

@samuelsonric (Author) commented Feb 24, 2025

> Sure, that's a nice example, but what about one that occurs in the types of models that PowerModels solves?

Oh! I see what you mean --- I'll run some tests.

CliqueTrees is a package for constructing chordal extensions; after this PR, I had planned to try optimizing the functions _chordal_extension and _maximal_cliques.
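
For what it's worth, the connection between these pieces is the elimination game: eliminating vertices along an ordering and pairwise connecting each vertex's remaining neighbors produces a chordal supergraph, and the added edges are the fill-in. The sketch below illustrates that idea using the same adjacency-list format as the MCS sketch above; the name chordal_extension_sketch is made up for this example, and this is not the _chordal_extension code in PowerModels or the CliqueTrees.jl implementation.

function chordal_extension_sketch(neighbors::Vector{Vector{Int}}, order::Vector{Int})
    n = length(neighbors)
    position = invperm(order)                  # position[v] = place of v in the ordering
    filled = [Set(neighbors[v]) for v in 1:n]  # adjacency of the extension, grown in place

    for v in order
        # neighbors of v (in the current filled graph) that come later in the ordering
        later = [w for w in filled[v] if position[w] > position[v]]

        # make the later neighbors pairwise adjacent (these are the fill edges)
        for i in eachindex(later), j in (i + 1):length(later)
            push!(filled[later[i]], later[j])
            push!(filled[later[j]], later[i])
        end
    end

    return [sort!(collect(filled[v])) for v in 1:n]
end

If the input graph is already chordal and the ordering comes from maximum cardinality search, no fill edges are added at all, which is why a fast MCS is a natural first step toward speeding up the chordal-extension and maximal-clique routines.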

@odow (Collaborator) commented Feb 24, 2025

This isn't my subject area, so I'll defer to @ccoffrin, but it looks like this code was added in #333, and I couldn't find any related issues in the last 7 years, so it may be that the current implementation is sufficient.

I assume most matrix sizes are <10^3. Any larger and we get to the point where we can't solve the underlying optimization model, so the ability to build the problem becomes moot.

That's not to say that there isn't room for optimizing the implementations, but the bar for adding a new dependency is very high.

@ccoffrin (Member) commented
I do not want to add an external package dependency for improved performance on this, but I am open to performance improvements that only use the current set of dependencies.

This code is only used in one very specific model, an SDP relaxation that takes advantage of matrix sparsity. I strongly suspect that the SDP solver runtime will dwarf the model build time for these models.

@samuelsonric (Author) commented Feb 28, 2025

> I do not want to add an external package dependency for improved performance on this, but I am open to performance improvements that only use the current set of dependencies.
>
> This code is only used in one very specific model, an SDP relaxation that takes advantage of matrix sparsity. I strongly suspect that the SDP solver runtime will dwarf the model build time for these models.

I tried benchmarking the instantiate_model function on the TYP models in this (quite helpful) data set, and by the time I got halfway down the list, instantiation was taking several minutes; the _mcs call is a tiny part of that.

In any case, it's just one function, so I am happy to copy in the code.

@samuelsonric (Author) commented
Like that?
