Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support sparse matrices in odin scripts #313

Open
thibautjombart opened this issue Apr 29, 2024 · 13 comments
Open

Support sparse matrices in odin scripts #313

thibautjombart opened this issue Apr 29, 2024 · 13 comments

Comments

@thibautjombart
Copy link
Contributor

Meta-population models with connectivity between patches typically require a matrix specifying proportions of the forces of infection going from patches to patches (example here). Such matrices can grow rather large as they scale as the square of the number of patches, so that memory allocation alone can take substantial time compared to running simulations, and can become a limiting factor. A solution would be to implement support for sparse matrices. Has this been considered before? How difficult do you think this would be to implement?

@richfitz
Copy link
Member

richfitz commented May 8, 2024

Do you have profiles for this, especially from a case where this was a limiting factor? The generated odin code should be fairly easy to modify to swap in a proposed sparse matrix implementation for comparison that would help prioritise this.

@thibautjombart
Copy link
Contributor Author

thibautjombart commented May 20, 2024

Thanks for your reply. I have not gone all the way to proposing a C implementation (bit out of my depth) but here is a reprex illustrating the scaling issue. Sorry it is long but I included odin code for 2 models:

  1. a spatialised SIR model where the 'delta' matrix indicating diffusion of the force of infection across patches is fully specified; I will use it with a diagonal delta so it is as sparse as possible; in practice diffusion matrices could be nearly as sparse e.g. each cell having 2-4 neighbours (when patches are pixels on a pixmap); using a non-sparse matrix as essentially 2 costs: high RAM requirements, and wasted computational times multiplying a lot of terms by 0
  2. a non-spatial SIR model, doing the same as above but without delta; this serves as a reference of how quickly things could run for fully sparse matrices

Reprex below:

library(odin)
if (!require(peakRAM)) install.packages("peakRAM")
#> Loading required package: peakRAM

# odin model: metapopulation, frequency-dependent SIR model
# 
# Arguments passed to .$new() include:
# 
# n_patches: the total number of patches
# delta: a matrix of spatial diffusion of FOI from/to patches (dim: n_patches x n_patches)
# pop_sizes: total population sizes (length: n_patches)
# ini_infected: initial number of infected (length: n_patches)

spatial_sir_txt <- 
"
# compartment updates
update(S[]) <- S[i] - n_SI[i]
update(I[]) <- I[i] + n_SI[i] - n_IR[i]
update(R[]) <- R[i] + n_IR[i]

# compute forces of infection
# note: if delta is sparse, wasted computational and RAM allocation time here
lambda_prod[, ] <- delta[i, j] * I[i] / N[i]
lambda[] <- sum(lambda_prod[, i]) * beta # sum by column

# individual probabilities of transition
p_SI[] <- 1 - exp(-lambda[i]) # S to I
p_IR[] <- 1 - exp(-gamma) # I to R

# draws from binomial distributions for numbers changing between
# compartments:
n_SI[] <- rbinom(S[i], p_SI[i])
n_IR[] <- rbinom(I[i], p_IR[i])

# Total population size
N[] <- S[i] + I[i] + R[i]

# Initial states:
initial(S[]) <- pop_size[i] - ini_infected[i]
initial(I[]) <- ini_infected[i]
initial(R[]) <- 0

## Initialisations and dimensions
pop_size[] <- user()
ini_infected[] <- user()
beta <- user(0.2)
gamma <- user(0.1)
delta[,] <- user()

n_patches <- user(1)
dim(pop_size) <- n_patches
dim(ini_infected) <- n_patches
dim(S) <- n_patches
dim(I) <- n_patches
dim(R) <- n_patches
dim(N) <- n_patches
dim(p_SI) <- n_patches
dim(p_IR) <- n_patches
dim(n_SI) <- n_patches
dim(n_IR) <- n_patches
dim(delta) <- c(n_patches, n_patches)
dim(lambda_prod) <- c(n_patches, n_patches)
dim(lambda) <- n_patches
"


basic_sir_txt <- 
"
# compartment updates
update(S[]) <- S[i] - n_SI[i]
update(I[]) <- I[i] + n_SI[i] - n_IR[i]
update(R[]) <- R[i] + n_IR[i]

# compute forces of infection
lambda[] <- beta * I[i] / N[i]

# individual probabilities of transition
p_SI[] <- 1 - exp(-lambda[i]) # S to I
p_IR[] <- 1 - exp(-gamma) # I to R

# draws from binomial distributions for numbers changing between
# compartments:
n_SI[] <- rbinom(S[i], p_SI[i])
n_IR[] <- rbinom(I[i], p_IR[i])


# Total population size
N[] <- S[i] + I[i] + R[i]

# Initial states:
initial(S[]) <- pop_size[i] - ini_infected[i]
initial(I[]) <- ini_infected[i]
initial(R[]) <- 0

## Initialisations and dimensions
pop_size[] <- user()
ini_infected[] <- user()
beta <- user(0.2)
gamma <- user(0.1)

n_patches <- user(1)
dim(pop_size) <- n_patches
dim(ini_infected) <- n_patches
dim(S) <- n_patches
dim(I) <- n_patches
dim(R) <- n_patches
dim(N) <- n_patches
dim(p_SI) <- n_patches
dim(p_IR) <- n_patches
dim(n_SI) <- n_patches
dim(n_IR) <- n_patches
dim(lambda) <- n_patches
"




# model generators
spatial_sir <- odin(spatial_sir_txt)
#> Loading required namespace: pkgbuild
#> Generating model in c
#> ℹ Re-compiling odin96e90a8a (debug build)
#> ── R CMD INSTALL ───────────────────────────────────────────────────────────────
#>   ─  installing *source* package 'odin96e90a8a' ...
#>      ** using staged installation
#>      ** libs
#>      using C compiler: 'gcc.exe (GCC) 12.3.0'
#>      gcc  -I"C:/PROGRA~1/R/R-43~1.3/include" -DNDEBUG     -I"C:/RBuildTools/4.3/x86_64-w64-mingw32.static.posix/include"     -O2 -Wall -gdwarf-2 -mfpmath=sse -msse2 -mstackrealign  -UNDEBUG -Wall -pedantic -g -O0 -c odin.c -o odin.o
#>      gcc  -I"C:/PROGRA~1/R/R-43~1.3/include" -DNDEBUG     -I"C:/RBuildTools/4.3/x86_64-w64-mingw32.static.posix/include"     -O2 -Wall -gdwarf-2 -mfpmath=sse -msse2 -mstackrealign  -UNDEBUG -Wall -pedantic -g -O0 -c registration.c -o registration.o
#>      gcc -shared -static-libgcc -o odin96e90a8a.dll tmp.def odin.o registration.o -LC:/RBuildTools/4.3/x86_64-w64-mingw32.static.posix/lib/x64 -LC:/RBuildTools/4.3/x86_64-w64-mingw32.static.posix/lib -LC:/PROGRA~1/R/R-43~1.3/bin/x64 -lR
#>      installing to C:/Users/thiba/AppData/Local/Temp/RtmpQdYUmU/devtools_install_77c73eb784b/00LOCK-file77c67194186/00new/odin96e90a8a/libs/x64
#>   ─  DONE (odin96e90a8a)
#> 
#> ℹ Loading odin96e90a8a
basic_sir <- odin(basic_sir_txt)
#> Generating model in c
#> ℹ Re-compiling odin88290f56 (debug build)
#> ── R CMD INSTALL ───────────────────────────────────────────────────────────────
#>   ─  installing *source* package 'odin88290f56' ...
#>      ** using staged installation
#>      ** libs
#>      using C compiler: 'gcc.exe (GCC) 12.3.0'
#>      gcc  -I"C:/PROGRA~1/R/R-43~1.3/include" -DNDEBUG     -I"C:/RBuildTools/4.3/x86_64-w64-mingw32.static.posix/include"     -O2 -Wall -gdwarf-2 -mfpmath=sse -msse2 -mstackrealign  -UNDEBUG -Wall -pedantic -g -O0 -c odin.c -o odin.o
#>      gcc  -I"C:/PROGRA~1/R/R-43~1.3/include" -DNDEBUG     -I"C:/RBuildTools/4.3/x86_64-w64-mingw32.static.posix/include"     -O2 -Wall -gdwarf-2 -mfpmath=sse -msse2 -mstackrealign  -UNDEBUG -Wall -pedantic -g -O0 -c registration.c -o registration.o
#>      gcc -shared -static-libgcc -o odin88290f56.dll tmp.def odin.o registration.o -LC:/RBuildTools/4.3/x86_64-w64-mingw32.static.posix/lib/x64 -LC:/RBuildTools/4.3/x86_64-w64-mingw32.static.posix/lib -LC:/PROGRA~1/R/R-43~1.3/bin/x64 -lR
#>      installing to C:/Users/thiba/AppData/Local/Temp/RtmpQdYUmU/devtools_install_77c2766711/00LOCK-file77c13472594/00new/odin88290f56/libs/x64
#>   ─  DONE (odin88290f56)
#> 
#> ℹ Loading odin88290f56

Helper functions to create SIR models of different sizes and time the model
creation. Note we use 2 separate functions as model inputs differ.

@param n number of patches of the model

@param t number of time steps to run the model for

time_spatial_sir <- function(n, t = 10) {
  peakRAM::peakRAM(
    spatial_sir$new(
      n_patches = n, 
      pop_size = rep(1000, n), 
      ini_infected = rep(1, n), 
      delta = diag(n)
    )$run(t)
  )
}

time_basic_sir <- function(n, t = 10) {
  peakRAM::peakRAM(
    basic_sir$new(
      n_patches = n, 
      pop_size = rep(1000, n), 
      ini_infected = rep(1, n)
    )$run(t)
  )
}

# Size of the models in number of patches
model_sizes <- c(10, 100, 1000, 2000, 10000, 15000)

# note: size of the delta matrix for larger models:
# 2e4 patches: 3 Gb
# 3e4 patches: 6.7 Gb
# 4e4 patches: 11.9 Gb
gc()
#>           used (Mb) gc trigger (Mb) max used (Mb)
#> Ncells  948871 50.7    1811591 96.8  1447368 77.3
#> Vcells 1708552 13.1    8388608 64.0  2699841 20.6
results_spatial <- lapply(
  model_sizes,
  time_spatial_sir
  )
gc()
#>           used (Mb) gc trigger   (Mb)  max used   (Mb)
#> Ncells  955403 51.1    1811591   96.8   1811591   96.8
#> Vcells 1722866 13.2  210634364 1607.1 228977115 1747.0
results_basic <- lapply(
  model_sizes,
  time_basic_sir
)

# arrange and plot results
results_spatial <- Reduce(rbind.data.frame, results_spatial)
results_spatial$type <- "Spatial"
results_basic <- Reduce(rbind.data.frame, results_basic)
results_basic$type <- "Basic"
results <- rbind(results_spatial, results_basic)
results$size <- rep(model_sizes, 2)
 
library(ggplot2)

ggplot(results, aes(x = size, y = Elapsed_Time_sec, color = type)) + 
  geom_point(size = 3) + 
  geom_line() + 
  theme_bw() + 
  labs(
    x = "Number of patches", 
    y = "Runtime (s)",
    title = "Runtimes of the 2 implementations"
    )

ggplot(results, aes(x = size, y = Peak_RAM_Used_MiB, color = type)) + 
  geom_point(size = 3) + 
  geom_line() + 
  theme_bw() + 
  labs(
    x = "Number of patches", 
    y = "Peak RAM use (MiB)",
    title = "RAM use of the 2 implementations"
  )

Created on 2024-05-20 with reprex v2.1.0

@thibautjombart
Copy link
Contributor Author

I am not sure if R has a canonical data representation for sparse matrices; the 'Matrix' package seems like the likely candidate according to this post. It seems Matrix's matrix multiplication is implemented in src/matmult.c. Worth cloning if only to remember that subversion and R-forge are a thing.

@thibautjombart
Copy link
Contributor Author

@richfitz Just a gentle poke on this. If there are ways I can help move this up the priority list, do let me know.

@richfitz
Copy link
Member

I don't see me getting to this until well into next year at the rate things are going I'm afraid.

Something that would help would be to try and write a model by hand using dust2 that shows the sparse matrix library that you like in action. Docs for that package will be available sometime early next year I think, but the interface there is mostly stable.

@richfitz
Copy link
Member

Matrix multiplication in addition to generally just working with sparse matrix involves fixing longstanding issues around vector outputs (#134 and related issues). We still don't have a syntax for that, so it would not be doable until we have at least that

@thibautjombart
Copy link
Contributor Author

Got you, thanks for the quick reply. I might get to the dust2 example at some point next year, will post here if/when I do. In the meantime, if anyone has been dealing with large (100k patches and beyond) spatialized models, and has a better approach than using a cluster with TBs of RAM, I will welcome any words of wisdom.

@richfitz
Copy link
Member

That's definitely a very large model! I think getting that sort of scale working first in dust would be necessary. @thomrawson and I have been implementing a fairly large (but dense) metapopulation model directly in dust2 where the odin syntax was insufficiently flexible and it's going pretty well: https://github.com/mrc-ide/cowflu - see https://github.com/mrc-ide/cowflu/blob/main/inst/dust/cows.cpp for the core implementation. This is the sort of thing that odin generates, and it would be "straightforward" to take something like this and get it going with a sparse matrix + matrix multiplication. Then when we see how that looks and feels and how it performs, actually updating odin would be much more straightforward. Aside from docs, this all works now, and changes to the C++ API are slowing down so the code should be fairly straightforward

@thibautjombart
Copy link
Contributor Author

Many thanks for sharing - super useful! Sorry for the tangential question but: I have seen an 'odin2' repos, so wondering: are you still implementing new features into odin for the foreseable future or are you looking to switch to the new project at some point? Just thinking the timescale for moving forward on this on my end might mean planning it for odin2...

@richfitz
Copy link
Member

odin2 will become odin 1 (I'll replace the code). This will cause a lot of breaking change to the code that runs odin models but that bridge had to be crossed anyway due to the departures taken since the pandemic anyway in odin.dust. The language itself is largely unchanged. See:

@thibautjombart
Copy link
Contributor Author

Very exciting. Sorry if I missed it in the roadmap, but do you have a tentative timeframe for the replacement?

@richfitz
Copy link
Member

As soon as possible, which is unfortunately as long as a piece of string...

@richfitz
Copy link
Member

richfitz commented Nov 7, 2024

https://github.com/tensor-compiler/taco?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants