
Add MultivariateNormalDiag distribution. #208

Open · wants to merge 2 commits into master from multivariate_normal_diag
Conversation

SiegeLord
Copy link

@SiegeLord commented Mar 10, 2024

This is an extremely common special case of the MultivariateNormal distribution, owing to its efficient sampling and log-probability computations.
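To illustrate the efficiency claim, here is a minimal, hypothetical sketch (plain slices rather than the statrs/nalgebra types the PR actually uses): with a diagonal covariance the log-density factorizes per dimension, so it costs O(n) instead of the O(n²) matrix-vector work a dense covariance requires.

```rust
use std::f64::consts::PI;

/// Log-density of a multivariate normal with diagonal covariance,
/// parameterized by per-dimension means `mu` and standard deviations
/// `sigma`. Each coordinate contributes an independent univariate
/// normal term, so the whole computation is a single O(n) pass.
fn mvn_diag_ln_pdf(x: &[f64], mu: &[f64], sigma: &[f64]) -> f64 {
    x.iter()
        .zip(mu)
        .zip(sigma)
        .map(|((&xi, &mi), &si)| {
            let z = (xi - mi) / si;
            // ln of the univariate normal density at xi:
            // -0.5 * (ln(2*pi*sigma^2) + z^2)
            -0.5 * ((2.0 * PI).ln() + 2.0 * si.ln() + z * z)
        })
        .sum()
}
```

Sampling gets the same benefit: drawing n standard normals and scaling each by its own standard deviation avoids the Cholesky factorization and matrix-vector product needed in the dense case.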

@SiegeLord force-pushed the multivariate_normal_diag branch from 1a81244 to e23531f on March 10, 2024 at 05:31
@YeungOnion
Copy link
Contributor

I see thorough work here, and I really appreciate the thorough tests, but I'm unsure about accepting this PR. If, for any reason, rand makes a change that would require us to modify Normal and MultivariateNormal, we'd have more places to update. (It makes me think I should look more deeply into how we've implemented MultivariateNormal.)

As I see it, your feature set could wrap distribution::Normal so that the only explicit reliance on the variables being uncorrelated, normal, or both would be:

  • entropy, which is additive for uncorrelated variables
  • variance, which is defined per coordinate; with uncorrelated variables there is no covariance to consider

Please let me know if this is incorrect or if I'm missing something important.

But I see both of those items as consequences of the variables being independent and identically distributed (i.i.d.). Perhaps there's an ergonomic solution to cover i.i.d. variables from a distribution D?

Note: An ergonomic way to support different distributions over uncorrelated variables does not yet seem worthwhile to me, but I would welcome input here as well.
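The entropy point above can be made concrete with a small, hypothetical sketch (plain f64 math, not the statrs API): for uncorrelated coordinates the joint differential entropy is the sum of the per-coordinate entropies, so a diagonal MVN could delegate entirely to the univariate formula.

```rust
use std::f64::consts::{E, PI};

/// Differential entropy of a univariate normal N(mu, sigma^2):
/// 0.5 * ln(2 * pi * e * sigma^2). It does not depend on the mean.
fn normal_entropy(sigma: f64) -> f64 {
    0.5 * (2.0 * PI * E * sigma * sigma).ln()
}

/// For uncorrelated (here, independent normal) coordinates the joint
/// entropy is additive, so the diagonal MVN entropy is just a sum.
fn mvn_diag_entropy(sigmas: &[f64]) -> f64 {
    sigmas.iter().map(|&s| normal_entropy(s)).sum()
}
```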

@SiegeLord
Copy link
Author

I see thorough work here, and I really appreciate the thorough tests, but I'm unsure about accepting this PR. If, for any reason, rand makes a change that would require us to modify Normal and MultivariateNormal, we'd have more places to update. (It makes me think I should look more deeply into how we've implemented MultivariateNormal.)

That seems highly unlikely to happen, given the widespread usage of rand.

Usually, libraries reduce the duplication by abstracting over the square root of the covariance matrix (be it a Cholesky factor or the diagonal scale), e.g. MultivariateNormalLinearOperator from TensorFlow Probability and Covariance from scipy. The rv crate provides a few covariance constructors on its MVN, although they don't take advantage of the speed benefits. That could certainly be an option here, but it would have to be fairly ad hoc, since nalgebra gives us no help (it has no type that subsumes linear operators like this). Something like fn new(mu: DVector<f64>, cov: impl Covariance) could probably be made to work.

A wrapper distribution over independent distributions is a pretty common component of statistics toolkits (e.g. Independent from TensorFlow Probability, ProductDistribution from rv), but I don't think Rust is a flexible enough language to provide an idiomatic or sufficiently efficient API for it. The approach taken by rv is particularly unfortunate, as it forces the construction of the components onto the user, rather than accepting vector-valued parameters for the component distributions. I don't think it's the right approach if efficiency is at all a goal for statrs.

@YeungOnion
Copy link
Contributor

YeungOnion commented Mar 23, 2024

Your first point alone,

That seems highly unlikely to happen, given the widespread usage of rand.

in tandem with the significant specialization available for independent variables (which I really hadn't considered), is enough to dispel my concern.

I like the ideas you mention about how to allow for specialization of multidimensional distributions. Seems like a good idea for another PR.


EDIT: Removed the remarks at the bottom about when to introduce this "breaking change". I must have been confusing this with another PR, as this is not a breaking change.

@YeungOnion
Copy link
Contributor

Took another look at this. I also think that uncorrelated multivariate normal variables are common enough to be useful, but would it make sense to have an API distinct from the existing MVN? That would let a user opt in or out of the optimizations; is that intended?

I'm looking at the scipy implementation that you referenced, and it's close to what I'm thinking. What methods would a Covariance trait need in order to allow different pdf and sampling behavior depending on the representation?

@SiegeLord
Copy link
Author

SiegeLord commented Apr 6, 2024

Something like this would be a good start

trait Covariance {
    type V;
    type M;

    /// sqrt(cov) @ vec
    fn forward(&self, vec: &Self::V) -> Self::V;

    /// inv(sqrt(cov)) @ vec
    fn inverse(&self, vec: &Self::V) -> Self::V;

    /// diag_part(cov)
    fn diag(&self) -> Self::V;

    /// dense(cov)
    fn dense(&self) -> Self::M;

    /// abs(det(cov))
    fn abs_determinant(&self) -> f64;

    /// ln(abs(det(cov)))
    fn ln_abs_determinant(&self) -> f64;
}

That would let you implement efficient sampling and log-probability, and also compute the variance vector and the (dense) covariance matrix. As with the current MVN implementation, the return values of many of these methods can be pre-computed at construction time.

For the forward/inverse methods, TensorFlow Probability calls them matvec/solve, and scipy calls them colorize/whiten.
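As a concrete illustration, here is a minimal, self-contained sketch of how such a trait could be implemented for the diagonal case. Plain Vec<f64> stands in for nalgebra's DVector/DMatrix, the trait is restated (with Self::-qualified associated types) so the block compiles on its own, and DiagCovariance is a hypothetical name, not part of statrs:

```rust
trait Covariance {
    type V;
    type M;
    fn forward(&self, vec: &Self::V) -> Self::V;
    fn inverse(&self, vec: &Self::V) -> Self::V;
    fn diag(&self) -> Self::V;
    fn dense(&self) -> Self::M;
    fn abs_determinant(&self) -> f64;
    fn ln_abs_determinant(&self) -> f64;
}

/// Diagonal covariance stored as per-dimension variances.
struct DiagCovariance {
    variances: Vec<f64>,
}

impl Covariance for DiagCovariance {
    type V = Vec<f64>;
    type M = Vec<Vec<f64>>;

    /// sqrt(cov) @ vec: elementwise scale by the standard deviations,
    /// the diagonal analogue of multiplying by a Cholesky factor.
    fn forward(&self, vec: &Vec<f64>) -> Vec<f64> {
        vec.iter()
            .zip(&self.variances)
            .map(|(v, s2)| v * s2.sqrt())
            .collect()
    }

    /// inv(sqrt(cov)) @ vec: elementwise divide by the standard deviations.
    fn inverse(&self, vec: &Vec<f64>) -> Vec<f64> {
        vec.iter()
            .zip(&self.variances)
            .map(|(v, s2)| v / s2.sqrt())
            .collect()
    }

    /// diag_part(cov) is the variance vector itself.
    fn diag(&self) -> Vec<f64> {
        self.variances.clone()
    }

    /// dense(cov): materialize the full matrix, zeros off the diagonal.
    fn dense(&self) -> Vec<Vec<f64>> {
        let n = self.variances.len();
        (0..n)
            .map(|i| {
                (0..n)
                    .map(|j| if i == j { self.variances[i] } else { 0.0 })
                    .collect()
            })
            .collect()
    }

    /// det of a diagonal matrix is the product of its diagonal.
    fn abs_determinant(&self) -> f64 {
        self.variances.iter().product()
    }

    /// Summing logs avoids overflow/underflow in high dimensions.
    fn ln_abs_determinant(&self) -> f64 {
        self.variances.iter().map(|s2| s2.ln()).sum()
    }
}
```

Storing variances (rather than standard deviations) makes diag() trivial; a Cholesky-backed implementation of the same trait would do a triangular matrix-vector product in forward and a triangular solve in inverse.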

@YeungOnion
Copy link
Contributor

This looks really good! Would you be willing to implement it to support both diagonal and correlated covariances for the MVN?
