Resampling a Population does not normalize #241

Open
turion opened this issue Jan 10, 2023 · 2 comments

@turion
Collaborator

turion commented Jan 10, 2023

Describe the bug
When applying a resampler, the total evidence stays the same.

To Reproduce

ghci> import Control.Monad.Bayes.Class
ghci> import Control.Monad.Bayes.Population
ghci> import Control.Monad.Bayes.Sampler.Strict
ghci> let model = spawn 1000 >> normal 0 1 >>= (condition . (>= 0))
ghci> sampleIOfixed $ evidence model
0.48900000000000027
ghci> sampleIOfixed $ evidence $ resampleStratified model
0.48900000000000027

Expected behavior

I'm not sure whether this is intended. It seems like it, given this line:

return $ map (,z / fromIntegral n) offsprings

Extra care is taken to make sure that the total mass from before (z) is restored, and no accidental normalization is introduced.
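
To make the role of z concrete, here is a minimal, self-contained sketch of that reweighting step on plain lists. It is not the library's code: `totalMass` and `reweight` are hypothetical names, and the resampler's choice of offspring is taken as given.

```haskell
{-# LANGUAGE TupleSections #-}

-- Hypothetical sketch, not monad-bayes code: particles are a plain list of
-- (value, weight) pairs and the resampler's choice of offspring is given.

-- | Total mass of a weighted population, i.e. its evidence estimate.
totalMass :: [(a, Double)] -> Double
totalMass = sum . map snd

-- | Give each of the n offspring the weight z / n, so that the weights
-- again sum to z, the total mass before resampling.
reweight :: Double -> [a] -> [(a, Double)]
reweight z offsprings = map (,z / fromIntegral n) offsprings
  where
    n = length offsprings

main :: IO ()
main = do
  let before = [('a', 0.125), ('b', 0.375)] -- total mass 0.5
      z = totalMass before
      after = reweight z "bbab" -- offspring picked by some resampler
  print (totalMass before) -- 0.5
  print (totalMass after)  -- 0.5
```

The weights are equalized, but their sum is unchanged, which matches the identical evidence values in the reproduction above.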

There used to be a function normalize, which was removed in 771ce2e by @reubenharry, as part of #142. It was part of the 0.1.1.0 release: https://hackage.haskell.org/package/monad-bayes-0.1.1.0/docs/Control-Monad-Bayes-Population.html#v:normalize

Its haddocks read:

Normalizes the weights in the population so that their sum is 1. This transformation introduces bias.
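
As a rough illustration of what that description amounts to, here is a sketch on a plain list of weighted particles. `normalizeWeights` is a hypothetical name, and this is not the function that was removed from the library.

```haskell
-- Hypothetical sketch on plain lists, not the removed library function.

-- | Rescale the weights so that they sum to 1. The total mass z is
-- discarded; a population with zero total mass is returned unchanged.
normalizeWeights :: [(a, Double)] -> [(a, Double)]
normalizeWeights ps
  | z == 0 = ps
  | otherwise = [(x, w / z) | (x, w) <- ps]
  where
    z = sum (map snd ps)

main :: IO ()
main = do
  let ps = [('a', 1), ('b', 3)] :: [(Char, Double)]
  print (normalizeWeights ps) -- [('a',0.25),('b',0.75)]
```

The relative weights survive, but the evidence estimate carried by the total mass is lost unless it is recorded somewhere else.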

As of master, there are other ways to "normalize" the distribution, but they all require MonadFactor on the inner monad. This is not always feasible. For example, when doing particle filtering (see https://github.com/turion/dunai-bayes/), one needs to condition and resample at every step of the time series. The probability masses of the individual particles then drop at an exponential rate, no matter how much we condition in the inner monad. Very soon, they shrink below 2^(-64) and cannot be resolved by a Double anymore. In these situations, I would have thought that it makes sense to normalize, so that the numbers stay in a range we can calculate with.
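
One way to keep the numbers in range, sketched here under assumptions rather than taken from monad-bayes, is to store log-weights, normalize after every step, and add the removed log-mass to a separate accumulator so that the evidence is not thrown away. All names below (`logSumExp`, `normalizeStep`, `filterLog`) are hypothetical, and the resampling step is left out to keep the sketch short.

```haskell
import Data.List (foldl')

-- Hypothetical sketch, not monad-bayes code: weights are stored as plain
-- log-weights in a list, and the resampling step itself is left out.

-- | Numerically stable log (sum (map exp xs)).
logSumExp :: [Double] -> Double
logSumExp [] = -1 / 0
logSumExp xs
  | isInfinite m = m
  | otherwise = m + log (sum [exp (x - m) | x <- xs])
  where
    m = maximum xs

-- | Normalize log-weights so that the weights sum to 1, and return the
-- removed log-mass alongside them.
normalizeStep :: [Double] -> (Double, [Double])
normalizeStep lws
  | isInfinite lz = (lz, lws)
  | otherwise = (lz, map (subtract lz) lws)
  where
    lz = logSumExp lws

-- | Apply incremental log-weights step by step, normalizing after each
-- step and summing the removed log-mass into a log-evidence accumulator.
filterLog :: [Double] -> [[Double]] -> (Double, [Double])
filterLog initial = foldl' step (0, initial)
  where
    step (acc, lws) incs =
      let (lz, lws') = normalizeStep (zipWith (+) lws incs)
       in (acc + lz, lws')

main :: IO ()
main = do
  let steps = replicate 2000 [log 0.1, log 0.1]
      (logEvidence, finalLws) = filterLog [log 0.5, log 0.5] steps
  print (product (replicate 2000 0.1 :: [Double])) -- underflows to 0.0
  print logEvidence -- roughly 2000 * log 0.1
  print (map exp finalLws) -- roughly [0.5, 0.5]
```

In this sketch the particle weights stay near 0.5 after 2000 steps, while the plain product of the same factors underflows to zero. The accumulated log-evidence holds the information that per-step normalization would otherwise discard.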

What I don't understand are the comments about "bias". In 1d55218, @adscib comments that normalize introduces bias, and I don't understand what that means. In particular, I don't understand whether normalizing at every time step in a particle filter also introduces bias. If so, how do other implementations of particle filters deal with this situation?
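
For context, one standard reading of "bias" in importance sampling is sketched below. This is general background and an assumption about what the haddock meant, not something confirmed in this thread. Assume N particles x_i drawn independently from a proposal q, with weights w_i = γ(x_i) / (N q(x_i)) for an unnormalized target γ whose total mass is Z.

```latex
\begin{align*}
\mathbb{E}\Big[\sum_{i=1}^{N} w_i\Big] &= \int \gamma(x)\,dx = Z, \\
\mathbb{E}\Big[\sum_{i=1}^{N} w_i f(x_i)\Big] &= \int f(x)\,\gamma(x)\,dx = Z\,\mathbb{E}_{\pi}[f],
\qquad \pi = \gamma / Z, \\
\mathbb{E}\bigg[\frac{\sum_{i} w_i f(x_i)}{\sum_{i} w_i}\bigg] &\neq \mathbb{E}_{\pi}[f]
\quad \text{in general for finite } N.
\end{align*}
```

Under these assumptions the unnormalized population is unbiased for the measure γ, whereas dividing by the random total mass gives a ratio estimator that becomes exact only as N grows. Particle filter treatments usually draw the same distinction: the product of per-step weight averages is an unbiased evidence estimate, while the normalized filtering averages are consistent but biased.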

@reubenharry
Contributor

On a statistical level, I also don't understand this comment. I have no objection to putting this function back. Standard caveat: @adscib definitely knew what he was doing when writing this, so I would strongly recommend against any change to e.g. resampleGeneric without total certainty that it won't change the semantics. My guess is that resampleGeneric doesn't normalize because it needs to be a measure-preserving transformation for various algorithms to be correct (e.g. maybe PMMH and RMSMC).

@reubenharry
Contributor

I think I removed it because it wasn't used anywhere in the codebase, which wasn't a very sensible decision.
