Implement the generalized normal distribution #3133

jachymb · 2024-12-07T21:47:22Z

Description

Implement Generalized normal distribution https://en.wikipedia.org/wiki/Generalized_normal_distribution

Why this is useful?

It generalizes the normal (for p=2) and double exponential (Laplace) distribution (for p=1) and in the limit case also the uniform (p → ∞) for the shape parameter p.

These correspond to the $L_p$ norms when used for regularization and in other cases.

For examples:

y ~ normal(X*beta, sigma);

produces the criterion minimize $L_2$ norm of (X*beta-y) (a.k.a. least squares). But then

y ~ double_exponential(X*beta, sigma);

produces the criterion minimize $L_1$ norm of (X*beta-y). (a.k.a. least absolute deviations)

Similarly, in the Bayesian interpretation of ridge and LASSO,

beta ~ normal(0, lambda);
y ~ normal(X*beta, sigma);

produces the $L_2$ regularized ridge and

beta ~ double_exponential(0, lambda);
y ~ normal(X*beta, sigma);

produces the $L_1$ regularized LASSO.

Using the Generalized normal distribution would allow to conveniently use an arbitrary $L_p$ norm for the optimization criterion, or to even find the suitable value of p when used as a parameter.

jachymb · 2025-03-01T13:17:29Z

Just some notes on the corresponding formulas:

We have the density:

$f(y|\mu,\alpha, \beta) = \frac{\beta}{2\alpha \Gamma(\beta^{-1})} \exp\left(-\left(\frac{|y-\mu|}{\alpha}\right)^{\beta}\right)$

The log-density:

$\log f(y|\mu,\alpha, \beta) = \log(\beta) - \log\Gamma(\beta^{-1}) - \log(2) - \log(\alpha) -\left(\frac{|y-\mu|}{\alpha}\right)^{\beta}$

And the parital derivatives:

$\frac{\partial \log f(y|\mu,\alpha, \beta)}{\partial y} = \beta \alpha^{-\beta}|\mu-y|(\mu-y)^{-1}$

$\frac{\partial \log f(y|\mu,\alpha, \beta)}{\partial \mu} = \beta \alpha^{-\beta}|\mu-y|^{\beta}(y-\mu)^{-1} = -\frac{\partial \log f(y|\mu,\alpha, \beta)}{\partial y}$

$\frac{\partial \log f(y|\mu,\alpha, \beta)}{\partial \alpha} = \alpha^{-1}\left(\beta \alpha^{-\beta}|\mu-y|^{\beta} - 1\right)$

$\frac{\partial \log f(y|\mu,\alpha, \beta)}{\partial \beta} = \alpha^{-\beta}|\mu-y|^{\beta} \log(|\mu-y|^{-1}\alpha) + \beta^{-1} + \mathrm{digamma}(\beta^{-1})$

We may factor the following subexpressions to simplify computations:

$t_1 = \alpha^{-1}$
$t_2 = \beta^{-1}$
$t_3 = y - \mu$
$t_4 = |t_3|$
$t_5 = (t_1 t_4)^{\beta}$
$t_6 = \beta t_5$

We can then express the partials as:

$\frac{\partial \log f(y|\mu,\alpha, \beta)}{\partial y} = -\frac{t_6}{t_3}$

$\frac{\partial \log f(y|\mu,\alpha, \beta)}{\partial \mu} = \frac{t_6}{t_3}$

$\frac{\partial \log f(y|\mu,\alpha, \beta)}{\partial \alpha} = t_1(t_6 -1) = t_1 t_6 - t_1$

$\frac{\partial \log f(y|\mu,\alpha, \beta)}{\partial \beta} = t_5 \log\left(t_1 t_4\right) + t_2 + \mathrm{digamma}(t_2)$

SteveBronder · 2025-03-05T19:21:58Z

I'm putting this into wolfram alpha and getting different gradient results. Can you double check these results?

wolfram link

jachymb · 2025-03-05T20:02:42Z

The formulas above have some mistakes that I noticed later, also using wolfram. The partials in the source code I believe to be correct though. Shoud I elaborate more here in the comments? Dne st 5. 3. 2025 20:22 uživatel Steve Bronder ***@***.***> napsal:

…

I'm putting this into wolfram alpha and getting different gradient results. Can you double check these results? wolfram link <https://www.wolframalpha.com/input?i2d=true&i=Assuming%5C%2891%29%CE%B1+%3E+0+%26%26+%CE%B2+%3E+0%5C%2844%29+D%5Blog%5C%2840%29Divide%5B%CE%B2%2C2*%CE%B1*gamma%5C%2840%29Divide%5B1%2C%CE%B2%5D%5C%2841%29%5D*exp%5C%2840%29-Power%5B%5C%2840%29Divide%5B%7Cy-%CE%BC%7C%2C%CE%B1%5D%5C%2841%29%2C%CE%B2%5D%5C%2841%29%5C%2841%29%2Cy%5D%5C%2893%29> — Reply to this email directly, view it on GitHub <#3133 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAMUCQN66BDNHF7YGTQ5N2L2S5FGXAVCNFSM6AAAAABTGSHILWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDOMBRHA3DAOBZGY> . You are receiving this because you authored the thread.Message ID: ***@***.***> [image: SteveBronder]*SteveBronder* left a comment (stan-dev/math#3133) <#3133 (comment)> I'm putting this into wolfram alpha and getting different gradient results. Can you double check these results? wolfram link <https://www.wolframalpha.com/input?i2d=true&i=Assuming%5C%2891%29%CE%B1+%3E+0+%26%26+%CE%B2+%3E+0%5C%2844%29+D%5Blog%5C%2840%29Divide%5B%CE%B2%2C2*%CE%B1*gamma%5C%2840%29Divide%5B1%2C%CE%B2%5D%5C%2841%29%5D*exp%5C%2840%29-Power%5B%5C%2840%29Divide%5B%7Cy-%CE%BC%7C%2C%CE%B1%5D%5C%2841%29%2C%CE%B2%5D%5C%2841%29%5C%2841%29%2Cy%5D%5C%2893%29> — Reply to this email directly, view it on GitHub <#3133 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAMUCQN66BDNHF7YGTQ5N2L2S5FGXAVCNFSM6AAAAABTGSHILWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDOMBRHA3DAOBZGY> . You are receiving this because you authored the thread.Message ID: ***@***.***>

jachymb · 2025-03-10T10:51:30Z

I posted the corrected formulas in the #3157 PR commentary

jachymb mentioned this issue Mar 4, 2025

Generalized normal distribution #3157

Open

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement the generalized normal distribution #3133

Implement the generalized normal distribution #3133

jachymb commented Dec 7, 2024

jachymb commented Mar 1, 2025 •

edited

Loading

SteveBronder commented Mar 5, 2025

jachymb commented Mar 5, 2025 via email

jachymb commented Mar 10, 2025

Implement the generalized normal distribution #3133

Implement the generalized normal distribution #3133

Comments

jachymb commented Dec 7, 2024

Description

Why this is useful?

jachymb commented Mar 1, 2025 • edited Loading

SteveBronder commented Mar 5, 2025

jachymb commented Mar 5, 2025 via email

jachymb commented Mar 10, 2025

jachymb commented Mar 1, 2025 •

edited

Loading