smooth_data interpolates observations and predictions differently #2225

Closed
sethaxen opened this issue Mar 21, 2023 · 3 comments · Fixed by #2300

Comments

@sethaxen
Member

Describe the bug
arviz.stats.stats_utils.smooth_data(y, y_hat) smooths y along the first dimension (data dimension) and y_hat along the second dimension (sample dimension).

To Reproduce
Here's an example where we construct y_hat as identical to y in every draw. While y is successfully interpolated between data points, y_hat is unchanged, since interpolation between identical draws does nothing.

In [1]: import arviz as az

In [2]: import numpy as np

In [3]: import scipy.stats

In [4]: y = scipy.stats.binom.rvs(20, 0.5, size=(10))

In [5]: y_hat = np.tile(y.T, (1000, 1)).T

In [6]: y_interp, y_hat_interp = az.stats.stats_utils.smooth_data(y, y_hat)

In [7]: y
Out[7]: array([11,  6, 10,  6, 11,  8,  8, 12,  8,  9])

In [8]: y_interp
Out[8]: 
array([ 9.42237617,  6.32506153,  9.91551262,  6.02439063, 11.02236472,
        8.03842965,  7.88687051, 11.96915287,  8.35219182,  8.17764022])

In [9]: y_hat
Out[9]: 
array([[11, 11, 11, ..., 11, 11, 11],
       [ 6,  6,  6, ...,  6,  6,  6],
       [10, 10, 10, ..., 10, 10, 10],
       ...,
       [12, 12, 12, ..., 12, 12, 12],
       [ 8,  8,  8, ...,  8,  8,  8],
       [ 9,  9,  9, ...,  9,  9,  9]])

In [10]: y_hat_interp
Out[10]: 
array([[11., 11., 11., ..., 11., 11., 11.],
       [ 6.,  6.,  6., ...,  6.,  6.,  6.],
       [10., 10., 10., ..., 10., 10., 10.],
       ...,
       [12., 12., 12., ..., 12., 12., 12.],
       [ 8.,  8.,  8., ...,  8.,  8.,  8.],
       [ 9.,  9.,  9., ...,  9.,  9.,  9.]])

Expected behavior
I'd expect y_interp and y_hat_interp[:, 0] to be identical.

Additional context
arviz v0.16.0.dev0

@sethaxen
Member Author

@aloctavodia IIRC you implemented this smoothing. Have I correctly identified a bug here, or am I misusing the function?

@OriolAbril
Member

There is this reshape: https://github.com/arviz-devs/arviz/blob/main/arviz/plots/backends/matplotlib/bpvplot.py#L87, which I think makes the first dimension the sample one, so y_hat would be expected with shape (n_samples, n_obs) rather than (n_obs, n_samples).
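To illustrate why the array layout matters here, the following is a minimal sketch, not ArviZ's actual code. It assumes that the predictions are smoothed along axis 1 as described in the report, and it swaps in plain linear interpolation (`np.interp`) for whatever smoother the library uses. The helper `smooth_along_axis` and its evaluation grid are hypothetical names and choices made only for this example.

```python
import numpy as np


def smooth_along_axis(values, axis=-1):
    """Hypothetical stand-in smoother: linearly re-interpolate `values`
    onto a slightly shrunken grid along `axis`."""
    n = values.shape[axis]
    x = np.linspace(0, 1, n)            # original grid positions
    x_new = np.linspace(0.01, 0.99, n)  # assumed evaluation grid
    return np.apply_along_axis(lambda v: np.interp(x_new, x, v), axis, values)


# Observations taken from the report above.
y = np.array([11, 6, 10, 6, 11, 8, 8, 12, 8, 9], dtype=float)

# Layout used in the report: (n_obs, n_samples), so axis 1 is the sample axis.
y_hat_obs_first = np.tile(y, (1000, 1)).T
# Layout suggested in this comment: (n_samples, n_obs), so axis 1 is the data axis.
y_hat_samples_first = np.tile(y, (1000, 1))

y_smooth = smooth_along_axis(y, axis=0)

# Interpolating between identical draws leaves every value unchanged.
unchanged = smooth_along_axis(y_hat_obs_first, axis=1)

# With samples first, each draw is smoothed across data points, like y.
matching = smooth_along_axis(y_hat_samples_first, axis=1)

print(np.allclose(unchanged, y_hat_obs_first))
print(np.allclose(matching, y_smooth))
```

Under these assumptions, the behaviour in the report would follow from passing y_hat as (n_obs, n_samples); transposing it so that the sample dimension comes first makes every draw smooth the same way as y.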

@OriolAbril
Member

I am updating the docstring to include the expected shape of each argument.
