Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove pre-allocation of outputs for logpdf and pdf + deprecate implementations for multiple samples #1257

Closed
wants to merge 3 commits into from

Conversation

devmotion
Copy link
Member

This PR removes the pre-allocation of output arrays for logpdf and pdf.

There seems little reason to fall back to the in-place methods since it does neither reduce allocations nor lines of code. Instead, I'd argue that the pre-allocation causes problems since the computation of the output type is only a heuristic and it seems easy to come up with examples where it fails since typically logpdf and pdf do not ensure that their output is of this specific type. Additionally, the pre-allocations lead to calls of mutating functions which break AD with e.g. Zygote. Moreover, it is consistent with logpdf and pdf implementations for univariate distributions that do not fall back to the mutating functions but use broadcasting (one could use the same implementation with map and Base.Fix1 there as well) - however, it seems a bit strange that the use of (log)pdf with arrays is deprecated for univariate distributions but not for multi- and matrixvariate distributions.

@@ -230,15 +230,13 @@ end
function logpdf(d::MultivariateDistribution, X::AbstractMatrix)
size(X, 1) == length(d) ||
throw(DimensionMismatch("Inconsistent array dimensions."))
T = promote_type(partype(d), eltype(X))
_logpdf!(Vector{T}(undef, size(X,2)), d, X)
map(i -> _logpdf(d, view(X, :, i)), axes(X, 2))
Copy link
Member Author

@devmotion devmotion Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to just use

Suggested change
map(i -> _logpdf(d, view(X, :, i)), axes(X, 2))
map(Base.Fix1(_logpdf, d), eachcol(X))

here but eachcol requires Julia >= 1.1 or Compat.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. I had forgotten that eachcol/row weren't available in 1.0.

end

function pdf(d::MultivariateDistribution, X::AbstractMatrix)
size(X, 1) == length(d) ||
throw(DimensionMismatch("Inconsistent array dimensions."))
T = promote_type(partype(d), eltype(X))
_pdf!(Vector{T}(undef, size(X,2)), d, X)
map(i -> _pdf(d, view(X, :, i)), axes(X, 2))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, with eachcol one could write

Suggested change
map(i -> _pdf(d, view(X, :, i)), axes(X, 2))
map(Base.Fix1(_pdf, d), eachcol(X))

@devmotion
Copy link
Member Author

Similar changes could (should?) be applied to rand to avoid inconsistencies such as reported in #1252.

@codecov-io
Copy link

codecov-io commented Jan 11, 2021

Codecov Report

Merging #1257 (917d436) into master (3a29bb1) will decrease coverage by 0.20%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1257      +/-   ##
==========================================
- Coverage   81.81%   81.61%   -0.21%     
==========================================
  Files         117      117              
  Lines        6589     6505      -84     
==========================================
- Hits         5391     5309      -82     
+ Misses       1198     1196       -2     
Impacted Files Coverage Δ
src/Distributions.jl 100.00% <ø> (ø)
src/deprecates.jl 0.00% <ø> (ø)
src/multivariate/mvnormal.jl 71.73% <ø> (-1.32%) ⬇️
src/multivariate/mvtdist.jl 68.88% <ø> (+7.66%) ⬆️
src/matrixvariates.jl 100.00% <100.00%> (+8.45%) ⬆️
src/multivariate/mvnormalcanon.jl 80.35% <100.00%> (-2.10%) ⬇️
src/multivariates.jl 75.00% <100.00%> (-6.58%) ⬇️
src/mixtures/mixturemodel.jl 72.83% <0.00%> (-5.77%) ⬇️
src/multivariate/dirichlet.jl 62.36% <0.00%> (-4.15%) ⬇️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3a29bb1...917d436. Read the comment docs.

@devmotion
Copy link
Member Author

I deprecated all (log)pdf(!) methods for multiple samples, as discussed in #1257 (comment). However, maybe it would be better to separate this in a different PR since it implies that the optimization for _logpdf! in some distributions such as MvNormal are not possible anymore (since these functions don't exist anymore/are deprecated).

@devmotion devmotion changed the title No pre-allocation of outputs for logpdf and pdf Remove pre-allocation of outputs for logpdf and pdf + deprecate implementations for multiple samples Jan 13, 2021
@devmotion
Copy link
Member Author

I think this requires some additional discussion - maybe it would be better to revert the deprecations that were suggested by @andreasnoack and just fix the allocating calls for now? What's the general opinion?

Ping @matbesancon @mschauer

@andreasnoack
Copy link
Member

I'd be fine with splitting this into two PRs: one similar to your original version which could then be a bugfix release and then a PR similar to the current version that would require a breaking release but I still think it would be good to deprecate these methods.

@st--
Copy link
Contributor

st-- commented Oct 5, 2021

@devmotion should this be closed now?

@devmotion
Copy link
Member Author

Seems it was addressed in the linked PRs 👍

@devmotion devmotion closed this Oct 5, 2021
@devmotion devmotion deleted the dw/logpdf_pdf branch October 5, 2021 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants