Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests and examples for the complex relative entropy cone #831

Merged
merged 23 commits into from
Feb 16, 2024
Merged

tests and examples for the complex relative entropy cone #831

merged 23 commits into from
Feb 16, 2024

Conversation

araujoms
Copy link
Contributor

@araujoms araujoms commented Dec 18, 2023

I got the tests to pass and the examples to run with the new cone (I get an error with "contraction", "semidefinitepoly", and "convexityparameter" but they are unrelated).

  1. I have to apologize for the hack in arrayutilities.jl. I couldn't figure out how to get the functions smat_to_svec! and svec_to_smat! to dispatch correctly with complex Hermitian JuMP variables. I hope you can tell me how to do that so I can fix it.
  2. I couldn't figure out what you were trying to do in the example "entanglementassisted", but it was giving the wrong answer anyway, so I rewrote it to something I do understand.
  3. Now JuMP accepts models with arbitrary types, so I modified these three examples accordingly. I couldn't figure out, however, how to get them to actually run. I assume for you this will be trivial.
  4. I've added the asymptotic expansion for the central points; it's probably useless but nice to look at.
  5. I'm rather puzzled about the interface of the cone: to get the relative entropy D(W||V) I have to call the cone with the variables in the order V, W. Is there any reason for doing that? Do you mind if I change it? I wasted a lot of time debugging until I found that this was the error. Also, the order was wrong in the example "entanglementassisted". Another thing: instead of passing the dimension of the matrix as an argument, as in the PSD cone, we need to pass the total number of real parameters. Isn't that a bit inconvenient? Do you mind if I change it?

@chriscoey chriscoey self-requested a review December 19, 2023 00:29
@lkapelevich
Copy link
Collaborator

I can give a preliminary pass over your questions but Chris would need to add his input.

  1. Not sure about the best way. Someone else can chime in. The issue is
    Matrix{GenericAffExpr{ComplexF64, VariableRef}} <: AbstractMatrix{<:Complex} ->false.

  2. The entanglement assisted example was supposed to be a translation of https://github.com/hfawzi/cvxquad/blob/master/quantum_cond_entr.m
    I see you changed one of the cones from EpiPerSepSpectralCone to EpiTrRelEntropyTriCone, which shouldn't be needed (that cone was modeling the epigraph of tr(W*log(W))). Good catch for the order in the quantum relative entropy cone being incorrect. Was this possibly the only issue causing the wrong answer you mentioned?

  3. The easiest way to test an example is to comment out the examples you don't want in examples/Examples.jl and then run examples/runexamplestests.jl

  4. Nice

  5. The reason for v-before-w is that Hypatia's order of variables is epigraph, perspective, then the rest of the variables.
    So for the vector relative entropy cone, (u, v, w) follows this rule. We interpreted the variables in the matrix cone as the
    matrix analogy. All the cones accept the total dimension as "dim", so this is a consistent interface and matches MOI. Modeling packages can decide on what is "user-friendly".

@araujoms
Copy link
Contributor Author

Thanks for the answers!

  1. In the meanwhile I have read arXiv:2103.04104 so now I understand what you were doing. Indeed, fixing the order of the arguments was enough to make it give the right answer. Surprisingly, it turned out that computing the von Neumann entropy with EpiTrRelEntropyTriCone was slightly faster than with EpiPerSepSpectralCone, but I changed it back anyway.
  2. I'm sorry, I was unclear. I know how to run the examples, what I couldn't figure out is how to pass generic types as arguments to the JuMP examples through your interface; right now the JuMP examples are only run with Float64.
  3. I don't see how v is a perspective in the vector relative entropy cone; maybe you meant the logarithm cone instead? Also, in equation (60) of coneref.pdf you say that you are interleaving the elements of the vectors v and w, is this actually true? As for the dimension, indeed, I was confusing the MOI version of the PSD cone with your version.

@araujoms
Copy link
Contributor Author

Could you rebase this branch on the current master, so that it works with julia 1.10?

@lkapelevich
Copy link
Collaborator

you say that you are interleaving the elements of the vectors v and w, is this actually true

You're right, I already forgot about that :)

Could you rebase this branch on the current master, so that it works with julia 1.10?

It should be OK to merge from master

@araujoms
Copy link
Contributor Author

🤦 Yes, of course, I can just rebase it myself, I don't know what I was thinking.

@chriscoey
Copy link
Collaborator

Thanks for this, and sorry I've been too busy to review. Let's let @odow merge #832 and #833, then update this branch and push again and see if we can fix the failing tests.

@chriscoey
Copy link
Collaborator

should be ok to merge in master again for the 1.10 fixes. you will have to Pkg add JuliaFormatter before dev'ing Hypatia and then run format(".") in the root of the Hypatia repo, in order to fix any formatting issues and get the FormatCheck to pass. then push and we'll see what's going wrong with the examples tests.

@chriscoey
Copy link
Collaborator

also i'm in the process of moving the repo ownership over to the jump-dev github organization, so that it can be a bit better maintained!

@araujoms
Copy link
Contributor Author

araujoms commented Feb 1, 2024

I understand, life after academia is tough. I've merged in master and ran JuliaFormatter, let's see how it goes now.

@chriscoey
Copy link
Collaborator

thank you. the format checks now pass. the failing examples seem to just be related to the sparse PSD cone. I'll have a look into those failures later today.

@chriscoey chriscoey changed the base branch from complexquantum to master February 1, 2024 20:33
Copy link

codecov bot commented Feb 1, 2024

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (60b01bd) 96.86% compared to head (1cf1944) 96.83%.

Files Patch % Lines
src/Cones/epirelentropy.jl 25.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #831      +/-   ##
==========================================
- Coverage   96.86%   96.83%   -0.04%     
==========================================
  Files          56       56              
  Lines        8976     9058      +82     
==========================================
+ Hits         8695     8771      +76     
- Misses        281      287       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@chriscoey
Copy link
Collaborator

I only see one complexquantum branch, this one. and the target of this PR is master:
image

@araujoms
Copy link
Contributor Author

araujoms commented Feb 7, 2024

Oh. Somewhere a mess was made, but there is still a complexquantum branch in the Hypatia repository, here. I should have named the branch in my fork differently.

@chriscoey
Copy link
Collaborator

right but that branch is this PR 831's branch:
image

@chriscoey
Copy link
Collaborator

Anyway, I think we should merge this as-is. Our default is to add a hessian product when we can do so. It has helped for numerics, and in some cases, speed. It has also caused some minor speed regressions in cases where numerics were already fine, but that hasn't stopped us in the past.

@araujoms
Copy link
Contributor Author

araujoms commented Feb 7, 2024

No, this PR's branch lives in my fork. The latest commit in Hypatia's complexquantum branch is aaf6669, whereas the latest commit in this PR is f08bfbf.

In any case, I'm ok with merging, I can easily disable the product oracle in my code to do more careful benchmarks, and perhaps find what is causing the slowdown.

I just wanted your opinion on the first question I asked, is there a way to get rid of my hack and get smat_to_svec! to dispatch correctly on Hermitian JuMP variables?

Also, there is another puzzling behaviour that I have observed, at some point I inadvertently added a redundant PSD constraint for one of the Hermitian matrices going into the relative entropy cone. To my surprise this did not make the solver go crazy, but instead made it faster.

@araujoms
Copy link
Contributor Author

araujoms commented Feb 7, 2024

Completely unrelated: I think equation (60) of coneref.pdf is incorrect, I checked the code and it's not interleaving the v and w vectors.

@araujoms
Copy link
Contributor Author

araujoms commented Feb 8, 2024

More careful benchmarking shows that the product oracle is innocent. The great speedup I was seeing in the easy instances was due to changing Julia from 1.9 to 1.10. Now testing everything on 1.9, I get pretty much the same time there.

dimension aaf6669 dd9978b current
0.04 0.04 0.01
0.3 0.3 0.3
1.1 1.1 1.2
2.3 5.8 5.7
27.5 27.0 27.7
40 74 45
116 194 208
359 660 638

So we see that there was a regression somewhere between aaf6669 and dd9978b making the harder instances drastically slower, whereas the product oracle made the situation a little bit better.

@chriscoey
Copy link
Collaborator

@araujoms thanks for the concrete analysis. could you try to narrow it down to the commit(s) that cause the performance degradation you see?

@araujoms
Copy link
Contributor Author

araujoms commented Feb 8, 2024

I did a binary search and found the offending commit: 676b055. That's where I added the asymptotic expansion for the central point, which delivered a closer approximation to it.

I am very confused. I was certain that there would be no noticeable difference in performance, as the nonlinear fit you were using already delivered a good approximation, and the difference from mine was only about 10^-4. If anything I would expect the performance to be slightly better, not 2 times worse!

I've now tested providing an even better approximation, correct up to 10^-6. It did improve the performance with respect to my asymptotic expansion, but it was still significantly slower than your "bad" approximation.

I don't know what to do. If simply providing the best central point gave the optimal performance I could just generate the central points for dimensions 2 to 1000 and add that as a file, leaving the rest to the asymptotic expansion. But since this is not the case, how on Earth can I find the central point that gives the optimal performance!?

@chriscoey
Copy link
Collaborator

That change you made is to the epirelentropy cone, not epitrrelentropytri. Was that intentional? I wasn't aware that you were using both cones.

@araujoms
Copy link
Contributor Author

araujoms commented Feb 8, 2024

The central points of the matrix relative entropy cone are a simple function of the central points of the vector relative entropy cone, so the code just calls them from there. I don't use the actual vector relative entropy cone for anything.

@chriscoey
Copy link
Collaborator

Got it, right. Sorry I'm on mobile so can't look more closely.

I think this situation is just a fluke. Something about your specific examples seems to make the old central point value work slightly better for larger cases. This kind of thing happens... Really we need a broad diversity of examples generated from many different problems to show that certain changes to Hypatia help or hinder performance on average. Lacking that, we just use simple math heuristics and some guesswork.

I like that you derived asymptotic values for the central point (though I didn't verify them), and you can decide whether to leave them in the PR. Maybe you just want to increase the dimension threshold to something higher. In any case, modulo what you decide there, I think this PR is done.

@araujoms
Copy link
Contributor Author

araujoms commented Feb 8, 2024

You're right, I've tested with a couple other problems, and the performance does indeed vary, sometimes it's better with a closer approximation to the central point, sometimes it's not.

Since the performance depends so critically on the central point I'd like to add the solutions for the first 1000 dimensions. I'm unsure about what to do about the file IO; normally I'd use JLD2, but I assume you're not keen on adding another dependency to Hypatia.

@chriscoey
Copy link
Collaborator

chriscoey commented Feb 8, 2024

Our experience is that performance impacts of central point precision are fairly random, as long as it's somewhat close, hence why we used pretty simple approximations. I'm not keen for this cone to have 1000 points, and I don't think it's going to make any noticeable difference in general, so long as the approximations we have already are within the vague neighborhood of the central point. If you want, for your paper, you could experiment with that and not merge it. But I'd like to keep the central points that we have in Hypatia simple and consistent across the different cones.

@araujoms
Copy link
Contributor Author

araujoms commented Feb 8, 2024

How about 100 points then? I could just save the literals in epirelentropy.jl, no file IO needed.

@chriscoey
Copy link
Collaborator

Sorry to be a pain but we don't do that level of precision for other cones - we settled on the simple step-wise approximations that we use because getting more precise than that wasn't really helpful on our benchmarks in general, and allows us to keep the code concise and maintainable. Even if you could show convincing evidence that actually hard coding 100 points is better on a variety of benchmarks from several different applications, I'd still be a littttttle reluctant. Why not just go back to our original initial points that we used before that commit that made things slower?

@chriscoey
Copy link
Collaborator

Also the initial point should (I hope) be user-specifiable without modifying Hypatia source code - so you could drop down to that in the experiments you are running without needing to fork Hypatia.

@araujoms
Copy link
Contributor Author

araujoms commented Feb 8, 2024

Fine, I just set the asymptotic expansion to kick in after dimension 300, where your nonlinear fit really breaks down. It is of little relevance in practice as dimension 300 is hardly solvable.

@chriscoey
Copy link
Collaborator

@lkapelevich @odow @araujoms would you like to review this PR before merge?

@odow
Copy link
Member

odow commented Feb 16, 2024

I don't think I understand the code enough to competently review.

@araujoms
Copy link
Contributor Author

Not before the merge. My PhD student will go over it and hopefully write the inverse Hessian oracle, but that will still take some time.

@lkapelevich
Copy link
Collaborator

Nothing from me

@chriscoey chriscoey merged commit d37c005 into jump-dev:master Feb 16, 2024
6 of 7 checks passed
@chriscoey
Copy link
Collaborator

Merged. I'll tag today

@chriscoey
Copy link
Collaborator

Thanks everyone!

@chriscoey
Copy link
Collaborator

New version 0.8.0 is tagged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants