
Drop projections to C++ and add cone program refinement #39

Closed
wants to merge 4 commits

Conversation

bamos
Collaborator

@bamos bamos commented Aug 28, 2020

Hi, here's the initial version that drops the projections down to C++ and adds the refinement from this paper. What do you all think?

  • I replaced pi so that it calls into C++ and moved the old Python version to pi_python
  • Everything in tests.py works with the C++ projections
  • I tested the refinement code by comparing its iterates to the numba version here. They consistently match, and the refinement is consistently able to improve the results. I don't currently have any examples/tests in here but can add one. The Eigen-based runtime can be somewhat slow compared to the SCS call, but long-term this could be optimized
  • I moved the pre-computations into the derivative and adjoint_derivative functions to better compare the runtimes here. I also noticed that we can improve the runtime a bit by not returning M back up to Python, so I created _solve_adjoint_derivative_lsqr. We could do the same thing for the derivative, or revert this if you all want to keep the pre-computations in solve
  • Running prof.py from this PR on both the old code and this branch shows that this PR doesn't slow anything down on this example and gives a slight performance boost on the adjoint derivative from solve_adjoint_derivative_lsqr. For a fair comparison, in the "old code" run I also moved the pre-computations into derivative and adjoint_derivative
             forward_time        adjoint_time        adjoint_derivative_time
Old code:    0.1274257183074951  2.0571622371673586  1.5252753496170044
This PR:     0.1262342929840088  2.0498327732086183  1.100463008880615
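For context, the refinement in the linked paper minimizes the norm of the residual map N(z) = Q pi(z) - (pi(z) - z) of the homogeneous embedding. A minimal NumPy sketch of one Gauss-Newton step, assuming (purely for illustration, this is not the actual implementation) that the cone is the nonnegative orthant so pi is a componentwise clamp:

```python
import numpy as np

def residual(z, Q):
    """Residual map N(z) = Q @ pi(z) - (pi(z) - z) of the embedding.
    For illustration the cone is the nonnegative orthant, so pi is a
    componentwise clamp; the real code handles the full cone product."""
    u = np.maximum(z, 0)
    return Q @ u - (u - z)

def refine_step(z, Q):
    """One Gauss-Newton step on ||N(z)||: the derivative of N at z is
    (Q - I) @ dpi(z) + I, where dpi is diagonal 0/1 for the orthant."""
    n = len(z)
    D = np.diag((z > 0).astype(float))
    DN = (Q - np.eye(n)) @ D + np.eye(n)
    step, *_ = np.linalg.lstsq(DN, residual(z, Q), rcond=None)
    return z - step

# A skew-symmetric Q, as in the embedding, and a perturbed point:
Q = np.array([[0.0, 1.0], [-1.0, 0.0]])
z = np.array([1.0, -0.5])
z_new = refine_step(z, Q)
# the residual norm decreases after the step
```

The real refinement applies this iteratively with LSQR on the (matrix-free) derivative rather than a dense least-squares solve.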

@bamos bamos requested review from akshayka and sbarratt August 28, 2020 17:27
Comment on lines +294 to +306
Q = sparse.bmat([
    [None, A.T, np.expand_dims(c, -1)],
[-A, None, np.expand_dims(b, -1)],
[-np.expand_dims(c, -1).T, -np.expand_dims(b, -1).T, None]
])

D_proj_dual_cone = _diffcp.dprojection(v, cones_parsed, True)
if mode == "dense":
Q_dense = Q.todense()
M = _diffcp.M_dense(Q_dense, cones_parsed, u, v, w)
MT = M.T

pi_z = pi(z, cones)
Member
The derivative callable could conceivably be called more than once for a single solve; in such cases, wouldn't recomputing Q, D_proj_dual_cone be wasteful (or have I missed something)? I'd prefer pre-computing these quantities and capturing them via lexical closure as before. Otherwise, lazy computation of these quantities would also be fine (i.e., they're computed on the first call to derivative, and re-used thereafter). I have the same comment for adjoint_derivative.
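A minimal sketch of the closure-based caching being suggested (make_derivative and the toy derivative body here are illustrative, not diffcp's actual API; only the lazy-caching pattern is the point):

```python
import numpy as np
from scipy import sparse

def make_derivative(A, b, c):
    """Return a derivative callable; Q is built lazily on the first
    call and captured via the enclosing scope thereafter."""
    cache = {}

    def derivative(dz):
        if "Q" not in cache:
            # the expensive pre-computation runs once, then is reused
            cache["Q"] = sparse.bmat([
                [None, A.T, np.expand_dims(c, -1)],
                [-A, None, np.expand_dims(b, -1)],
                [-np.expand_dims(c, -1).T, -np.expand_dims(b, -1).T, None],
            ]).tocsr()
        return cache["Q"] @ dz

    return derivative
```

Repeated calls to the returned callable then pay the construction cost only once.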


pi_z = pi(z, cones)

M = _diffcp.M_operator(Q, cones_parsed, u, v, w)
@akshayka akshayka (Member) commented Aug 31, 2020

Are these lines supposed to be in an else clause, accompanying line 301? (They overwrite M and MT from lines 303-304.)

Comment on lines +146 to +157
def pi(x, cones, dual=False):
"""Projects x onto product of cones (or their duals)
Args:
x: NumPy array (with PSD data formatted in SCS convention)
cones: list of (cone name, size)
dual: whether to project onto the dual cone
Returns:
NumPy array that is the projection of `x` onto the (dual) cones
"""

cone_list_cpp = parse_cone_dict_cpp(cones)
return projection(x, cone_list_cpp, dual)
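For reference, what this projection computes for two common cones can be sketched in pure NumPy (the C++ version handles the full SCS cone list, so this is purely illustrative and the function names are made up):

```python
import numpy as np

def pi_orthant(x):
    """Projection onto the nonnegative orthant (SCS's 'l' cone)."""
    return np.maximum(x, 0)

def pi_soc(x):
    """Projection onto the second-order cone {(t, z): ||z||_2 <= t}."""
    t, z = x[0], x[1:]
    norm_z = np.linalg.norm(z)
    if norm_z <= t:             # already inside the cone
        return x.copy()
    if norm_z <= -t:            # in the polar cone: projects to 0
        return np.zeros_like(x)
    alpha = (t + norm_z) / 2    # closest point lies on the boundary
    return np.concatenate(([alpha], (alpha / norm_z) * z))
```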
Member
Cool! Might as well delete pi_python if it's no longer used.

@akshayka (Member)

Thanks for the PR @bamos! It's great that you've moved pi into C++ ... the less time spent in Python (holding the GIL!), the better :)

I did leave a couple comments --- in particular, I'm not sure I understand the motivation for moving the pre-computations for derivative and adjoint_derivative into the returned callables.

I haven't had a chance to review the refinement code. What do you think about splitting this PR into two separate ones --- one that moves pi to C++ (which will be simple for us to review) and one that adds the refinement (which will take longer for us to review)?

@bamos (Collaborator, Author)

bamos commented Sep 2, 2020

I haven't had a chance to review the refinement code. What do you think about splitting this PR into two separate ones --- one that moves pi to C++ (which will be simple for us to review) and one that adds the refinement (which will take longer for us to review).

Yes, sounds good. I'll close this PR and send separate PRs for the projections and the refinement.

I did leave a couple comments --- in particular, I'm not sure I understand the motivation for moving the pre-computations for derivative and adjoint_derivative into the returned callables.

I noticed that not returning M back up to Python at all (and instead calling into the _solve_adjoint_derivative_lsqr I've added here) is significantly faster than returning M and then passing it back down to C++. I moved the other pre-computations just for consistency, but I'll keep this part out of my new PRs. It's still worth looking more closely into this --- I think it may also be a bit faster to never construct M and just pass around A, b, c.
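One way to avoid materializing M at all is to hand LSQR only matrix-vector products (which could live entirely in C++). A sketch of this idea --- solve_adjoint_lsqr is a made-up name, not diffcp's API:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

def solve_adjoint_lsqr(matvec, rmatvec, rhs):
    """Solve M^T x = rhs given only the products v -> M v and
    v -> M^T v, so M itself is never formed or copied to Python."""
    n = rhs.shape[0]
    # the operator passed to lsqr is M^T, so its matvec is rmatvec
    MT = LinearOperator((n, n), matvec=rmatvec, rmatvec=matvec)
    return lsqr(MT, rhs)[0]
```

The same pattern would let the derivative and adjoint derivative work from A, b, c directly without ever constructing M.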

@bamos bamos closed this Sep 2, 2020
@bamos bamos mentioned this pull request Sep 2, 2020