Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DNMY: allow array in nonlinear expressions #3451

Closed
wants to merge 2 commits into from
Closed

Conversation

blegat
Copy link
Member

@blegat blegat commented Aug 15, 2023

This PR is a proof of concept to show the possibility of storing arrays in the JuMP nonlinear expression and MOI nonlinear function.
Of course, it won't work with the AD but it can be supported by a solver or by a Disciplined Convex Programming transformation.

julia> model = Model();

julia> @variable(model, X[1:3, 1:3], Symmetric)
3×3 LinearAlgebra.Symmetric{VariableRef, Matrix{VariableRef}}:
 X[1,1]  X[1,2]  X[1,3]
 X[1,2]  X[2,2]  X[2,3]
 X[1,3]  X[2,3]  X[3,3]

julia> using LinearAlgebra

julia> @constraint(model, log(det(X)) >= 0)
log(det(VariableRef[X[1,1] X[1,2] X[1,3]; X[1,2] X[2,2] X[2,3]; X[1,3] X[2,3] X[3,3]])) - 0.0  0

@codecov
Copy link

codecov bot commented Aug 15, 2023

Codecov Report

Patch coverage: 96.69% and project coverage change: -0.22% ⚠️

Comparison is base (a573de2) 98.09% compared to head (7439e4c) 97.87%.

❗ Current head 7439e4c differs from pull request most recent head 0d39e70. Consider uploading reports for the commit 0d39e70 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3451      +/-   ##
==========================================
- Coverage   98.09%   97.87%   -0.22%     
==========================================
  Files          37       37              
  Lines        5501     5418      -83     
==========================================
- Hits         5396     5303      -93     
- Misses        105      115      +10     
Files Changed Coverage Δ
src/JuMP.jl 94.87% <ø> (ø)
src/mutable_arithmetics.jl 89.18% <75.00%> (-2.71%) ⬇️
src/optimizer_interface.jl 96.64% <90.00%> (-0.38%) ⬇️
src/nlp_expr.jl 97.31% <97.31%> (-1.94%) ⬇️
src/macros.jl 97.98% <97.91%> (+<0.01%) ⬆️
src/aff_expr.jl 97.32% <100.00%> (ø)
src/complement.jl 100.00% <100.00%> (ø)
src/operators.jl 96.84% <100.00%> (ø)
src/quad_expr.jl 100.00% <100.00%> (ø)
src/variables.jl 98.97% <100.00%> (-0.04%) ⬇️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@odow
Copy link
Member

odow commented Aug 15, 2023

Some things need to be worked through:

  • How can solvers declare at the JuMP level what operations they support
  • How do we avoid type piracy without requiring JuMP to implement the full DCP atoms
  • How do we avoid overloading a method that could otherwise be traced

One thing to keep in mind is that we no longer parse Symbol. Every call is an overloaded Julia function. So we can't make up :quad_over_lin without making function quad_over_lin(x, y) end.

One option that @chriscoey and I discussed is that something like det(X) is not LinearAlgebra.det(X), but DCP.det(X), which creates NonlinearExpr(:dcp_det, vec(X)) or NonlinearExpr(:dcp_det, Any[X]), but is very clearly not supported by the AD systems.

@blegat
Copy link
Member Author

blegat commented Aug 16, 2023

How can solvers declare at the JuMP level what operations they support

Why would they need to do that ? The function we overload are going to fail anyway so we might as well add them in the expression tree and we'll let the solver throw an error if it does not support what it is getting.

How do we avoid type piracy without requiring JuMP to implement the full DCP atoms

DCP would be in MOI so we can do like:
https://github.com/jump-dev/JuMP.jl/pull/3106/files#diff-38284a9c301dbce5b78214aec3ec0172987aec8d7a85770c5a75c6f801fabee0R274

How do we avoid overloading a method that could otherwise be traced

What do you mean by "traced" ?

One thing to keep in mind is that we no longer parse Symbol. Every call is an overloaded Julia function. So we can't make up :quad_over_lin without making function quad_over_lin(x, y) end.

Yes, we would need to define functions. When they get numbers they return a number but if they get JuMP expressions, they return an NonlinearExpr.

One option that @chriscoey and I discussed is that something like det(X) is not LinearAlgebra.det(X), but DCP.det(X), which creates NonlinearExpr(:dcp_det, vec(X)) or NonlinearExpr(:dcp_det, Any[X]), but is very clearly not supported by the AD systems.

But then if someone calls LinearAlgebra.det, it won't work. Not sure I like that

@odow
Copy link
Member

odow commented Aug 16, 2023

I just worry that we're going to end in a situation where it's not obvious what functions can be used with which solver. I'd view DCP as a exploratory research project. It's not obvious that the first approach will be the right one.

@blegat
Copy link
Member Author

blegat commented Aug 16, 2023

I just worry that we're going to end in a situation where it's not obvious what functions can be used with which solver.

Still not sure what you mean. Could you give an example ? Is this a concern for a function that could be written in a more elementary form ? For example, norm would be rewritten into sqrt(prod(...)) while DCP would prefer to keep it as a norm ?

@odow
Copy link
Member

odow commented Aug 16, 2023

Why would they need to do that ?

How does the user find out what functions they can call for DCP?

DCP would be in MOI

I did not assume this. To begin with, it should be a JuMP extension or an MOI solver.

What do you mean by "traced" ?

Evaluating the function returns the symbolic expression. If f(x) = sin(x)^2, then calling f(x::VariableRef) traces the function.

But then if someone calls LinearAlgebra.det, it won't work. Not sure I like that

I think it's easier to keep all the DCP atoms in a strict namespace, rather than playing the game of overloading various methods from different places.

@odow
Copy link
Member

odow commented Aug 16, 2023

My takeaway is that this is a research question. Not a solved problem requiring an implementation.

@blegat
Copy link
Member Author

blegat commented Aug 16, 2023

How does the user find out what functions they can call for DCP?

They read the doc or get an error. That's the current status of JuMP's AD as well.

I did not assume this. To begin with, it should be a JuMP extension or an MOI solver.

Then type piracy to begin with

Evaluating the function returns the symbolic expression. If f(x) = sin(x)^2, then calling f(x::VariableRef) traces the function.

I guess norm is a good example then. Adding norm in the expression tree instead of expanding it would require JuMP's AD to support differentiating this operator too, which would probably be more efficient than differentiating the expanded form. I prefer adding few operators but make it work with both AD and DCP that starting to add all the atoms in the cvx world.

My takeaway is that this is a research question. Not a solved problem requiring an implementation.

This can start as a separate repo if we want to start having something easier to play around than an pair of open PRs in JuMP and MOI.

@odow
Copy link
Member

odow commented Aug 16, 2023

This can start as a separate repo if we want to start having something easier to play around than an pair of open PRs in JuMP and MOI.

Yes please.

@odow odow force-pushed the od/nlp-expr branch 2 times, most recently from 087674b to 1788385 Compare August 26, 2023 00:10
Base automatically changed from od/nlp-expr to master August 31, 2023 08:27
src/nlp_expr.jl Outdated
@@ -506,6 +510,8 @@ function moi_function(f::GenericNonlinearExpr{V}) where {V}
for i in length(f.args):-1:1
if f.args[i] isa GenericNonlinearExpr{V}
push!(stack, (ret, i, f.args[i]))
elseif arg.args[i] isa AbstractArray
child.args[i] = moi_function.(arg.args[i])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
child.args[i] = moi_function.(arg.args[i])
ret.args[i] = moi_function.(arg.args[i])

Copy link
Member

@odow odow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we instead just define

moi_function(x::AbstractArray) = moi_function.(x)
jump_function(x::AbstractArray) = jump_function.(x)

You could implement this as a pirated method in DCP.jl to begin with.

@blegat blegat marked this pull request as ready for review August 31, 2023 09:22
@blegat
Copy link
Member Author

blegat commented Aug 31, 2023

I updated the PR. The scope is not to overload norm, det, etc... but to allow NonlinearOperator to work with array inputs. At the moment, it would just call the function and not build the nonlinear expression, even if one of the arguments is an array of JuMP expressions.
If this is used with AD then the AD will error because an array is in the expression graph which is the expected behavior.

@odow
Copy link
Member

odow commented Aug 31, 2023

I'm not in favor of merging this until we have DCP.jl mocked up end-to-end. I'd like to know what the alternatives are.

One option is for DCP.jl to use a separate type entirely to begin with. They don't need to (mis)use NonlinearOperator.

Supporting vectors in nonlinear expressions is not something we should rush into, even if it's something we eventually want to support.

The only thing that might need implementing in JuMP is moi_function and jump_function, but these could be done with pirated methods to begin with.

@odow odow changed the title [POC] Allow array in nonlinear expressions DNMY: allow array in nonlinear expressions Aug 31, 2023
@blegat
Copy link
Member Author

blegat commented Aug 31, 2023

This requires changing how the NonlinearOperator act on arrays. If we consider this a bug-fix then it's fine to wait. Otherwise, if we consider it as a breaking change then we should decide now.

@odow
Copy link
Member

odow commented Sep 7, 2023

Closing in favor of #3489

@odow odow closed this Sep 7, 2023
@odow odow deleted the bl/array_nl branch September 7, 2023 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants