
Thoughts on a vector MOI or JuMP subset for NLP #1538

Closed
jlperla opened this issue Aug 19, 2021 · 1 comment
jlperla commented Aug 19, 2021

@frapac from our discussion:

Modeling languages are great, but for many nonlinear problems you just want to provide an objective and constraints to the solvers with minimal hassle. In economics, in particular, the equations might be impossible to even write in existing modeling languages. In other cases, maybe you want to use all sorts of Symbolics.jl/ModelingToolkit.jl trickery, but in the end have a function which just operates on component arrays.

Basically, you just want to call the optimizers with primitives as native as possible and without any DSL getting in the way. This is in the same spirit as https://github.com/jump-dev/MatrixOptInterface.jl, I believe.

In its current form, JuMP's NLP support is very scalar-centric, and this was one of the things that led to the creation of https://github.com/SciML/GalacticOptim.jl (note, however, that the hope is that the scope of GalacticOptim will eventually expand into far more interesting things, largely around applying problem reduction as well as other fancy AD tricks).

On the MOI side, there might not be that much missing for a more direct pass-through. We would probably need dense Jacobians and Hessians, and I think that anything that needs to allocate will be a problem if it wants to pass through to GPU-style algorithms; it might also make pass-through of things like ComponentArrays difficult. But that stuff could come later.
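For context, here is roughly what the existing scalar-oriented MOI NLP interface requires a direct pass-through to provide today: a hand-written `MOI.AbstractNLPEvaluator` with callbacks for the objective and its gradient (a minimal sketch; the Rosenbrock objective and hand-coded gradient are illustrative, not part of any proposal).

```julia
# Minimal hand-written evaluator for the existing MOI NLP interface,
# showing what a direct pass-through has to provide today.
import MathOptInterface
const MOI = MathOptInterface

struct RosenbrockEvaluator <: MOI.AbstractNLPEvaluator end

# Only first-order information is offered in this sketch.
MOI.features_available(::RosenbrockEvaluator) = [:Grad]
MOI.initialize(::RosenbrockEvaluator, ::Vector{Symbol}) = nothing

MOI.eval_objective(::RosenbrockEvaluator, x) =
    (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

function MOI.eval_objective_gradient(::RosenbrockEvaluator, grad, x)
    grad[1] = -2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2)
    grad[2] = 200 * (x[2] - x[1]^2)
    return
end

# With no nonlinear constraints, the NLP block is just the evaluator:
# block = MOI.NLPBlockData(MOI.NLPBoundsPair[], RosenbrockEvaluator(), true)
# MOI.set(model, MOI.NLPBlock(), block)
```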

What might this look like to a user?

First, it is useful to have an object that collects derivatives and makes switching between AD backends convenient. I will describe the ones from GalacticOptim, but the gradient wrappers are something that could be standardized and shared in SciMLBase, or just made specific to JuMP/MOI. For example, https://github.com/SciML/GalacticOptim.jl/tree/master/src/function provides wrappers for all sorts of AD backends. It could work something like the following.

Write your Julia function. SciML allows passing in a parameter vector p, which would be nice, but this could be done with just a closure if it doesn't fit into JuMP very well.

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
f = OptimizationFunction(rosenbrock)

This would dispatch to the default automatic differentiation. Or, to use Zygote or finite differences:

OptimizationFunction(rosenbrock, GalacticOptim.AutoZygote())
OptimizationFunction(rosenbrock, GalacticOptim.AutoFiniteDiff())

Or to pass in your own derivatives:

rosenbrock_grad(x, p) = [-2 * (p[1] - x[1]) - 4 * p[2] * x[1] * (x[2] - x[1]^2),
                         2 * p[2] * (x[2] - x[1]^2)]
OptimizationFunction(rosenbrock, rosenbrock_grad)
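When passing your own derivatives like this, it is cheap to sanity-check them against finite differences before handing them to a solver (a pure-Julia sketch, no packages required; `fd_grad` is a helper written here, not part of any library):

```julia
# Compare a hand-derived Rosenbrock gradient against central finite differences.
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

rosenbrock_grad(x, p) = [
    -2 * (p[1] - x[1]) - 4 * p[2] * x[1] * (x[2] - x[1]^2),
    2 * p[2] * (x[2] - x[1]^2),
]

# Central finite-difference gradient, one coordinate at a time.
function fd_grad(f, x, p; h = 1e-6)
    map(eachindex(x)) do i
        e = zeros(length(x))
        e[i] = h
        (f(x .+ e, p) - f(x .- e, p)) / (2h)
    end
end

x0, p0 = [0.5, 1.2], [1.0, 100.0]
err = maximum(abs.(rosenbrock_grad(x0, p0) .- fd_grad(rosenbrock, x0, p0)))
# err is tiny if the analytic gradient is correct
```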

Finally, if the problem is constrained, then the current GalacticOptim (which could be changed) can generate the Jacobians and Hessians as required. The current design has an optional cons keyword, but I think it would be much nicer if these were decoupled. Anyway, it is not essential either way.

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
function con2_c(x, p)
    [x[1]^2 + x[2]^2,
     x[2] * sin(x[1]) - x[1]]
end
OptimizationFunction(rosenbrock, GalacticOptim.AutoForwardDiff(); cons = con2_c)

First, consider if there were MOI extensions instead. @frapac suggested something like:

model = MOI.nlpsolve(f, g!, h!, x_lb, x_ub, c_lb, c_ub)
attach_linear_constraints!(model, A, b)
attach_quadratic_constraints!(model, Q, q, d)

which I would hope could have a "differentiation wrapper" version:

fun = OptimizationFunction(rosenbrock, GalacticOptim.AutoForwardDiff(); cons = con2_c)
model = MOI.nlpsolve(fun, x_lb, x_ub, c_lb, c_ub)
attach_linear_constraints!(model, A, b)
attach_quadratic_constraints!(model, Q, q, d)

In fact, if this existed, then I think GalacticOptim could be extremely short and almost not exist, or it could just be the OptimizationFunction wrapper. But I would hope that we could still do all of the linear constraints, complementarity constraints, etc.

The last piece of GalacticOptim that might be worth considering is its role in auto-differentiation rules for the optimizers themselves. We want to be able to establish AD rules for optimizers (e.g., via the https://en.wikipedia.org/wiki/Envelope_theorem), but that might be better discussed down the line.

To be rrule-friendly, we would probably want to avoid the mutation of adding constraints ex post, and instead have them in the constructor of the problem type. Mutation makes for difficult rules.
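The envelope-theorem point can be illustrated on a toy problem (pure Julia; the quadratic objective here is a made-up example, not anything from GalacticOptim): for V(p) = min_x f(x, p), the derivative dV/dp equals the partial derivative ∂f/∂p evaluated at the minimizer, so an rrule for a "solve" call can avoid differentiating through the solver iterations.

```julia
# Envelope-theorem sanity check on a toy quadratic.
f(x, p) = (x - p)^2 + x^2
xstar(p) = p / 2                     # analytic minimizer of f in x
V(p) = f(xstar(p), p)                # value function V(p) = p^2 / 2
dfdp(x, p) = -2 * (x - p)            # partial derivative in p, x held fixed
envelope(p) = dfdp(xstar(p), p)      # envelope-theorem derivative of V

p0, h = 1.3, 1e-6
fd = (V(p0 + h) - V(p0 - h)) / (2h)  # finite-difference dV/dp for comparison
gap = abs(fd - envelope(p0))         # tiny: the two derivatives agree
```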


From the "DSL" perspective, I think that JuMP might actually work with a few constraints on supported features, although I think I would prefer the MOI extension instead.

In particular, if you want to provide a "vectorized" objective (and optionally a nonlinear constraint), then:

  1. Only a single vectorized variable is supported.
  2. Only a single nonlinear constraint is supported.
  3. The user is responsible for providing the gradients/hessians. Maybe wrappers are provided to help, but this is left flexible.

Otherwise, I think that the existing JuMP and MOI machinery around linear constraints would be perfectly fine. The user would only have a single vector variable to work with, but I think that is fine.

Alternatively, if you wanted this to live in JuMP, then, to see possible usage in a one-variable limited JuMP, let me adapt https://jump.dev/JuMP.jl/stable/tutorials/Nonlinear%20programs/rosenbrock/:

using JuMP
import Ipopt
import Test
import AutoDiffWrappersFromWherever

function example_rosenbrock()
    model = Model(Ipopt.Optimizer)
    set_silent(model)

    # call wrappers discussed above
    rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

    obj = OptimizationFunction(rosenbrock, AutoFiniteDiff())

    @variable(model, x[1:2])  # or @passthrough_variable(model, x[1:2])
    @NLvector_objective(model, Min, obj)
    # ERRORS IF MODEL HAS MORE THAN ONE VARIABLE OR IF SCALAR

    optimize!(model)

    Test.@test termination_status(model) == MOI.LOCALLY_SOLVED
    Test.@test primal_status(model) == MOI.FEASIBLE_POINT
    Test.@test objective_value(model) ≈ 0.0 atol = 1e-10
    Test.@test value(x[1]) ≈ 1.0
    Test.@test value(x[2]) ≈ 1.0
    return
end

function example_constrained_rosenbrock()
    model = Model(Ipopt.Optimizer)
    set_silent(model)

    # call wrappers discussed above
    rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
    function con2_c(x)
        [x[1]^2 + x[2]^2,
         x[2] * sin(x[1]) - x[1]]
    end
    obj = OptimizationFunction(rosenbrock, AutoFiniteDiff(); cons = con2_c)


    @variable(model, x[1:2])
    @NLvector_objective(model, Min, obj)

    # THIS ADDS THE CONSTRAINT AS WELL.  HESSIAN OF LAGRANGIAN ATTACHED

    optimize!(model)

    Test.@test termination_status(model) == MOI.LOCALLY_SOLVED
    Test.@test primal_status(model) == MOI.FEASIBLE_POINT
    Test.@test objective_value(model) ≈ 0.0 atol = 1e-10
    Test.@test value(x[1]) ≈ 1.0
    Test.@test value(x[2]) ≈ 1.0
    return
end

What would @NLvector_objective do? Basically, it would just register the objective and potentially the constraints in their most raw form when providing the implementation of the MOI calls. It can access the OptimizationFunction structure and call the wrapped functions directly.
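Concretely, the evaluator it installs might look like the following sketch (the macro, the `PassthroughEvaluator` name, and the `f`/`grad!` fields are all assumptions for illustration, not a real API): MOI's scalar NLP callbacks simply forward to the user's vector-level function and gradient.

```julia
# Hypothetical pass-through evaluator that @NLvector_objective could install.
import MathOptInterface
const MOI = MathOptInterface

struct PassthroughEvaluator{F,G} <: MOI.AbstractNLPEvaluator
    f::F        # x -> scalar objective
    grad!::G    # (g, x) -> in-place gradient
end

MOI.features_available(::PassthroughEvaluator) = [:Grad]
MOI.initialize(::PassthroughEvaluator, ::Vector{Symbol}) = nothing

# Each MOI callback is a direct call into the user's raw functions.
MOI.eval_objective(d::PassthroughEvaluator, x) = d.f(x)
MOI.eval_objective_gradient(d::PassthroughEvaluator, g, x) = d.grad!(g, x)
```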


odow commented Aug 19, 2021

Can you post this as a comment in #846 rather than as this separate issue?

@odow odow added this to the v2.0 milestone Aug 19, 2021
@jlperla jlperla closed this as completed Aug 19, 2021