
Thoughts on a vector MOI or JuMP subset for NLP #1538

Closed
jlperla opened this issue Aug 19, 2021 · 1 comment
jlperla commented Aug 19, 2021

@frapac from our discussion:

Modeling languages are great, but for many nonlinear problems you just want to provide an objective and constraints to the solvers with minimal hassle. In economics, in particular, the equations might be impossible to even write in existing modeling languages. In other cases, maybe you want to use all sorts of Symbolics.jl/ModelingToolkit.jl trickery, but in the end have a function which just operates on component arrays.

Basically, you just want to call the optimizers with primitives as native as possible and without any DSL getting in the way. This is in the same spirit as https://github.com/jump-dev/MatrixOptInterface.jl, I believe.

In its current form, JuMP's NLP support is very scalar-centric, and this was one of the things that led to the creation of https://github.com/SciML/GalacticOptim.jl (note, however, that the hope is that the scope of GalacticOptim will eventually expand into far more interesting things, largely around applying problem reduction as well as other fancy AD tricks).

On the MOI side, there might not be that much missing for a more direct pass-through. We would probably need dense Jacobians and Hessians, and I think that anything that needs to allocate will be a problem if it wants to pass through to GPU-style algorithms; it might also make pass-through of things like ComponentArrays difficult. But that stuff could come later.
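For context, here is roughly what the existing scalar-oriented MOI NLP interface requires a direct pass-through to provide today: a hand-written `MOI.AbstractNLPEvaluator` with callbacks for the objective and its gradient (a minimal sketch; the Rosenbrock objective and hand-coded gradient are illustrative, not part of any proposal).

```julia
# Minimal hand-written evaluator for the existing MOI NLP interface,
# showing what a direct pass-through has to provide today.
import MathOptInterface
const MOI = MathOptInterface

struct RosenbrockEvaluator <: MOI.AbstractNLPEvaluator end

# Only first-order information is offered in this sketch.
MOI.features_available(::RosenbrockEvaluator) = [:Grad]
MOI.initialize(::RosenbrockEvaluator, ::Vector{Symbol}) = nothing

MOI.eval_objective(::RosenbrockEvaluator, x) =
    (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

function MOI.eval_objective_gradient(::RosenbrockEvaluator, grad, x)
    grad[1] = -2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2)
    grad[2] = 200 * (x[2] - x[1]^2)
    return
end

# With no nonlinear constraints, the NLP block is just the evaluator:
# block = MOI.NLPBlockData(MOI.NLPBoundsPair[], RosenbrockEvaluator(), true)
# MOI.set(model, MOI.NLPBlock(), block)
```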

What might this look like to a user?

First, it is useful to have an object that collects derivatives and makes switching between AD backends convenient. I will describe the ones from GalacticOptim, but the gradient wrappers are something that could be standardized and shared in SciMLBase, or just made specific to JuMP/MOI. For example, https://github.com/SciML/GalacticOptim.jl/tree/master/src/function provides wrappers for all sorts of AD backends. It could work something like the following.

Write your Julia function. SciML allows passing in a parameter vector p, which would be nice, but this could be done with just a closure if it doesn't fit into JuMP very well.

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
f = OptimizationFunction(rosenbrock)

This would dispatch to the default automatic differentiation. Or, to use Zygote or finite differences:

OptimizationFunction(rosenbrock, GalacticOptim.AutoZygote())
OptimizationFunction(rosenbrock, GalacticOptim.AutoFiniteDiff())

Or to pass in your own derivatives:

rosenbrock_grad(x, p) = [-2 * (p[1] - x[1]) - 4 * p[2] * x[1] * (x[2] - x[1]^2),
                         2 * p[2] * (x[2] - x[1]^2)]
OptimizationFunction(rosenbrock, rosenbrock_grad)
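When passing your own derivatives like this, it is cheap to sanity-check them against finite differences before handing them to a solver (a pure-Julia sketch, no packages required; `fd_grad` is a helper written here, not part of any library):

```julia
# Compare a hand-derived Rosenbrock gradient against central finite differences.
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

rosenbrock_grad(x, p) = [
    -2 * (p[1] - x[1]) - 4 * p[2] * x[1] * (x[2] - x[1]^2),
    2 * p[2] * (x[2] - x[1]^2),
]

# Central finite-difference gradient, one coordinate at a time.
function fd_grad(f, x, p; h = 1e-6)
    map(eachindex(x)) do i
        e = zeros(length(x))
        e[i] = h
        (f(x .+ e, p) - f(x .- e, p)) / (2h)
    end
end

x0, p0 = [0.5, 1.2], [1.0, 100.0]
err = maximum(abs.(rosenbrock_grad(x0, p0) .- fd_grad(rosenbrock, x0, p0)))
# err is tiny if the analytic gradient is correct
```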

Finally, if the problem is constrained, then the current GalacticOptim (which could be changed) can generate the Jacobians and Hessians as required. The current design has an optional cons keyword, but I think it would be much nicer if these were decoupled. Anyway, it is not essential either way.

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
function con2_c(x, p)
    [x[1]^2 + x[2]^2,
     x[2] * sin(x[1]) - x[1]]
end
OptimizationFunction(rosenbrock, GalacticOptim.AutoForwardDiff(); cons = con2_c)

First, consider if there were MOI extensions instead. @frapac suggested something like:

model = MOI.nlpsolve(f, g!, h!, x_lb, x_ub, c_lb, c_ub)
attach_linear_constraints!(model, A, b)
attach_quadratic_constraints!(model, Q, q, d)

which I would hope could have a "differentiation wrapper" version:

fun = OptimizationFunction(rosenbrock, GalacticOptim.AutoForwardDiff(); cons = con2_c)
model = MOI.nlpsolve(fun, x_lb, x_ub, c_lb, c_ub)
attach_linear_constraints!(model, A, b)
attach_quadratic_constraints!(model, Q, q, d)

In fact, if this existed, then I think GalacticOptim could be extremely short and almost not exist, or it could just be the OptimizationFunction wrapper. But I would hope that we could still do all of the linear constraints, complementarity constraints, etc.

The last piece of GalacticOptim that might be worth considering is its role in auto-differentiation rules for the optimizers themselves. We want to be able to establish AD rules for optimizers (e.g., via the https://en.wikipedia.org/wiki/Envelope_theorem), but that might be better discussed down the line.

To be rrule-friendly, we would probably want to avoid the mutation of adding constraints ex post, and instead have them in the constructor of the problem type. Mutation makes for difficult rules.
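The envelope-theorem point can be illustrated on a toy problem (pure Julia; the quadratic objective here is a made-up example, not anything from GalacticOptim): for V(p) = min_x f(x, p), the derivative dV/dp equals the partial derivative ∂f/∂p evaluated at the minimizer, so an rrule for a "solve" call can avoid differentiating through the solver iterations.

```julia
# Envelope-theorem sanity check on a toy quadratic.
f(x, p) = (x - p)^2 + x^2
xstar(p) = p / 2                     # analytic minimizer of f in x
V(p) = f(xstar(p), p)                # value function V(p) = p^2 / 2
dfdp(x, p) = -2 * (x - p)            # partial derivative in p, x held fixed
envelope(p) = dfdp(xstar(p), p)      # envelope-theorem derivative of V

p0, h = 1.3, 1e-6
fd = (V(p0 + h) - V(p0 - h)) / (2h)  # finite-difference dV/dp for comparison
gap = abs(fd - envelope(p0))         # tiny: the two derivatives agree
```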


From the "DSL" perspective, I think that JuMP might actually work with a few constraints on supported features, although I think I would prefer the MOI extension instead.

In particular, if you want to provide a "vectorized" objective (and optionally a nonlinear constraint), then:

  1. Only a single vectorized variable is supported.
  2. Only a single nonlinear constraint is supported.
  3. The user is responsible for providing the gradients/hessians. Maybe wrappers are provided to help, but this is left flexible.

Otherwise, I think that the existing JuMP and MOI machinery around linear constraints would be perfectly fine. The user would only have a single vector variable to work with, but I think that is fine.

Alternatively, if you wanted this to live in JuMP, then, to see possible usage in a one-variable limited JuMP, let me adapt https://jump.dev/JuMP.jl/stable/tutorials/Nonlinear%20programs/rosenbrock/:

using JuMP
import Ipopt
import Test
import AutoDiffWrappersFromWherever

function example_rosenbrock()
    model = Model(Ipopt.Optimizer)
    set_silent(model)

    # call wrappers discussed above
    rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

    obj = OptimizationFunction(rosenbrock, AutoFiniteDiff())

    @variable(model, x[1:2])  # or @passthrough_variable(model, x[1:2])
    @NLvector_objective(model, Min, obj)
    # ERRORS IF MODEL HAS MORE THAN ONE VARIABLE OR IF SCALAR

    optimize!(model)

    Test.@test termination_status(model) == MOI.LOCALLY_SOLVED
    Test.@test primal_status(model) == MOI.FEASIBLE_POINT
    Test.@test objective_value(model) ≈ 0.0 atol = 1e-10
    Test.@test value(x[1]) ≈ 1.0
    Test.@test value(x[2]) ≈ 1.0
    return
end

function example_constrained_rosenbrock()
    model = Model(Ipopt.Optimizer)
    set_silent(model)

    # call wrappers discussed above
    rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
    function con2_c(x)
        [x[1]^2 + x[2]^2,
         x[2] * sin(x[1]) - x[1]]
    end
    obj = OptimizationFunction(rosenbrock, AutoFiniteDiff(); cons = con2_c)


    @variable(model, x[1:2])
    @NLvector_objective(model, Min, obj)

    # THIS ADDS THE CONSTRAINT AS WELL.  HESSIAN OF LAGRANGIAN ATTACHED

    optimize!(model)

    Test.@test termination_status(model) == MOI.LOCALLY_SOLVED
    Test.@test primal_status(model) == MOI.FEASIBLE_POINT
    Test.@test objective_value(model) ≈ 0.0 atol = 1e-10
    Test.@test value(x[1]) ≈ 1.0
    Test.@test value(x[2]) ≈ 1.0
    return
end

What would @NLvector_objective do? Basically, it would just register the objective and potentially the constraints in their most raw form when providing the implementation of the MOI calls. It can access the OptimizationFunction structure and call the wrapped functions directly.
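Concretely, the evaluator it installs might look like the following sketch (the macro, the `PassthroughEvaluator` name, and the `f`/`grad!` fields are all assumptions for illustration, not a real API): MOI's scalar NLP callbacks simply forward to the user's vector-level function and gradient.

```julia
# Hypothetical pass-through evaluator that @NLvector_objective could install.
import MathOptInterface
const MOI = MathOptInterface

struct PassthroughEvaluator{F,G} <: MOI.AbstractNLPEvaluator
    f::F        # x -> scalar objective
    grad!::G    # (g, x) -> in-place gradient
end

MOI.features_available(::PassthroughEvaluator) = [:Grad]
MOI.initialize(::PassthroughEvaluator, ::Vector{Symbol}) = nothing

# Each MOI callback is a direct call into the user's raw functions.
MOI.eval_objective(d::PassthroughEvaluator, x) = d.f(x)
MOI.eval_objective_gradient(d::PassthroughEvaluator, g, x) = d.grad!(g, x)
```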


odow commented Aug 19, 2021

Can you post this as a comment in #846 rather than as this separate issue?

@odow odow added this to the v2.0 milestone Aug 19, 2021
@jlperla jlperla closed this as completed Aug 19, 2021