Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build function efficient multiple returns #46

Closed
raphaelchinchilla opened this issue Nov 6, 2020 · 11 comments
Closed

Build function efficient multiple returns #46

raphaelchinchilla opened this issue Nov 6, 2020 · 11 comments

Comments

@raphaelchinchilla
Copy link
Contributor

raphaelchinchilla commented Nov 6, 2020

It is often the case (for instance in optimization) that one needs to calculate some functions, such as the gradient and the hessian, that have some shared calculations. Is there a way in modelingtoolkit to use symbolic computation and build_function to generate functions that act something like?

function gradhess!(grad,hess)

aux=shared calculations

if grad != nothing

  grad.=compute_grad (aux)

end

if hessian != nothing

  hess.= compute_hess(aux)

end

end

┆Issue is synchronized with this Trello card by Unito

@ChrisRackauckas
Copy link
Member

If you make an array of array of expressions it'll output that array of array. It think we might want to try and let a tuple of expressions do that. @shashi what do you think on this one?

I think we'd just generate the full expressions and let CSE reduce it.

@raphaelchinchilla
Copy link
Contributor Author

raphaelchinchilla commented Nov 6, 2020

Thanks Chris,

But just returning an array of array would be enough to avoid repeating computation and to only compute, for instance the gradient if the hessian is not needed?

To be clear, I want to know whether there is way to symbolically and automatically generate something like what they mention in the Optim.jl documentation:

function fg!(F,G,x)
  # do common computations here
  # ...
  if G != nothing
    # code to compute gradient here
    # writing the result to the vector G
  end
  if F != nothing
    # value = ... code to compute objective function
    return value
  end
end

@ChrisRackauckas
Copy link
Member

Oh a doubly-mutating function? Yeah, you could make a function _fg!(Y,x) where Y = [F,G] and then mutate that array of arrays.

@raphaelchinchilla
Copy link
Contributor Author

raphaelchinchilla commented Dec 3, 2020

Hi Chris,

Sorry for taking so long to answer back, I wanted to have a MWE that reflected my questions. The matter is not exactly to have a doubly-mutating function. Consider the following example inspired by the one in the documentation of DiffResults.jl

using ModelingToolkit
@variables x[1:2]
f = prod(tan, x) * sum(sqrt, x)
g=ModelingToolkit.gradient(f,x)

One can see in the result below that many operations used in f can be reused in g:

f=((tan(x₁) * tan(x₂)) * (sqrt(x₁) + sqrt(x₂)))

and 

g=[(tan(x₂) * (sec(x₁) ^ 2) * (sqrt(x₁) + sqrt(x₂))) + (0.5 * tan(x₁) * tan(x₂) * (sqrt(x₁) ^ -1));
    (tan(x₁) * (sqrt(x₁) + sqrt(x₂)) * (sec(x₂) ^ 2)) + (0.5 * tan(x₁) * tan(x₂) * (sqrt(x₂) ^ -1))]

Suppose one is running a gradient descent algorithm with a line search. Given an input vector x one will need functions to
i) compute only f
ii) compute only g
ii) compute both f and g

Question 1: Is there a better way to do this than to do (independent of inplace or not)

builted_f=build_function(f,x)[1]
builted_g=build_function(g,x)[1]
builted_fg=build_function([f;g],x)[1]

Question 2: It seems to me when I run for instance

julia_fg=eval(builted_fg)
@code_llvm julia_fg([1.,2.])

that LLVM does not realize that there are multiple operations that could be reused. I might be wrong about that because I am not very familiar with LLVM. If indeed LLVM does not reuse the operations, is there a functionality that could be implemented on build_function itself that would allow for that?

Question 3: (This is actually probably a bug) When building the function builted_fg using [f;g] it returns a vector of size 3, which is not a problem for a simple example but for more complex example, keeping track of the index might be challenging. If I try to run build_function([(f,g),x)[1] I receive an error. If I try build_function([f , g],x)[1] (i.e. using " , " instead of " ; ") I do not get an error, but if I go on to run

julia_fg=eval(builted_fg)
julia_fg([1.,2.])

the output is

2-element Array{Any,1}:
 -8.21556383191817
   Expr[:((+)((*)((tan)(x₂), (^)((sec)(x₁), 2), (+)((sqrt)(x₁), (sqrt)(x₂))), (*)(0.5, (tan)(x₁), (tan)(x₂), (^)((inv)((sqrt)(x₁)), 1)))), :((+)((*)((tan)(x₁), (+)((sqrt)(x₁), (sqrt)(x₂)), (^)((sec)(x₂), 2)), (*)(0.5, (tan)(x₁), (tan)(x₂), (^)((inv)((sqrt)(x₂)), 1))))]

which seems like a bug, or I might be doing something wrong.

@ChrisRackauckas
Copy link
Member

For Q1, would could in theory allow thunks.

For Q2, I would've thought LLVM would CSE it. @shashi @YingboMa comments on that?

For Q3, yes tuple outputs would solve this but it's not implemented yet.

@shashi
Copy link
Member

shashi commented Dec 5, 2020

We can do Q2 symbolically instead of relying on LLVM.

@YingboMa
Copy link
Member

YingboMa commented Dec 6, 2020

Also, doing CSE is essential for producing readable code after reduction, too.

@ChrisRackauckas ChrisRackauckas transferred this issue from SciML/ModelingToolkit.jl Feb 26, 2021
@raphaelchinchilla
Copy link
Contributor Author

For Q3, yes tuple outputs would solve this but it's not implemented yet.

I was looking into how to implement tuple returns in build_function. I discovered that if we include the dispatch

function toexpr(O::Tuple, st)
    :(($(toexpr.(O, (st,))...),))
end

in SymbolicUtils.Code, the functionality works.

I was about to do a PR to SymbolicUtils including this dispatch and a test. However, in the last minute, I realized that maybe this is the whole point of the type SymbolicUtils.Code.MakeTuple. Can someone confirm this to me? In this case I will figure out what needs to be changed in build_function to make this work.

@shashi
Copy link
Member

shashi commented Aug 20, 2023

Yeah I think I didn't really have toexpr on Tuple in mind, Just like toexpr is not defined on a vector. toexpr is defined on term-like types. We also have MakeArray, and SetArray. I'm more conservative about methods because it changes the usefulness of toexpr.

@shashi
Copy link
Member

shashi commented Aug 20, 2023

You can definitely define something like

_toexpr(x) = toexpr(x); _toexpr(x::Tuple) = MakeTuple(x) and use that in build_function.

@raphaelchinchilla
Copy link
Contributor Author

Yeah, that could have been an idea. Why did you decide to merge the PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants