Introduction

This is a short comparison of the mathematical optimization facilities of the Julia ecosystem: the JuMP.jl, Optim.jl, and Optimization.jl libraries.

using JuMP
using Optim
using Optimization
using OptimizationOptimJL
using OptimizationNLopt
using BenchmarkTools

import Ipopt
import NLopt


# Booth function: f(x1, x2) = (x1 + 2x2 - 7)^2 + (2x1 + x2 - 5)^2,
# with global minimum 0 at (1, 3). The three frameworks expect different
# call signatures: JuMP registers a function of scalar arguments, Optim.jl
# takes a vector, and Optimization.jl takes a vector plus a parameter argument.
booth(x1, x2) = (x1 + 2x2 - 7)^2 + (2x1 + x2 - 5)^2
booth_vector(x) = (x[1] + 2x[2] - 7)^2 + (2x[1] + x[2] - 5)^2
booth_parameters(x, p) = (x[1] + 2x[2] - 7)^2 + (2x[1] + x[2] - 5)^2;

JuMP.jl Implementation

model = Model()
set_silent(model)

@variable(model, x[1:2])

register(model, :booth, 2, booth; autodiff = true)

@NLobjective(model, Min, booth(x[1], x[2]))

Ipopt.jl

set_optimizer(model, Ipopt.Optimizer)
@benchmark JuMP.optimize!($model)
BenchmarkTools.Trial: 592 samples with 1 evaluation.
 Range (min … max):  7.671 ms … 14.631 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     7.947 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.431 ms ±  1.074 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃█▇▅▂▃▃▂▂  ▁▁  ▂                                            
  ██████████████▅██▆▆▁▆▅▆▆▅▆▆▅▆▇▅▆▁▆▄▅▄▅▄▆▆▄▄▄▅▁▅▄▁▅▁▁▄▁▁▄▁▄ ▇
  7.67 ms      Histogram: log(frequency) by time     12.9 ms <

 Memory estimate: 20.06 KiB, allocs estimate: 442.
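
As a quick sanity check, the solution can be queried with standard JuMP accessors; the Booth function attains its minimum of 0 at (1, 3).

JuMP.optimize!(model)
value.(x), objective_value(model)  # approximately ([1.0, 3.0], 0.0)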

Ipopt, an interior-point solver, is not really a fair stand-in for the native Julia algorithms of Optim.jl. The same algorithms implemented by Optim.jl are, however, available through the NLopt.jl package as bindings to implementations in other languages.

NLopt.jl

set_optimizer(model, NLopt.Optimizer)
set_optimizer_attribute(model, "algorithm", :LD_LBFGS)
@benchmark JuMP.optimize!($model)
BenchmarkTools.Trial: 9546 samples with 1 evaluation.
 Range (min … max):  463.448 μs …  17.432 ms  ┊ GC (min … max): 0.00% … 77.15%
 Time  (median):     480.015 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   510.563 μs ± 190.779 μs  ┊ GC (mean ± σ):  0.28% ±  0.79%

  ▅█▆▅▄▅▄▃▃▂▁▁   ▁▁▁▁▁▁                                         ▁
  ██████████████████████▇▇▅▆▆▅▇▄▆▆▆▆▆▅▇▆▅▆▆▆▆▆▇▆▆▆▇▇▆▅▆▆▄▄▆▄▅▄▅ █
  463 μs        Histogram: log(frequency) by time        862 μs <

 Memory estimate: 12.28 KiB, allocs estimate: 244.

Optim.jl

@benchmark Optim.optimize($booth_vector, [0., 0.], LBFGS(); autodiff = :forward)
BenchmarkTools.Trial: 10000 samples with 4 evaluations.
 Range (min … max):  7.130 μs …  2.177 ms  ┊ GC (min … max): 0.00% … 99.13%
 Time  (median):     7.604 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.887 μs ± 42.817 μs  ┊ GC (mean ± σ):  9.58% ±  1.98%

   ▃▆███▇▆▅▄▄▃▃▂▂▁▁▁▁▁    ▁▁  ▁▁▁▁▁▂▂▁▂▁▂▁▁▁                 ▂
  ▅████████████████████▇███████████████████████▆▇▆▆▅▄▅▄▅▅▅▂▅ █
  7.13 μs      Histogram: log(frequency) by time     12.1 μs <

 Memory estimate: 8.53 KiB, allocs estimate: 132.
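
For completeness, the minimizer and objective value can be read off the result object with Optim.jl's standard accessors:

res = Optim.optimize(booth_vector, [0., 0.], LBFGS(); autodiff = :forward)
Optim.minimizer(res), Optim.minimum(res)  # again approximately ([1.0, 3.0], 0.0)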

There is an interesting pull request to implement an Optim.jl interface for JuMP here. It would be interesting to repeat these benchmarks once Optim becomes accessible from JuMP. For now, the roughly 60-fold slowdown (≈480 μs versus ≈7.6 μs median) may be attributed either to a slower implementation in NLopt or to the overhead of JuMP modeling. Alternatively, we can use another wrapper over optimization libraries, the Optimization.jl library.

Optimization.jl

optf = OptimizationFunction(booth_parameters, Optimization.AutoForwardDiff())
prob = OptimizationProblem(optf, [0., 0.])
@benchmark solve($prob, LBFGS())
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  16.511 μs …   8.585 ms  ┊ GC (min … max): 0.00% … 99.21%
 Time  (median):     17.989 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   21.041 μs ± 120.161 μs  ┊ GC (mean ± σ):  8.04% ±  1.41%

    ▇█▂                                                         
  ▃▇███▆▄▃▄▅▅▄▃▄▄▄▃▃▃▃▃▃▂▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▁▂▂▂▂▁▂ ▃
  16.5 μs         Histogram: frequency by time         35.7 μs <

 Memory estimate: 14.94 KiB, allocs estimate: 211.
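
The returned solution object exposes the minimizer and objective value directly:

sol = solve(prob, LBFGS())
sol.u, sol.objective  # approximately ([1.0, 3.0], 0.0)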

There is clearly an overhead to using Optimization.jl, making it more than twice as slow as calling Optim.jl directly.

NLopt

@benchmark solve($prob, NLopt.LD_LBFGS())
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  413.016 μs …  1.420 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     431.895 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   451.550 μs ± 48.815 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▄█                                                           
  ▁██▃▇▆▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  413 μs          Histogram: frequency by time          633 μs <

 Memory estimate: 3.58 KiB, allocs estimate: 64.

Comparing this result with the JuMP + NLopt benchmark above (median ≈432 μs here versus ≈480 μs there), the overheads of JuMP and Optimization.jl appear to be on the same level. The poorer benchmark results can therefore be attributed to NLopt.jl or the library it wraps.

Another advantage of Optimization.jl is that it also interfaces well with the ModelingToolkit.jl package.
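
As a rough illustration, here is a minimal sketch of the symbolic OptimizationSystem route; the exact calls (for example, the need for complete) vary across ModelingToolkit versions, so treat this as a sketch rather than a definitive recipe.

using ModelingToolkit

@variables y1 y2                                   # symbolic decision variables
@named booth_sys = OptimizationSystem((y1 + 2y2 - 7)^2 + (2y1 + y2 - 5)^2, [y1, y2], [])
booth_sys = complete(booth_sys)                    # required on recent ModelingToolkit releases
prob_mtk = OptimizationProblem(booth_sys, [y1 => 0.0, y2 => 0.0]; grad = true, hess = true)
solve(prob_mtk, LBFGS())                           # gradients/Hessians generated symbolically

The appeal is that derivatives come from symbolic differentiation rather than AD, while the rest of the Optimization.jl workflow stays the same.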

Which Framework to Choose

It is true that Optim.jl may not really be a framework per se. Nevertheless, its raw speed makes it a great choice for embedding in analyses where optimization may be the bottleneck, such as calling an optimization routine in a long loop or in matching and estimation procedures.
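
To illustrate the loop case, here is a hypothetical example of re-solving a small problem many times, where Optim.jl's low per-call overhead pays off; the shifting target and quadratic objective are made up for the illustration.

results = map(1:1_000) do i
    target = [1.0 + 0.01i, 3.0 - 0.01i]            # hypothetical per-iteration data
    obj = x -> sum(abs2, x .- target)              # simple quadratic objective
    res = Optim.optimize(obj, zeros(2), LBFGS(); autodiff = :forward)
    Optim.minimizer(res)
end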

On the other hand, a framework such as Optimization.jl, despite the added overhead, provides great convenience, especially in situations where the function to be optimized is subject to rapid changes (such as when testing modeling approaches), since it allows one to quickly switch between different optimization methods with little change to the code.
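
For instance, reusing the prob defined above, switching methods amounts to changing the second argument of solve (all of the solvers below were already loaded earlier):

solve(prob, Optim.NelderMead())   # derivative-free method from Optim.jl
solve(prob, Optim.BFGS())         # another quasi-Newton method from Optim.jl
solve(prob, NLopt.LD_LBFGS())     # same problem, NLopt backend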

Personally, I think JuMP is best for a single, perhaps large-scale optimization problem with many underlying considerations, such as those handled in the PowerModels.jl package. It is just not as easy to prototype a model using JuMP because of the global approach it takes to modeling (a single model that is modified through macros). I see Optimization.jl as the framework with the greatest flexibility: its syntax does not deviate greatly from Julia Base, while remaining highly extensible.