GSoC: Test Function Suite #223

Closed · pkofod opened this issue Jun 14, 2016 · 28 comments

pkofod commented Jun 14, 2016

I figured I should post updates in separate issues instead of in the one big issue #200.

I am planning to start by testing on the CUTEst unconstrained functions through the CUTEst.jl package. I installed it by following http://abelsiqueira.github.io/blog/installing-cutest-and-cutest.jl/. The develop branch is necessary, as it already has a grad! function defined; the master branch also segfaulted on me quite often.
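
For reference, the checkout amounts to something like the following sketch, assuming the Julia 0.4-era Pkg API (and that the CUTEst library itself is already installed):

Pkg.clone("https://github.com/JuliaOptimizers/CUTEst.jl")
Pkg.checkout("CUTEst", "develop")  # the develop branch has grad! defined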

I had originally chosen the 216 unconstrained problems in CUTEst, but one turned out to have constraints, and CUTEst.jl had problems loading quite a few (I need to file an issue upstream), so the set is now closer to 160.

An interesting feature of the test suite is how broad it is in terms of the dimensionality of x (call it N). A lot of functions are 2- or 3-dimensional, but values around 3000 and 5000 also seem popular. 11 functions have N = 10000, and 2 have N = 20000. Some of the original formulations allow for a user-defined N (as in the Rosenbrock case), but I'm not sure if CUTEst allows the user to specify this.
[histogram of problem dimensions (nvar)]
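
As an aside, Rosenbrock is a typical example of a formulation that generalizes to arbitrary N; a common N-dimensional variant (not necessarily the exact SIF formulation) looks like this:

rosenbrock(x) = sum([100 * (x[i+1] - x[i]^2)^2 + (1 - x[i])^2 for i in 1:length(x)-1])

rosenbrock(ones(5000))  # 0.0 at the minimizer, for any N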

Anyway, this is a much broader test bed than the one we already have, and I'll try to post some initial benchmarks as soon as possible.


mlubin commented Jun 14, 2016

CC @abelsiqueira @dpo


dpo commented Jun 14, 2016

Note that CUTEst.jl is now available from https://github.com/JuliaOptimizers/CUTEst.jl in a more usable form. A lot of the CUTEst problems are also available in AMPL format from https://github.com/mpf/Optimization-Test-Problems, and you can read them (once decoded) using https://github.com/JuliaOptimizers/AmplNLReader.jl. If you want to change the size of an AMPL model, you must edit the mod file and decode it again. To see what sizes a CUTEst model comes in, you can run, e.g.,

sifdecoder -show LUBRIFC

and subsequently decode with

sifdecoder -param NN=50 LUBRIFC

CUTEst.jl doesn't currently support passing extra options to sifdecoder (I believe), but it would be easy to add that feature. You can select all unconstrained problems using the select tool (it's quite old school).
Don't hesitate to ask if you run into any trouble.
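
Until such a feature lands, one could presumably shell out to sifdecoder from Julia before loading the problem; a hypothetical workaround:

run(`sifdecoder -param NN=50 LUBRIFC`)  # decode with an explicit size first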


pkofod commented Jun 15, 2016

Thank you for your reply. It was indeed the JuliaOptimizers version of CUTEst that I installed; I should have made that clear. I only used the blog post to install the Homebrew parts. Thanks for the other pointers!

@abelsiqueira

Hi, thanks for using my post, but it isn't updated for the latest version of the packages.
One thing that is almost released is the BinDeps version, which automatically installs CUTEst on Linux and OSX.
We're working on some changes using NLPModels.jl, and I've been working on that depth-first, so I haven't touched CUTEst.jl in two months. Contact me if there is something you want to expedite.

Also, I'm interested in the loading problems you had.


pkofod commented Jun 15, 2016

A BinDeps version would be wonderful, as the whole Homebrew thing was a bit, well, annoying :)

I would love to see what the following does on your installation: https://gist.github.com/pkofod/5dda61027b6829e91c40dccbe6f38f9b . On my computer, all of those problems throw an error, so works ends up being a vector of false.
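
(The gist is essentially the following loop; this is a hypothetical reconstruction, where problems is the vector of SIF problem names:)

using CUTEst

works = map(problems) do name
    try
        nlp = CUTEstModel(name)
        cutest_finalize(nlp)  # finalizer name may differ by CUTEst.jl version
        true
    catch
        false
    end
end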


dpo commented Jun 15, 2016

> A bindeps version would be wonderful, as the whole homebrew thing was a bit, well, annoying :)

Did something go wrong?

> I would love to see what the following does on your installation: https://gist.github.com/pkofod/5dda61027b6829e91c40dccbe6f38f9b . On my computer, all of those problems throw an error, so works ends up being a vector of false.

What is the error? Did you try from the command line?

@abelsiqueira

> A bindeps version would be wonderful,

It's started here. Haven't tested it in a while, but it was working.

> as the whole homebrew thing was a bit, well, annoying :)

If you're on Linux, I agree with you. Notice how the BinDeps setup uses linux-cutest, which is a simpler alternative, although very underdeveloped (no tests, no reinstall, etc.).

@abelsiqueira

> I would love to see what the following does on your installation: https://gist.github.com/pkofod/5dda61027b6829e91c40dccbe6f38f9b . On my computer, all of those problems throw an error, so works ends up being a vector of false.

Tried it, and they all throw an error, but that's because those problems are not in $MASTSIF. Where did you get them?
To get all unconstrained problems, you can use

grep " .U" $MASTSIF/CLASSF.DB | cut -f1 -d" "
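
(A rough Julia translation of that grep, assuming MASTSIF is set in the environment:)

db = joinpath(ENV["MASTSIF"], "CLASSF.DB")
unconstrained = [split(line)[1] for line in eachline(open(db)) if ismatch(r" .U", line)]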


pkofod commented Jun 16, 2016

> Did something go wrong?

No, nothing went wrong, but as a Linux user, Homebrew is not really something I've ever used before. It all worked out (after I found the blog post), but it would of course be preferable if Pkg.add("CUTEst") were all you needed :)

> Tried it, and they all throw an error, but that's because those problems are not on $MASTSIF. Where did you get them?
> To get all unconstrained problems, you can use

Okay, my bad then! The error was probably just that I didn't have them; I thought MASTSIF had all the CUTEst problems. Thanks for the grep.


pkofod commented Jun 18, 2016

So, the revised test set is 173 unconstrained problems with the following dimensions:

nvar   count
2      28
3      14
4      4
5      1
6      3
7      1
8      9
10     3
11     1
12     1
15     1
50     8
63     1
99     1
100    2
200    7
500    1
1000   4
1024   2
2000   1
2550   2
2600   2
2652   1
3000   16
3549   1
4000   2
4900   1
4999   1
5000   36
5010   1
5625   2
10000  11
20000  2
100000 2

Quite a few big problems, and maybe a bit too few in the 10-50 range (I suspect a lot of users are interested in those). I'm running things now, but it takes a while; that's why there has been no update :)


pkofod commented Jul 1, 2016

Just starting to get this under control. I ran BFGS with the 4 different line search algorithms we have. Below are performance profiles for test problems up to (and including) dimension 5000. Seems like HZ (the default) is a pretty solid choice!
[performance profiles: BFGS with the four line searches]
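
The runs boil down to something like the following sketch, assuming the mid-2016 Optim API (method types taking a linesearch! function; f, g!, and x0 stand for a problem's objective, gradient, and starting point):

using Optim

for ls in [Optim.hz_linesearch!, Optim.mt_linesearch!,
           Optim.backtracking_linesearch!, Optim.interpolating_linesearch!]
    res = optimize(f, g!, x0, BFGS(linesearch! = ls),
                   Optim.OptimizationOptions(iterations = 20000))
    println(ls, " => ", res)  # inspect the minimum and f_calls per line search
end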


dpo commented Jul 2, 2016

What are you measuring in this profile? Number of objective evals?

PS: https://github.com/JuliaSmoothOptimizers/Profiles.jl


pkofod commented Jul 2, 2016

No, this is simply for the attained minimum. Cool that you have a package for that in JSO. I wrote it myself, as there was no license on the Matlab file on Moré and Wild's website. Here they are for f_calls:
[performance profiles for f_calls]
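
(For the curious, the profile computation itself is small; a minimal sketch of the Dolan-Moré recipe, not necessarily exactly what I wrote, where T is a problems-by-solvers matrix of measures with failures set to Inf:)

function perf_profile(T, taus)
    ratios = T ./ minimum(T, 2)  # per-problem ratio to the best solver
    np, ns = size(T)
    # fraction of problems each solver handles within a factor tau of the best
    [[count(r -> r <= tau, ratios[:, s]) / np for tau in taus] for s in 1:ns]
end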


dpo commented Jul 2, 2016

That's instructive. The two plots together tell a more complete story. Armijo remains amazingly robust given how simple it is. I'll have to look at the details of your interpolating linesearch.


pkofod commented Jul 2, 2016

I would still say HZ is dominating if we take into consideration the actual minimum they end up reporting. Edit: this was supposed to go just after the image, not as a response to you :)


pkofod commented Jul 2, 2016

I should compare these with L-BFGS. Thanks again for the CUTEst package :) The newest version is even easier to install!


pkofod commented Jul 2, 2016

[performance profiles: CG with different line searches]

This is a surprise... the HZ line search isn't the best for HZ's CG :)
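
(That is, assuming the same mid-2016 API as above, the Moré-Thuente pairing comes out ahead of the HZ line search here, something like:)

res = optimize(f, g!, x0, ConjugateGradient(linesearch! = Optim.mt_linesearch!),
               Optim.OptimizationOptions(iterations = 20000))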


pkofod commented Jul 2, 2016

I simply can't run the largest problems on my laptop, or even the medium-to-large ones with Nelder-Mead, so I am waiting for the sysadmin to fix a symlink issue on the server at the econ department. However, I'm starting to gather some evidence at least.
[performance profile plot]


pkofod commented Jul 2, 2016

I'm just spamming at this point. Any participants are welcome to unsubscribe if they're annoyed :)

[performance profiles comparing solvers]

This is actually quite surprising to me! (I capped tau at 20 for clarity; the picture remains the same with other tau ranges.) Conjugate gradient is doing remarkably badly compared to gradient descent here! I've cherry-picked the best line search algorithm for each "solver", that is, each overall algorithm.

Edit: Hm, I think the between-solver results are wrong, but within each solver they should be correct. Will be back with an update :)

This should be the correct one:
[corrected performance profiles comparing solvers]


Evizero commented Jul 3, 2016

I didn't really follow along here, but just scrolling over the images made me curious. Are these for the Julia implementation in this package? If so, what library did you use for plotting?


pkofod commented Jul 3, 2016

Yes, they are. Plots.jl with the Plotly backend.
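
(Backend selection in Plots.jl is just:)

using Plots
plotly()  # subsequent plot() calls render via Plotly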


timholy commented Jul 3, 2016

Almost surely a bug in CG, perhaps in the convergence criteria? You already noticed one typo, there may be others!

This is what's so wonderful about having a good test suite! 💯


pkofod commented Jul 3, 2016

The dominating gradient descent was definitely an error on my part, but now I can't see the CGs in the plot. I'm in a car; will check later. I agree that the test suite is going to be very helpful. The current test set is 155 functions, and it excludes the largest ones. Not a bad start, I think.


pkofod commented Jul 3, 2016

[performance profiles comparing solvers]
Now CG is actually in the plot, and it is doing much better than gradient descent, which has 4 entries in the bottom 7 - ouch :) BFGS still seems to dominate, and the BFGS+HZ solver seems to be a safe default (as it is).

(still waiting for the sysadmin to make a symlink to gfortran so I can install CUTEst at work...).

Edit: however, MT is still better than HZ for CG...

Edit 2: one line was missing: the L-BFGS + HZ solver!
[updated performance profiles including the L-BFGS + HZ solver]

I'll publish the code for this shortly, so others can find even more bugs... :)


mlubin commented Jul 4, 2016

@pkofod, where is the code you're using to run these experiments?
EDIT: not sure if you mean that you'll publish the performance profile code or all of it together shortly


pkofod commented Jul 4, 2016

> @pkofod, where is the code you're using to run these experiments?
> EDIT: not sure if you mean that you'll publish the performance profile code or all of it together shortly

I just mean that I'll post the code I used to produce these figures. Right now it's only available if you're sitting next to me :) I'll post a gist or something; I just need to organize it a bit better.


pkofod commented Jul 4, 2016

As an example, this code produces the figures posted in #220, which I repeat here:
[Nelder-Mead benchmark performance profiles]

To produce that figure, I first load the module together with Plots and Optim. Then I construct a vector of names and a vector of Optimizer specifications to be benchmarked:

using Plots, OptimBenchmarks, Optim

names = ["NelderMead with fixed parameters and unit step",
         "NelderMead with fixed parameters and relative step (0.05)",
         "NelderMead with fixed parameters and relative step (0.1)",
         "NelderMead with fixed parameters and relative step (0.5)",
         "NelderMead with adaptive parameters and relative step (0.05)",
         "NelderMead with adaptive parameters and relative step (0.1)",
         "NelderMead with adaptive parameters and relative step (0.5)"]

opt_cases =[Optim.NelderMead(parameters = Optim.fixed_parameters,    initial_step = Optim.unit_step),
            Optim.NelderMead(parameters = Optim.fixed_parameters,    initial_step = x->Optim.relative_step(x; s = 0.05)),
            Optim.NelderMead(parameters = Optim.fixed_parameters,    initial_step = x->Optim.relative_step(x; s = 0.1)),
            Optim.NelderMead(parameters = Optim.fixed_parameters,    initial_step = x->Optim.relative_step(x; s = 0.5)),
            Optim.NelderMead(parameters = Optim.adaptive_parameters, initial_step = x->Optim.relative_step(x; s = 0.05)),
            Optim.NelderMead(parameters = Optim.adaptive_parameters, initial_step = x->Optim.relative_step(x; s = 0.1)),
            Optim.NelderMead(parameters = Optim.adaptive_parameters, initial_step = x->Optim.relative_step(x; s = 0.5))]

Then I run the benchmarks. The benchmark function wraps each run in try/catch, as an out-of-memory error or a line search error would otherwise ruin everything. (We should make line searches fail softly, imo.)

out = benchmark(names, opt_cases, Optim.OptimizationOptions(iterations = 20000); max_dim = 1000)

save("nelder_mead_benchmark.csv", out...)
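
(The guard inside benchmark is essentially this hypothetical pattern:)

function guarded_run(f, x0, method, options)
    try
        optimize(f, x0, method, options)
    catch err
        warn("problem failed: $err")  # record the failure, keep going
        nothing
    end
end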

Then I load the data as a dataframe (I just saved it to keep a copy) and plot it:

dff = read_profile_data("nelder_mead_benchmark.csv")

tau = 1:0.5:20
profiles = performance_profile(names, dff, tau)

plot(tau, hcat(profiles...),
          label = hcat(names...),
          ls=[:solid  :dash :dash :dash :dot :dot :dot],
          size =(800,500),
          ylims = (0,1))

The module is at https://gist.github.com/pkofod/82ba6021e73cf9ee2289f35b08a09f2e (some things could be done more cleanly...).

(Note to self: I should postpone the call to aggregate until the performance_profile call.)

@pkofod pkofod mentioned this issue Aug 12, 2016

pkofod commented Jan 3, 2017

This work is now in OptimTests, thanks for the feedback.

@pkofod pkofod closed this as completed Jan 3, 2017