API: Minimizer #1

jonas-eschle · 2020-11-28T17:32:19Z

Discussion for an optimal minimizer interface to be used for likelihood (-like) minimization tasks.

Questions

Statefulness

Should the minimizer be stateful, stateless, or both (e.g. stateful only within a context)? A stateful minimizer offers a simple way to control the minimization at a fine-grained level, which otherwise has to be done via arguments to the minimize function. However, it requires creating a new minimizer for every minimization, and the minimization procedure cannot simply be replicated (whereas with a stateless minimizer, it is always a simple call to minimize).

API

`minimize(loss, params)`

Minimize the loss fully using the given params. Returns a FitResult.

`minimize_step`

Perform a single minimization step. Implementing this method is optional.
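To make the relation between the two methods concrete, here is a minimal sketch in which `minimize` is built on top of an optional `minimize_step`. The class names `StepwiseMinimizer` and `GradientDescent`, the constructor arguments, and the stopping rule are assumptions for illustration, not an existing zfit API. For brevity the sketch returns plain parameter values rather than a FitResult.

```python
from typing import Callable, List, Sequence

Loss = Callable[[Sequence[float]], float]


class StepwiseMinimizer:
    """Hypothetical base class; names and defaults are illustrative assumptions."""

    def __init__(self, tol: float = 1e-8, max_steps: int = 10_000):
        self.tol = tol
        self.max_steps = max_steps

    def minimize_step(self, loss: Loss, params: Sequence[float]) -> List[float]:
        """Perform one step and return the updated values (optional to implement)."""
        raise NotImplementedError("this minimizer has no step-wise mode")

    def minimize(self, loss: Loss, params: Sequence[float]) -> List[float]:
        """Minimize fully by repeating minimize_step until the loss stops changing."""
        values = list(params)
        current = loss(values)
        for _ in range(self.max_steps):
            candidate = self.minimize_step(loss, values)
            new = loss(candidate)
            converged = abs(current - new) < self.tol
            values, current = candidate, new
            if converged:
                break
        return values


class GradientDescent(StepwiseMinimizer):
    """Toy minimizer using a forward finite-difference gradient."""

    def __init__(self, learning_rate: float = 0.1, eps: float = 1e-6, **kwargs):
        super().__init__(**kwargs)
        self.learning_rate = learning_rate
        self.eps = eps

    def minimize_step(self, loss: Loss, params: Sequence[float]) -> List[float]:
        base = loss(params)
        gradient = []
        for i in range(len(params)):
            shifted = list(params)
            shifted[i] += self.eps
            gradient.append((loss(shifted) - base) / self.eps)
        return [p - self.learning_rate * g for p, g in zip(params, gradient)]


def loss(values: Sequence[float]) -> float:
    x, y = values
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


best = GradientDescent(learning_rate=0.1).minimize(loss, [0.0, 0.0])
print(best)  # close to [1.0, -2.0]
```

A minimizer that cannot work step by step would override `minimize` directly and leave `minimize_step` unimplemented.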

Returned result

What is the best result to return? A FitResult, but maybe a minimal API is enough?
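As a starting point for that question, a minimal result could be little more than an immutable record. The sketch below is only an illustration: the class name `MinimalFitResult` and all of its fields are assumptions, not the zfit FitResult.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class MinimalFitResult:
    """Hypothetical minimal result; all field names are assumptions."""

    params: Dict[str, float]  # parameter name -> value at the minimum
    fmin: float               # loss value at the minimum
    converged: bool           # whether the stopping criterion was met
    info: Dict[str, Any] = field(default_factory=dict)  # minimizer-specific extras


result = MinimalFitResult(
    params={"mu": 1.2, "sigma": 0.3},
    fmin=42.0,
    converged=True,
    info={"n_eval": 57},
)
print(result.params["mu"], result.converged)
```

Anything specific to one minimizer goes into `info`, so the common attributes stay the same across minimizers.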


HDembinski · 2020-12-09T14:34:40Z

I have a fundamental question: I am not sure what this should accomplish? I don't think zfit has the influence to establish an interface that is then adopted by independent much larger projects such as libnlopt or scipy.optimize. For iminuit, I already provide a standard interface (scipy.optimize.minimize). If you want to establish standards, I think you need to do that from within the scipy community. They already put a lot of thought into the scipy.optimize.minimize interface, so perhaps any proposal should build on that interface?

There is more to consider than what you write here. What about setting limits and constraints?

Regarding "statefulness or not": stateless functions are easier to reason about and they can be trivially parallelized. I would generally prefer that, but the stateful approach of Minuit also has its merits. This design does not allow me to start the minimization with extra information that I have already acquired in previous calls (I cannot pass a previously calculated Hessian matrix, for example, and there is no way for the minimizer to know that I am already close to the minimum). Minuit was designed to pass a maximum of internal information from one method to another. This idea is deeply ingrained into Minuit, but not in scipy.optimize. That's ok, because scipy.optimize is only about minimization, while Minuit also computes errors. It does extra things and so another design makes sense.

jonas-eschle · 2020-12-09T17:04:37Z

I overall agree with your points. Thanks a lot for the inputs! Let's be more specific:

The scope: It's true that zfit won't have that influence, and that is not the goal. The goal is rather to converge on a standard within HEP before many people build their own optimizer interfaces (as they unfortunately do...). Better to get a good standard within the niche of HEP than 10+ all over the place.
While scipy's interface is nice, it lacks statefulness (if needed) and takes a purely functional approach; the interface adopted by e.g. PyTorch's or TensorFlow's optimizers, where you first create a minimizer (with its config) and then call a minimize function (as in iminuit), seems a better API.

The question of the API does not stand alone; it also asks for good integration with parameters and objective functions as well as with the returned fit result. Here are the main reasons why something "new" is needed:

- iminuit also calculates the errors. However, in a more independent workflow such as the one proposed with the zfit interface, these things should be separated. Hence, an ideal interface does not need any error calculation, which raises the question of whether statefulness is needed at all. It can still be useful, e.g. to do multiple steps, but in practice I think a single minimize function may be fine.
- scipy, as mentioned, has an "overloaded" (IMHO) functional API. It is good, but I think the iminuit, TF etc. approach of creating an instance is nicer. It also allows a minimizer to be instantiated in whatever way it needs (e.g. with whatever parameters) while keeping a common minimize interface.
- TF and PyTorch have nice APIs; however, their goal is not to find the global minimum, and they have a step-based minimization method.

So the idea is: what is the optimal interface for "complicated" minimizations as used in HEP? If we can converge, great! If not, well, then not. But we should at least try and collect the inputs. The more we converge (even on names etc.), the better. And additionally adapting to a standard interface such as scipy's is of course always a possibility.

About limits and constraints: that's indeed a good point, and passing these alongside the loss to the minimize method may be a good fit?

As said, this isn't about having the perfect API, but about having had this discussion at least once and seeing what we can actually share.

HDembinski · 2020-12-10T10:37:02Z

I still don't see how you can converge to "a standard in HEP". Do you think you can change the scipy interface? Or libnlopt?

I think the best you can do is write an isolated wrapper library which does the adjustment and then share that between projects.

> While scipy's interface is nice, it lacks statefulness (if needed) and takes a purely functional approach; the interface adopted by e.g. PyTorch's or TensorFlow's optimizers, where you first create a minimizer (with its config) and then call a minimize function (as in iminuit), seems a better API.

A stateless function can have the same functionality as a stateful function, it is not a drawback in principle. It is a matter of initialization. The stateful object passes information implicitly. A stateless function must pass information explicitly (that's also why they are easier to reason about, everything is transparent and open). You can always go for stateless, if you make the initialization rich enough to pass all required info.
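A minimal sketch of what "rich enough initialization" could look like: a stateless one-dimensional Newton-type minimizer that accepts a previously computed Hessian as an explicit argument and returns it again with the result. All names here (`State`, `minimize`, the `hessian` keyword) are hypothetical and chosen for illustration only.

```python
from typing import Callable, NamedTuple, Optional


class State(NamedTuple):
    x: float               # position of the minimum
    hessian: float         # second derivative used for the Newton steps
    n_hessian_evals: int   # how often the Hessian had to be computed


def first_derivative(loss: Callable[[float], float], x: float, eps: float = 1e-6) -> float:
    return (loss(x + eps) - loss(x - eps)) / (2.0 * eps)


def second_derivative(loss: Callable[[float], float], x: float, eps: float = 1e-4) -> float:
    return (loss(x + eps) - 2.0 * loss(x) + loss(x - eps)) / eps**2


def minimize(
    loss: Callable[[float], float],
    x0: float,
    hessian: Optional[float] = None,
    tol: float = 1e-6,
    max_steps: int = 100,
) -> State:
    """Stateless minimization: everything it knows comes in through the arguments."""
    n_evals = 0
    if hessian is None:
        # No prior information passed in, so compute the Hessian once.
        hessian = second_derivative(loss, x0)
        n_evals += 1
    x = x0
    for _ in range(max_steps):
        gradient = first_derivative(loss, x)
        if abs(gradient) < tol:
            break
        x -= gradient / hessian
    return State(x, hessian, n_evals)


def loss(x: float) -> float:
    return (x - 3.0) ** 2 + 1.0


first = minimize(loss, x0=0.0)
# Warm start: the earlier result is passed back in explicitly.
second = minimize(loss, x0=first.x, hessian=first.hessian)
print(first.n_hessian_evals, second.n_hessian_evals)
```

The information a stateful object would carry implicitly travels through the return value and the `hessian` argument instead.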

> iminuit also calculates the errors. However, in a more independent workflow such as the one proposed with the zfit interface, these things should be separated. Hence, an ideal interface does not need any error calculation, which raises the question of whether statefulness is needed at all.

Computing errors as part of the minimization is one of the synergies achieved in Minuit. This is a good thing. Minuit has to compute an accurate Hessian anyway for its stopping criterion. It then also uses the output of that calculation to give you the errors. When running Minos, it also uses the previously computed Hessian matrix to find the intervals faster. Everything is deeply connected in Minuit and synergies are achieved this way.

All this makes sense when you actually care about computing the errors. You cannot easily separate error computation from minimization, because there are synergies between the two steps. You have to think about these two steps together when you come up with an API. I am not saying that the API must be that of Minuit, but I strongly disagree with your statement that we can define a minimizer API in isolation and then worry about a separate error computation API. These two things need to be thought of together or you will regret it later.

> scipy, as mentioned, has an "overloaded" (IMHO) functional API.

Scipy is complex because it offers a lot of knobs and tweaks and a lot of functionality. It is not overloaded IMHO. Try to do the same as scipy does with less API and you will see. I don't think you can simplify the API a lot without also cutting away functionality.

> TF and PyTorch have nice APIs, however, their goal is not to find the global minimum and they have a step-based minimization method.

I don't know these, but they probably do less. Do they allow for linear and non-linear constraints, for example?

jonas-eschle · 2020-12-10T18:17:30Z

> I still don't see how you can converge to "a standard in HEP". Do you think you can change the scipy interface? Or libnlopt?

No, not at all. But that is not in our interest anyway, as scipy and nlopt are not HEP-focused libraries. When talking about a standard, I don't mean these large projects. I mean the dozens of projects that do something along the lines of minimization and invent their own optimizer (e.g. flamedisx, tensorwave etc.). The idea is to provide this standard so that

- optimizers are interchangeable and need to be wrapped only once
- they are also compatible with other components (it's the same for the cost function)

> I think the best you can do is write an isolated wrapper library which does the adjustment and then share that between projects.

I fully agree! This is the goal with zfit and why we care so much about the API: because we want to distribute a "possible standard" and that API should be nice. Now one thing is to have wrappers, another is a certain convergence in small things, e.g. on keywords used in iminuit etc.

The less we need to wrap, the better. And the whole discussion is about that: what's the most useful API for our use cases? (Maybe the answer is scipy or nlopt, and that's great! I am very happy about anyone who sticks to these interfaces instead of reinventing the wheel, or who offers such an interface, as iminuit does.)

Now of course, iminuit is different in its scope compared to scipy and nlopt, so a different interface makes sense.

> A stateless function can have the same functionality as a stateful function, it is not a drawback in principle. It is a matter of initialization. The stateful object passes information implicitly. A stateless function must pass information explicitly (that's also why they are easier to reason about, everything is transparent and open). You can always go for stateless, if you make the initialization rich enough to pass all required info.

While true in theory, it requires introducing some kind of function that actually executes the minimization. Not impossible to do, sure.

> Computing errors as part of the minimization is one of the synergies achieved in Minuit. This is a good thing. Minuit has to compute an accurate Hessian anyway for its stopping criterion. It then also uses the output of that calculation to give you the errors. When running Minos, it also uses the previously computed Hessian matrix to find the intervals faster. Everything is deeply connected in Minuit and synergies are achieved this way.
>
> All this makes sense when you actually care about computing the errors. You cannot easily separate error computation from minimization, because there are synergies between the two steps.

These are two different things: that you cannot separate them vs. that there are synergies. I think you can separate them quite well, but you may lose synergies. The question is then simply what is actually needed. And in this case I think it's "just" the Hessian estimate, right?
We've actually implemented this in zfit.

The advantage in the big picture is that this allows using other minimizers such as scipy, which then do not need to implement error methods (if we separate the two). Otherwise, every minimizer would need to implement its own error method, which would be more or less identical each time and a large hurdle to adoption.
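As a rough illustration of such a separation, the sketch below computes a Hesse-type error in a standalone function that only needs the loss and the position of the minimum, so the minimum could come from any minimizer. It assumes a one-parameter negative log-likelihood, for which the variance estimate is the inverse of the second derivative at the minimum. The function name `hesse_error` and the toy data are made up for this example and are not the zfit implementation.

```python
import math
from typing import Callable


def hesse_error(nll: Callable[[float], float], minimum: float, eps: float = 1e-3) -> float:
    """Estimate the 1-sigma error from the curvature of the NLL at its minimum."""
    second = (nll(minimum + eps) - 2.0 * nll(minimum) + nll(minimum - eps)) / eps**2
    return 1.0 / math.sqrt(second)


# Toy data: Gaussian measurements with known width, unknown mean.
data = [4.8, 5.1, 5.3, 4.9, 5.4]
sigma = 0.5


def nll(mu: float) -> float:
    return sum((x - mu) ** 2 for x in data) / (2.0 * sigma**2)


# Stand-in for the output of any minimizer (here the analytic minimum).
mu_hat = sum(data) / len(data)

error = hesse_error(nll, mu_hat)
print(error, sigma / math.sqrt(len(data)))  # numerical vs. analytic error
```

The price of the separation is visible here: the second derivative is recomputed from scratch instead of being reused from the minimization.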

> Scipy is complex because it offers a lot of knobs and tweaks and a lot of functionality. It is not overloaded IMHO. Try to do the same as scipy does with less API and you will see. I don't think you can simplify the API a lot without also cutting away functionality.

> I don't know these, but they probably do less. Do they allow for linear and non-linear constraints, for example?

They do less, however I meant the principle that you do:

    minimizer = MyMinimizerMethod(minimizer_specific_arg1, minimizer_specific_arg2, ...)
    minimizer.minimize(cost, constr)

so that you move the type of the minimizer and the options specific to the minimizer into the instantiation (which can then look very different for different minimizers), and you're left with a more general and lean minimize method. I think it's a convenient split.
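A small runnable sketch of this split: two toy one-dimensional minimizers whose constructors take entirely different options but which share the same `minimize` call. The classes `GridScan` and `GoldenSection` and the `limits` argument are hypothetical and only illustrate the pattern.

```python
import math
from typing import Callable, Tuple


class GridScan:
    """Evaluates the cost on a regular grid; configured by the number of points."""

    def __init__(self, n_points: int = 1001):
        self.n_points = n_points

    def minimize(self, cost: Callable[[float], float], limits: Tuple[float, float]) -> float:
        lower, upper = limits
        step = (upper - lower) / (self.n_points - 1)
        candidates = [lower + i * step for i in range(self.n_points)]
        return min(candidates, key=cost)


class GoldenSection:
    """Golden-section search for a unimodal cost; configured by a tolerance."""

    def __init__(self, tol: float = 1e-8):
        self.tol = tol

    def minimize(self, cost: Callable[[float], float], limits: Tuple[float, float]) -> float:
        a, b = limits
        inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
        while b - a > self.tol:
            c = b - inv_phi * (b - a)
            d = a + inv_phi * (b - a)
            if cost(c) < cost(d):
                b = d  # the minimum lies in [a, d]
            else:
                a = c  # the minimum lies in [c, b]
        return (a + b) / 2.0


def cost(x: float) -> float:
    return (x - 2.0) ** 2


# Different configuration at instantiation, identical minimize call afterwards.
results = {}
for minimizer in (GridScan(n_points=2001), GoldenSection(tol=1e-10)):
    results[type(minimizer).__name__] = minimizer.minimize(cost, limits=(-5.0, 5.0))
print(results)
```

Code that only ever calls `minimize(cost, limits)` does not need to know which of the two it was given.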

Just to get the overall picture right: what we try to achieve with zfit is to have well-defined interfaces, not just for a minimizer, but also for a fit result (including error calculation), a cost function, and a pdf. Not every detail, but at least many attributes, in order to converge as much as possible and avoid everyone having their own wrapper of scipy, nlopt etc.

This would also apply to the cost functions in iminuit; it would be great if we could converge to a nice API there as well, making things more exchangeable without the need to write a ton of wrappers. If we don't converge, all fine. But many things seem to me to be just conventions that make no actual difference (compare e.g. the attributes of the Params object in iminuit: they are mostly identical to those of zfit.Parameter, by coincidence). So the goal is to converge as much as possible and provide a self-consistent API that allows for a neat workflow and for exchanging parts.

A problem that currently arises, for example, is the fit output, which contains this Params class: either we wrap it to have it consistent across minimizers, or we use it to also fill in the results of other minimizers. Then, however, you have implicitly created a "standard"; this is what we try to avoid, and why we want to discuss as much as possible to converge as well as possible.

P.S.: we're very grateful for the comments, and I fully agree that e.g. iminuit will have its own API (and the 2.0 break was great!), especially with the uncertainties. But this is a discussion about a more general case, and your expertise and opinion are very welcome.

jonas-eschle added the API discussion Definition of API and discussion label Nov 28, 2020

jonas-eschle mentioned this issue Dec 7, 2020

Participate in iminuit v2.0 beta? zfit/zfit#278

Closed
