
Regularization #3

Open
rossdiener opened this issue Feb 23, 2018 · 6 comments

Comments

@rossdiener
Contributor

It should be straightforward to implement L2 regularization for linear ordinal regression. Doing the same for L1, and/or writing tests, will be more challenging.

@stevenwu4

stevenwu4 commented Feb 23, 2018

Paper on the implementation of regularizing ordinal models in R: https://arxiv.org/pdf/1706.05003.pdf

@RobeeF

RobeeF commented Jun 2, 2020

Hi,
I am very interested in a LASSO ordinal logistic regression implementation.
Is it on the agenda?
If not, I would be happy to contribute and help implement it!

@rossdiener
Contributor Author

Hi @RobeeF! Welcome to bevel. Right now I'm a bit busy and I don't have any plans to implement regularization. However, I'd love to see this project gain some momentum and highly encourage you to contribute. If you do, I can commit to reviewing the pull requests and addressing questions or issues you might encounter.

Just a thought: rather than LASSO, it's probably easier to implement L2 regularization first. A lot of the code in bevel relies on explicit (hand-calculated) formulas for the derivative of the ordinal regression loss function. The L2 penalty also has a simple derivative, so it would be much easier to add to the existing code.
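
Roughly, the change could look something like this (the function names here are only placeholders, not bevel's actual API, and in practice the threshold parameters would probably be excluded from the penalty):

```python
import numpy as np

def add_l2_penalty(neg_log_likelihood, gradient, alpha):
    """Wrap existing hand-coded objective/gradient functions with an L2 penalty.

    `neg_log_likelihood(beta, X, y)` and `gradient(beta, X, y)` stand in for
    bevel's analytic formulas; `alpha` is the regularization strength.
    """
    def penalized_objective(beta, X, y):
        # L2 penalty: (alpha / 2) * ||beta||^2 added to the loss.
        return neg_log_likelihood(beta, X, y) + 0.5 * alpha * np.dot(beta, beta)

    def penalized_gradient(beta, X, y):
        # The penalty's derivative is just alpha * beta, so it drops straight
        # into the existing hand-calculated gradient.
        return gradient(beta, X, y) + alpha * beta

    return penalized_objective, penalized_gradient
```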

@RobeeF

RobeeF commented Jun 3, 2020

Hi @rossdiener,
I get the point of starting with L2 regularization.
For computing gradients, have you tried relying on automatic differentiation tools rather than hand-calculated gradients?
I can see that you are using numdifftools to compute the Jacobian, but not to compute the gradient of the log-likelihood.
Is there a reason for this?

@rossdiener
Contributor Author

Hey @RobeeF - Seems like you're pretty familiar with the codebase. That's awesome.

The reason we didn't use automatic differentiation tools is that there is a general formula for the derivative of the log-likelihood for linear ordinal regression. We might as well use the formula, since it will always compute faster than a numerical tool and is immune to numerical instability.

A formula for the second derivative also exists, and in an ideal world we would calculate it by hand and implement it explicitly. However, it's a pain in the ass to do by hand because the formulas are messy and there are a lot of them (a whole Hessian matrix). So we numerically differentiate the explicit formula for the first derivative using numdifftools to get the second derivative.
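
In other words, the pattern is roughly this (with a toy least-squares loss standing in for the ordinal log-likelihood, since the real formulas are long):

```python
import numpy as np
import numdifftools as nd

def gradient(beta, X, y):
    # Hand-coded gradient of a toy least-squares loss; in bevel this is the
    # closed-form gradient of the ordinal regression log-likelihood.
    return X.T @ (X @ beta - y)

def hessian(beta, X, y):
    # The Jacobian of the gradient is the Hessian of the loss, so we only
    # differentiate numerically once, on top of an exact first derivative.
    return nd.Jacobian(lambda b: gradient(b, X, y))(beta)

X = np.random.randn(50, 3)
y = np.random.randn(50)
H = hessian(np.zeros(3), X, y)  # numerically close to X.T @ X
```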

@RobeeF

RobeeF commented Jun 4, 2020

Hi @rossdiener,
In my experience, computations with autodiff tools can be faster than the handwritten gradient (even when that gradient is explicit).

Maybe I will give it a shot and send you a running-time comparison. It could also speed up the development of new code!
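
For example, something along these lines with JAX (a toy loss in place of bevel's actual ordinal log-likelihood, just to show the shape of the comparison):

```python
import jax
import jax.numpy as jnp

def loss(beta, X, y):
    # Toy least-squares loss; the real benchmark would use bevel's
    # ordinal regression negative log-likelihood.
    return jnp.sum((X @ beta - y) ** 2)

gradient = jax.grad(loss)    # would replace the hand-coded gradient
hessian = jax.hessian(loss)  # would replace the numdifftools step

X = jnp.ones((50, 3))
y = jnp.ones(50)
g = gradient(jnp.zeros(3), X, y)
H = hessian(jnp.zeros(3), X, y)
```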

Have a nice day
