
Regularization #3

Open
rossdiener opened this issue Feb 23, 2018 · 6 comments

Comments

@rossdiener
Contributor

It should be straightforward to implement L2 regularization for linear ordinal regression. Doing the same for L1, and/or writing tests, will be more challenging.

@stevenwu4

stevenwu4 commented Feb 23, 2018

Paper on the implementation of regularizing ordinal models in R: https://arxiv.org/pdf/1706.05003.pdf

@RobeeF

RobeeF commented Jun 2, 2020

Hi,
I am very interested in a LASSO ordinal logistic regression implementation.
Is it on the agenda?
If not, I would be happy to contribute and help implement it!

@rossdiener
Contributor Author

Hi @RobeeF! Welcome to bevel. Right now I'm a bit busy and I don't have any plans to implement regularization. However, I'd love to see this project gain some momentum and highly encourage you to contribute. If you do, I can commit to reviewing the pull requests and addressing questions or issues you might encounter.

Just a thought: rather than LASSO, it's probably easier to implement L2 regularization first. A lot of the code in bevel relies on explicit (hand-calculated) formulas for the derivative of the ordinal regression loss function. The L2 penalty also has a simple derivative, so it would be much easier to add to the existing code.
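
Roughly, the change could look something like this (the function names here are only placeholders, not bevel's actual API, and in practice the threshold parameters would probably be excluded from the penalty):

```python
import numpy as np

def add_l2_penalty(neg_log_likelihood, gradient, alpha):
    """Wrap existing hand-coded objective/gradient functions with an L2 penalty.

    `neg_log_likelihood(beta, X, y)` and `gradient(beta, X, y)` stand in for
    bevel's analytic formulas; `alpha` is the regularization strength.
    """
    def penalized_objective(beta, X, y):
        # L2 penalty: (alpha / 2) * ||beta||^2 added to the loss.
        return neg_log_likelihood(beta, X, y) + 0.5 * alpha * np.dot(beta, beta)

    def penalized_gradient(beta, X, y):
        # The penalty's derivative is just alpha * beta, so it drops straight
        # into the existing hand-calculated gradient.
        return gradient(beta, X, y) + alpha * beta

    return penalized_objective, penalized_gradient
```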

@RobeeF

RobeeF commented Jun 3, 2020

Hi @rossdiener,
I get the point of starting with L2 regularization.
For computing gradients, have you tried relying on automatic differentiation tools rather than hand-calculated gradients?
I can see that you are using numdifftools to compute the Jacobian, but not to compute the gradient of the log-likelihood.
Is there a reason for this?

@rossdiener
Contributor Author

Hey @RobeeF - Seems like you're pretty familiar with the codebase. That's awesome.

The reason we didn't use automatic differentiation tools is that there is a general formula for the derivative of the log-likelihood for linear ordinal regression. We might as well use the formula, since it will always compute faster than a numerical tool and is immune to numerical instability.

A formula for the second derivative also exists, and in an ideal world we would calculate it by hand and implement it explicitly. However, it's a pain in the ass to do by hand because the formulas are messy and there are a lot of them (a whole Hessian matrix). So we numerically differentiate the explicit formula for the first derivative using numdifftools to get the second derivative.
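
In other words, the pattern is roughly this (with a toy least-squares loss standing in for the ordinal log-likelihood, since the real formulas are long):

```python
import numpy as np
import numdifftools as nd

def gradient(beta, X, y):
    # Hand-coded gradient of a toy least-squares loss; in bevel this is the
    # closed-form gradient of the ordinal regression log-likelihood.
    return X.T @ (X @ beta - y)

def hessian(beta, X, y):
    # The Jacobian of the gradient is the Hessian of the loss, so we only
    # differentiate numerically once, on top of an exact first derivative.
    return nd.Jacobian(lambda b: gradient(b, X, y))(beta)

X = np.random.randn(50, 3)
y = np.random.randn(50)
H = hessian(np.zeros(3), X, y)  # numerically close to X.T @ X
```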

@RobeeF

RobeeF commented Jun 4, 2020

Hi @rossdiener,
In my experience, computations with autodiff tools can be faster than the handwritten gradient (even when that gradient is explicit).

Maybe I will give it a shot and send you a running-time comparison. It could also speed up the development of new code!
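
For example, something along these lines with JAX (a toy loss in place of bevel's actual ordinal log-likelihood, just to show the shape of the comparison):

```python
import jax
import jax.numpy as jnp

def loss(beta, X, y):
    # Toy least-squares loss; the real benchmark would use bevel's
    # ordinal regression negative log-likelihood.
    return jnp.sum((X @ beta - y) ** 2)

gradient = jax.grad(loss)    # would replace the hand-coded gradient
hessian = jax.hessian(loss)  # would replace the numdifftools step

X = jnp.ones((50, 3))
y = jnp.ones(50)
g = gradient(jnp.zeros(3), X, y)
H = hessian(jnp.zeros(3), X, y)
```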

Have a nice day
