Adding Multi-Task ElasticNet support #194

PABannier · 2022-01-19T19:12:32Z

The goal of this PR is to add multi-task ElasticNet to the elasticnet crate.

A quick roadmap:

Write block coordinate descent
Write dual gap for multi-task
Write tests for BCD
Make CD variable names consistent with variable names in BCD
Use a for loop when updating the residuals in CD
Z-score + confidence level + variance
Adapt Linfa ElasticNet API to the multi-task case
Write tests for ElasticNet multi-task

PABannier · 2022-01-19T21:11:59Z

@YuhanLiin i'm implementing the Fit trait for the multi-task ENET. However, I don't know how to deal with Linfa API to handle the multi-task case. I created a MultiTaskElasticNet struct, but since both ElasticNet and MultiTaskElasticNet have the same set of parameters, I didn't create a MultiTaskElasticNetValidParams.

My question is how to restrict the trait bounds to implement the Fit trait for multi-task dataset?

algorithms/linfa-elasticnet/src/algorithm.rs

src/dataset/mod.rs

algorithms/linfa-elasticnet/src/algorithm.rs

algorithms/linfa-elasticnet/src/lib.rs

algorithms/linfa-elasticnet/src/algorithm.rs

YuhanLiin · 2022-01-23T01:04:12Z

Is it possible to make functions like coordinate_descent, duality_gap, variance_params, and compute_intercept generic across 1D and 2D arrays? That way we won't need 2 sets of similar helper functions to handle the single-task and multi-task cases.

PABannier · 2022-01-23T18:08:21Z

@YuhanLiin Thanks for all the remarks! I passed your comments on the latest commit.

As for making coordinate_descent generic across 1D and 2D arrays, I don't think it is a good idea. coordinate_descent and block_coordinate_descent rely on two different proximal operators and there are more for loops in block_coordinate_descent specifically designed for the multi-task case. IMHO, a wiser choice would be to let coordinate_descent and block_coordinate_descent separate and use them as backbones for different penalties in the single task or multi task case. My intuition is that in future PR we could make it such that these two functions are generic over the regularization used in a model (for now it only supports L1 + L2), but there are more complex penalties that are very useful as well (non-convex ones for instance, see SCAD or MCP).

duality_gap is also very specific in the single task or multi-task case and introducing a match syntax would make things more confusing IMHO.

For variance_params and compute_intercept I can certainly make it generic over 1D or 2D arrays.

YuhanLiin · 2022-01-31T01:59:35Z

algorithms/linfa-elasticnet/src/algorithm.rs

+                for i in 0..x.shape()[0] {
+                    for t in 0..n_tasks {
+                        r[[i, t]] += x_j[i] * old_w_j[t];
+                    }
+                }


Isn't this operation equivalent to r += x_j.dot(old_w_j.t())? If so you can replace these types of for loops with general_mat_mul(1, x_j, old_w_j.t(), 1, r).

This operation is equivalent to np.outer in Python (see: https://numpy.org/doc/stable/reference/generated/numpy.outer.html for a more detailed explanation). I don't think there is a built-in equivalent in ndarray. It is not yet available in the ndarray crate (see: rust-ndarray/ndarray#1148) .

outer(x, w) is equivalent to x as a column vector multiplied with w as a row vector. The code in the ndarray PR you linked does the same thing. Using general_mat_mul allows you to add the matrix product of x and w to r in one operation. You do need to convert x and y into 2D arrays though, like in the PR.

Ideal way to convert 1D arrays into 2D is insert_axis. Something like x.view().insert_axis(Axis(0_or_1))

algorithms/linfa-elasticnet/src/algorithm.rs

YuhanLiin · 2022-01-31T02:11:56Z

Making variance_params and compute_intercept generic would be great.

YuhanLiin · 2022-02-20T06:02:53Z

algorithms/linfa-elasticnet/src/algorithm.rs

+    let norm_cols_x = x.map_axis(Axis(0), |col| col.dot(&col));
+    let mut gap = F::one() + tol;
+    let d_w_tol = tol;
+    let tol = tol * y.fold(F::zero(), |sum, &y_ij| sum + y_ij.powi(2));


Instead of fold you can do y.iter().map(|x| x*x).sum(). Since iter is called before map it won't create a new array. For 1D arrays it's even simpler since you can just call y.dot(&y) to dot product y with itself.

You should also apply this change to all the other similar fold calls

algorithms/linfa-elasticnet/src/algorithm.rs

PABannier · 2022-02-20T12:54:26Z

algorithms/linfa-elasticnet/src/algorithm.rs

+    gap
+}
+
+fn variance_params<F: Float + Lapack, T: AsTargets<Elem = F>, D: Data<Elem = F>, I: Dimension>(


This AsTargets<Elem = F> is very complex to deal with. I can't do much with it, since I need to have an ArrayBase in order to call ndim(), shape() and to make the computation target - y_est. Do you know how I can circumvent this issue? I don't understand the need for an AsTarget trait in the first place. At least it should support multi-task targets.

Actually it should have some way to retrieve the dimension of the targets.

Currently AsTargets has the method as_multi_targets, which returns a 2D array view, so you can call it to retrieve the target for both cases. For the single target case this returns an array of dimension (n, 1). This means your code needs to treat single-task and multi-task-with-only-one-task as equivalent cases. y_est will need to be a 2D array in all cases; for single-task just insert Axis(1) into y_est.

After your other PR, this should be bounded with AsMultiTargets (now that I think about it, we need AsMultiTargets as a super-trait of AsSingleTarget for this to work).

algorithms/linfa-elasticnet/src/algorithm.rs

YuhanLiin · 2022-03-03T01:14:13Z

algorithms/linfa-elasticnet/src/algorithm.rs

@@ -429,7 +426,7 @@ fn duality_gap<'a, F: Float>(
    } else {
        (F::one(), r_norm2)
    };
-    let l1_norm = w.fold(F::zero(), |sum, w_i| sum + w_i.abs());
+    let l1_norm = w.map(|w_i| w_i.abs()).sum();


YuhanLiin · 2022-03-03T01:14:52Z

algorithms/linfa-elasticnet/src/algorithm.rs

-    let w_norm2 = w.fold(F::zero(), |sum, &wij| sum + wij.powi(2));
+    let r_norm2 = r.map(|rij| rij.powi(2)).sum();
+    let w_norm2 = w.map(|wij| wij.powi(2)).sum();


Call iter() before map to prevent creating a new array

algorithms/linfa-elasticnet/src/algorithm.rs

…elastic_net

PABannier · 2022-03-20T13:35:58Z

Since #206 has been merged, ElasticNet is now easier to adapt to the multi-task case. I'm still working on it.
Roadmap before merging:

Make variance_params generic for single task and multi-task
Make compute_intercept generic for single task and multi-task
Write tests for MultiTaskElasticNet (Lasso + Ridge...)

YuhanLiin · 2022-08-14T21:12:21Z

Work continued in #238

PABannier added 8 commits January 19, 2022 13:55

added block coordinate descent function

81d0f1e

added duality_gap_mtl computation

7e8408e

ENH cd pass to be consistent with bcd

1b69b8e

added prox operator for MTL Enet

e5b76e7

added helper functions for tests

196a2a5

fix failing CD tests

dc91b62

working ent mtl penalties

3bb5333

bcd lower objective test pass

f9f9959

PABannier marked this pull request as draft January 19, 2022 19:14

PABannier added 3 commits January 19, 2022 20:28

added MultiTaskEnet struct

e23d876

added MTENET documentation

f316c95

added API MTENET

1ae6522

PABannier mentioned this pull request Jan 19, 2022

MultiTaskLasso, GroupLasso and Lasso-like models #191

Closed

added variance, z-score, conf interval for multitask ENET

f13a150

PABannier commented Jan 19, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

PABannier added 2 commits January 22, 2022 15:20

added multi-task estimators

b347963

added tests for MTL

80d9a02

YuhanLiin reviewed Jan 22, 2022

View reviewed changes

src/dataset/mod.rs Outdated Show resolved Hide resolved

YuhanLiin reviewed Jan 22, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

YuhanLiin reviewed Jan 22, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Show resolved Hide resolved

YuhanLiin reviewed Jan 22, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Show resolved Hide resolved

YuhanLiin reviewed Jan 22, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

YuhanLiin reviewed Jan 22, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

YuhanLiin reviewed Jan 23, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/lib.rs Show resolved Hide resolved

YuhanLiin reviewed Jan 23, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

PABannier added 2 commits January 23, 2022 19:08

pass comments

3d981ef

CLN files

36f4b70

YuhanLiin reviewed Jan 31, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

YuhanLiin reviewed Jan 31, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

YuhanLiin reviewed Jan 31, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

PABannier added 4 commits February 19, 2022 19:45

changed map into fold

a21bc1a

added tests for Enet and MTL

7f32afc

added incorrect target shape

6c56e06

WIP: made variance params generic over the number of tasks

a07fb39

YuhanLiin reviewed Feb 20, 2022

View reviewed changes

added z_score and confidence_95th for MTL

75023d9

PABannier commented Feb 20, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

PABannier commented Feb 20, 2022

View reviewed changes

PABannier mentioned this pull request Feb 20, 2022

Split AsTarget and AsTargetMut into 1D and 2D traits #195

Closed

PABannier added 3 commits February 20, 2022 14:35

map instead of fold

ba3a574

fix confidence interval and z-score

114fc03

converted back fold to map

30744c0

YuhanLiin reviewed Mar 3, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

YuhanLiin reviewed Mar 3, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

YuhanLiin reviewed Mar 3, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

YuhanLiin reviewed Mar 3, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

YuhanLiin reviewed Mar 3, 2022

View reviewed changes

algorithms/linfa-elasticnet/src/algorithm.rs Outdated Show resolved Hide resolved

PABannier added 3 commits March 3, 2022 19:57

pass comments

3b5f37d

Merge branch 'master' of https://github.com/PABannier/linfa into mtl_…

c7de3e2

…elastic_net

WIP make compute_variance generic over the dimension

bc1ad06

YuhanLiin mentioned this pull request Aug 14, 2022

Adding Multi-Task ElasticNet support #238

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding Multi-Task ElasticNet support #194

Adding Multi-Task ElasticNet support #194

PABannier commented Jan 19, 2022 •

edited

Loading

PABannier commented Jan 19, 2022

YuhanLiin commented Jan 23, 2022

PABannier commented Jan 23, 2022

YuhanLiin Jan 31, 2022

PABannier Feb 19, 2022

YuhanLiin Feb 20, 2022

YuhanLiin Feb 20, 2022

YuhanLiin commented Jan 31, 2022

YuhanLiin Feb 20, 2022

YuhanLiin Feb 20, 2022

PABannier Feb 20, 2022 •

edited

Loading

PABannier Feb 20, 2022

YuhanLiin Feb 20, 2022

YuhanLiin Mar 3, 2022

YuhanLiin Mar 3, 2022

PABannier commented Mar 20, 2022

YuhanLiin commented Aug 14, 2022

Adding Multi-Task ElasticNet support #194

Are you sure you want to change the base?

Adding Multi-Task ElasticNet support #194

Conversation

PABannier commented Jan 19, 2022 • edited Loading

PABannier commented Jan 19, 2022

YuhanLiin commented Jan 23, 2022

PABannier commented Jan 23, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

YuhanLiin commented Jan 31, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

PABannier Feb 20, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

PABannier commented Mar 20, 2022

YuhanLiin commented Aug 14, 2022

PABannier commented Jan 19, 2022 •

edited

Loading

PABannier Feb 20, 2022 •

edited

Loading