PyTorch migration: Remove tensorflow components, add FATE estimators #164
Conversation
Already looks very clean - well done!
Just as a little status update, since this has been going on for a while: I think this is turning out quite nicely. I have implemented FETA ranking and (nearly, not using the proper loss function yet) FETA discrete choice. That shows flexibility on one axis (result type). I plan to also implement the same for FATE to show the flexibility on the other axis and finish the proof of concept. At that point we could evaluate and see where to take it from there. So in summary, things are moving along but are not quite ready for review/discussion yet. Hopefully soon-ish.
Okay, I think this is sufficient as a proof of concept now. I have implemented FATE and FETA, each in the ranking and discrete choice variant. I replaced a lot of the inheritance in the current tensorflow implementation with composition. I have split the code into "scoring modules" and estimators. The scoring modules are themselves composed of smaller modules, which makes them easier to reuse/understand/test. I have based the estimator implementation on skorch, which takes care of a lot of the boilerplate for us. We no longer have to care about training loops, instantiating optimizers or passing the parameters to uninitialized classes. We get #116 basically for free. The actual "heavy lifting" of the computation (the pairwise utilities) is disentangled from the FETA/FATE architecture (the "data flow" part), so it's easy to modify or replace. For now it's just a simple 1-layer linear network. This decomposition of scorer/estimator/utility removes a lot of duplication. It would be very easy to add a new scorer (for example based on graph neural networks) and "throw" it at the existing ranking/discrete choice estimators. It would also be very easy to derive a new utility function architecture and "throw" that at the FATE module. If you want to look at the implementation, here are the most interesting files:
What do you think @kiudee? There are still things to improve of course, but I think it's sufficient as a proof of concept.
Also CC @prithagupta if you are interested in this.
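To give a feel for the decomposition without opening the files, here is a condensed sketch of the idea; the class names and layer sizes are illustrative placeholders, not the actual modules in this branch:

```python
import torch
from torch import nn
from skorch import NeuralNet


class PairwiseLinearUtility(nn.Module):
    """Placeholder utility: one linear layer on a concatenated object pair."""

    def __init__(self, n_features):
        super().__init__()
        self.linear = nn.Linear(2 * n_features, 1)

    def forward(self, pairs):
        return self.linear(pairs)


class FETAScoring(nn.Module):
    """Scores each object by its mean pairwise utility against the others."""

    def __init__(self, n_features):
        super().__init__()
        self.pairwise_utility = PairwiseLinearUtility(n_features)

    def forward(self, instances):
        # instances: (batch, n_objects, n_features)
        n_objects = instances.size(1)
        first = instances.unsqueeze(2).expand(-1, -1, n_objects, -1)
        second = instances.unsqueeze(1).expand(-1, n_objects, -1, -1)
        pairs = torch.cat((first, second), dim=-1)
        utilities = self.pairwise_utility(pairs).squeeze(-1)
        # Aggregate each object's utilities over all partners.
        return utilities.mean(dim=2)


# skorch wraps the scoring module into a scikit-learn style estimator,
# so the training loop and optimizer handling come for free.
ranker = NeuralNet(
    module=FETAScoring,
    module__n_features=3,
    criterion=nn.MSELoss,  # placeholder; the branch uses task-specific losses
    max_epochs=10,
)
```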
poc/modules/scoring.py (outdated)
    instances
)
# Pair each instance with its aggregated context, then score the pairs.
pairs = torch.cat((instances, context_per_object), dim=-1)
utilities = self.pairwise_utility_module(pairs)
Very clean decomposition.
Thank you!
poc/modules/scoring.py (outdated)
# TODO use a more powerful pairwise utility module
def __init__(self, n_features, pairwise_utility_module=PairwiseLinearUtility):
    super().__init__()
    self.mean_aggregated_utility = MeanAggregatedUtility(
In principle we can even think about making this modular. One could use different aggregation functions here or even a learned aggregation operator.
Yes, I think that would be interesting :) The composition architecture should make experiments like that much easier.
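For illustration, a pluggable aggregation could look roughly like this; `MeanAggregatedUtility` is the name from the diff above, everything else here is made up:

```python
import torch
from torch import nn


class AggregatedUtility(nn.Module):
    """Like MeanAggregatedUtility, but with the reduction injected
    instead of hard-coded, so a different or learned operator can be
    plugged in."""

    def __init__(self, pairwise_utility_module, aggregation=None):
        super().__init__()
        self.pairwise_utility_module = pairwise_utility_module
        # Default: mean over partners. An nn.Module implementing e.g.
        # an attention-weighted sum could be passed here instead.
        self.aggregation = aggregation or (lambda u: u.mean(dim=-1))

    def forward(self, pairs):
        utilities = self.pairwise_utility_module(pairs).squeeze(-1)
        return self.aggregation(utilities)
```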
What is your general verdict @kiudee? Should I continue down this path, implementing more of the existing learners and functionality and eventually replacing the current implementation? Or rather try something else?
Force-pushed from 6911866 to e08acfc.
I think default values for internal functions just hinder understanding. Changed the parameter names to be less domain specific, since we are just talking about a point in the ball for the purposes of this function. Since this is an internal function, we can require an already initialized random state. Result of this discussion / explanation: kiudee#164 (comment)
Thereby fixing a bug when the number of instances is not a multiple of 10. Result of this discussion kiudee#164 (comment)
Another status update: I'm experimenting with experiments. We should be able to reproduce the experiments of the main papers with the new implementation, and I'd like to be able to do that in an easily reproducible way (that could possibly be repeated on each release). I'm trying to use Sacred for this purpose. I'm abusing the "named configuration" system a bit, but currently you can pick named configurations for an estimator and a dataset and then overwrite all parameters on the command line. Sacred will run the experiment, store everything that is needed to reproduce it, and also store the results in a database.
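For illustration, a Sacred setup along these lines could look like the following; the experiment name and config names are placeholders, not the actual ones in `poc/experiment.py`:

```python
# Run e.g. as:  python experiment.py with feta n_instances=10000
from sacred import Experiment

ex = Experiment("csrank-poc")  # hypothetical experiment name


@ex.config
def defaults():
    estimator = "fate_ranking"
    dataset = "pareto"
    n_instances = 1000


@ex.named_config
def feta():
    estimator = "feta_ranking"


@ex.automain
def run(estimator, dataset, n_instances):
    # Build the estimator and dataset from the config, fit, and report
    # metrics; Sacred records the config, code state, and results.
    ...
```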
Some more progress: I've added some metrics and played with the experiments a bit. Here I was trying to see how far I could push the current FETA implementation with its defaults and just 1000 pareto instances (which was further than expected).
I also created an upstream PR for the Sacred logger for skorch: skorch-dev/skorch#725
poc/experiment.py (outdated)
"n_instances": int(1e5), | ||
} | ||
dataset_type = "variable_choice" | ||
# TODO set cross-validation parameters as in the paper |
I'm also not sure if I understand the training and validation strategy of "Learning Choice Functions" correctly. I understand that it uses an outer 5-fold cross-validation for hyper-parameter optimization, but I'm not sure how the "inner" validation works. From Table 3 I would guess that it works like this:
- Generate (110,000 / 4) * 5 = 137,500 instances (to get that 100,000 + 10,000 split in each cross-validation step).
- Split this into 5 folds of 27,500 instances each.
- For each fold F (outer cross-validation loop):
  - Combine the other 4 folds into one dataset of 110,000 instances.
  - Split this into training (100,000) and test (10,000) data.
  - For each set of hyperparameters:
    - Train a model, test it on the test data.
  - Validate the best (according to test data) model on fold F.
- Report the average performance of the outer cross-validation.
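In code, the split I mean would look roughly like this (a sketch with a placeholder model and random data, just to pin down the fold sizes):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, train_test_split

n_total = (110_000 // 4) * 5  # = 137,500
X = np.random.rand(n_total, 10)
y = np.random.randint(0, 2, n_total)
param_grid = [{"C": 0.1}, {"C": 1.0}]  # placeholder hyper-parameters

outer_scores = []
for rest_idx, fold_idx in KFold(n_splits=5).split(X):
    # The other 4 folds form 110,000 instances, split into
    # 100,000 training and 10,000 test instances.
    X_train, X_test, y_train, y_test = train_test_split(
        X[rest_idx], y[rest_idx], test_size=10_000, random_state=0
    )
    best_score, best_model = -np.inf, None
    for params in param_grid:
        model = LogisticRegression(**params).fit(X_train, y_train)
        score = model.score(X_test, y_test)
        if score > best_score:
            best_score, best_model = score, model
    # Validate the best model on the held-out outer fold F.
    outer_scores.append(best_model.score(X[fold_idx], y[fold_idx]))

print(np.mean(outer_scores))  # average outer performance
```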
However I'm almost sure that is incorrect 😄 Could you clarify @kiudee?
Python 3.7 is not officially supported anymore. Python 3.9 is released already, but let's update to 3.8 first.
In preparation for the pytorch migration.
In preparation for adding new entries to the list.
I have fixed the copy and paste issue that you found. Just for completeness, I'll summarize the results of our private discussions here too:
The PR is ready for review again. Edit: I forgot to mention the module names. We also discussed alternatives for the names of the modules.
Great clean commit history. Looks ready to be merged.
Thank you for the reviews :)
This is part of the ongoing pytorch migration. We will use skorch as the basis for our pytorch estimators. That will make it easier to comply with the scikit-learn estimator API. Ranking and (general/discrete) choice estimators are often based on some sort of scoring. The task-specific estimators make it easy to derive concrete estimators from a scoring module.
This adds a scoring module and the derived estimators for the FATE approach. The architecture is modular, so it should be easy to experiment with new ways to put the estimators together. This is a big commit. Multiple smaller ones that add the separate components (some of which are structural or can be useful outside of FATE and therefore could be considered features on their own) would probably have been better. Splitting it up now would take more time and is not worth it in this case though.
This simplifies interchangeable use of pytorch estimators and other estimators.
The "linear" implementations have been removed. The existing estimators do not expect `epochs` or `validation_split` parameters. The `verbose` parameter is accepted by some estimators, but defaults to `False` and is not expected by any of the ranking or discrete choice estimators.
The configuration is based on the configuration of the old (tensorflow based) fate estimators in the tests. The tensorflow tests used a 10% validation split, but still verified the performance in-sample; the validation data was not actually used, so I haven't kept that behavior. The performance isn't the same: the performance on the choice task in particular seems worse, if we trust the test results. We shouldn't read too much into that yet, since the test is mostly for basic functionality, the sample size is small, and it is not a reliable performance indicator.
The binder logo (badge.svg vs badge_logo.svg) differs between the two files, but either should be good.
There is now a pytorch implementation of the FATE estimators.
This is similar to the "optimizer_common_args" dictionary that used to exist. This version contains skorch-specific arguments, including the train split and the number of epochs. There are only the FATE-based estimators now, but this would get repetitive when the other approaches are included again.
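For illustration, such a shared dictionary might look roughly like this; the names and values here are assumptions, not taken from the commit:

```python
from skorch.dataset import ValidSplit  # called CVSplit in older skorch versions

skorch_common_args = {
    "max_epochs": 100,
    "train_split": ValidSplit(0.1),  # fraction held out for validation
    "optimizer__lr": 1e-3,
    "verbose": False,
}

# Each estimator can then unpack the shared defaults, e.g.
# FATERanker(**skorch_common_args)  (estimator name assumed).
```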
I wanted to run the checks and lints once more before merging and noticed some formatting issues. Please take another look.
Thanks again 🚀
Description
See this comment for a description of the current status.
Motivation and Context
Tensorflow 1 is deprecated and we need to move away from it. This PR is an attempt to evaluate pytorch as an alternative.
For now I don't try to fit the existing API (at least not yet).
How Has This Been Tested?
Lints & tests.