Renyi divergence #769
Conversation
Have to go catch a flight, but here are some preliminary comments:
.gitignore (Outdated)
@@ -100,3 +100,9 @@ docs/*.html
# IDE related
.idea/
.vscode/
Can you remove changes that aren't relevant for this PR? This includes the changes to .gitignore here as well as the deletion of CSVs.
from edward.util import copy

try:
    from edward.models import Normal
By convention, we use 2-space indentation.
from __future__ import print_function

import six
import numpy as np
By convention, we alphabetize the ordering of imported libraries.
"{0}. Your TensorFlow version is not supported.".format(e)) | ||
|
||
|
||
class Renyi_divergence(VariationalInference): |
By convention, we use CamelCase for class names.
To perform the optimization, this class uses the techniques from
Renyi Divergence Variational Inference (Y. Li & al, 2016)

# Notes:
Docstrings are parsed as Markdown and formatted in a somewhat specific way as they appear on the API docs. I recommend following the other classes, where you would denote a subsection as #### Notes, and when writing bullet points, do, e.g.,
#### Notes
+ bullet 1
+ bullet 2
+ maybe bulleted list in a bullet
Great work! Some comments below. The code looks correct; the suggestions below are mostly minor formatting points.
Can you include a unit test? See, e.g., how KLpq is tested in tests/inferences/test_klpq.py.
$ \text{D}_{R}^{(\alpha)}(q(z)||p(z \mid x))
= \frac{1}{\alpha-1} \log \int q(z)^{\alpha} p(z \mid x)^{1-\alpha} dz $

To perform the optimization, this class uses the techniques from
Periods at the end of sentences. (To look at the generated API page for the class, I recommend compiling the website following the instructions in docs/.)
= \frac{1}{\alpha-1} \log \int q(z)^{\alpha} p(z \mid x)^{1-\alpha} dz $

To perform the optimization, this class uses the techniques from
Renyi Divergence Variational Inference (Y. Li & al, 2016)
We use bibtex for handling references in docstrings. This is handled by adding the appropriate bib entry to docs/tex/bib.bib; make sure it's also written in the right order: we sort bib entries by their year, then alphabetically according to their citekey within each year.
When using references, you can produce (Li et al., 2016) and Li et al. (2016) by writing [@li2016renyi] and @li2016renyi respectively, assuming that li2016renyi is the citekey.
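For instance, a minimal sketch of how a citekey could be used inside a class docstring (the class skeleton and wording here are hypothetical, not the PR's actual docstring):

```python
from edward.inferences.variational_inference import VariationalInference


class RenyiDivergence(VariationalInference):
  """Variational inference that minimizes the Renyi divergence
  [@li2016renyi].

  The optimization follows the gradient estimator of @li2016renyi.
  """
```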

# Notes:
- Renyi divergence does not have any analytic version.
- Renyi divergence does not have any version for non reparametrizable
It does, but the gradient estimator in @li2016variational doesn't. I recommend just stating that this inference algorithm is restricted to variational approximations whose random variables all satisfy rv.reparameterization_type == tf.contrib.distributions.FULLY_REPARAMETERIZED.
Also, instead of checking this during build_loss_and_gradients, I recommend checking it during __init__. This sort of check is done statically before any graph construction, similar to how we check for compatible shapes in all latent variables and data during __init__.
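As a rough sketch of what such an __init__-time check could look like (the class skeleton and variable names are assumptions for illustration, not the PR's code):

```python
import six
import tensorflow as tf

from edward.inferences.variational_inference import VariationalInference


class RenyiDivergence(VariationalInference):
  def __init__(self, latent_vars=None, data=None):
    # Reject non-reparameterizable variational approximations up front,
    # before any graph construction, similar to the static shape checks.
    if latent_vars is not None:
      for qz in six.itervalues(latent_vars):
        if (qz.reparameterization_type !=
                tf.contrib.distributions.FULLY_REPARAMETERIZED):
          raise NotImplementedError(
              "RenyiDivergence only supports variational approximations "
              "whose random variables are fully reparameterizable.")
    super(RenyiDivergence, self).__init__(latent_vars, data)
```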

def initialize(self,
               n_samples=32,
               alpha=1.,
By convention, we write float literals with a trailing zero, e.g., 1.0.
  Number of samples from variational model for calculating
  stochastic gradients.
alpha: float, optional.
  Renyi divergence coefficient.
It could be useful to specify the domain of the coefficient, e.g., "Must be greater than 0."
"Variational Renyi inference only works with reparameterizable" | ||
" models") | ||
|
||
######### |
This function is only used in one location and is a one-liner; could you write that line instead of defining a new function?
examples/vae_renyi.py (Outdated)
scale=Dense(d, activation='softplus')(hidden))

# Bind p(x, z) and q(z | x) to the same TensorFlow placeholder for x.
inference = Renyi_divergence({z: qz}, data={x: x_ph})
This code looks exactly the same as an older version of vae.py and differs only in this line. To keep the VAE versions better synced, could you instead add a comment to the existing vae.py suggesting this as an alternative?
Ideally, we'd like a specific application where ed.RenyiDivergence produces better results by some metric than the alternatives. IIRC, the paper had some interesting results for a Bayesian neural net on some specific UCI data sets. It would be great to reproduce some of those results.
If you don't have time for this, we can leave it off for now and raise it as a GitHub issue after merging this PR.
self.scale.get(x, 1.0)
* x_copy.log_prob(dict_swap[x]))

logF = [p - q for p, q in zip(p_log_prob, q_log_prob)]
Instead of logF, what about something like log_ratios? It is snake_case, which is more Pythonic, and also more semantically meaningful.
Replaced LogF by log_ratios
Fix convention errors
Thanks for the suggestion and the very informative feedback.
Will do later today.
I've added some tests in a similar way as KLqp (both the normal-normal and the Bernoulli models).
import tensorflow as tf

from edward.models import Bernoulli, Normal
from edward.inferences.renyi_divergence import RenyiDivergence
We should check that the import works by instead using ed.RenyiDivergence in the test.
I had some issues with this, but it should be working now.
[@li2016renyi].

#### Notes
+ The gradient estimator used here does not have any analytic version.
With Markdown formatting, you don't need the 4 spaces of indentation. E.g., you can just do
#### Notes
+ The gradient estimator ...
+ ...
Done
= \frac{1}{\alpha-1} \log \int q(z)^{\alpha} p(z \mid x)^{1-\alpha} dz.$

The optimization is performed using the gradient estimator as defined in
[@li2016renyi].
The citekey is being used as a direct object, so it should be @li2016renyi rather than [@li2016renyi].
Done
+ See Renyi Divergence Variational Inference [@li2016renyi] for
  more details.
"""
if self.is_reparameterizable:
is_reparameterizable should be checked, with the error possibly raised, during __init__, and since it's checked there it doesn't need to be stored on the class. This also helps to remove one layer of indentation in this function.
Done
if self.backward_pass == 'max':
  log_ratios = tf.stack(log_ratios)
  log_ratios = tf.reduce_max(log_ratios, 0)
  loss = tf.reduce_mean(log_ratios)
If I understood the code correctly, log_ratios when first created is a list of n_samples elements, where each element is a log ratio computed per sample from q. For the min/max modes, we take the min/max of these log ratios, which is a scalar.
Is tf.reduce_mean for the loss needed? You can also remove the tf.stack line in the min and max cases, in the same way you didn't use it for the self.alpha \approx 1 case.
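A rough sketch of that simplification, using a hypothetical helper to stand in for the relevant branch of build_loss_and_gradients (log_ratios is assumed to be a Python list of scalar tensors, one per sample from q):

```python
import tensorflow as tf


def reduce_log_ratios(log_ratios, backward_pass):
  # Illustrative only: reducing directly over the list already yields a
  # scalar, so neither tf.stack nor tf.reduce_mean is needed.
  if backward_pass == 'max':
    return tf.reduce_max(log_ratios)
  return tf.reduce_min(log_ratios)
```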
You're right. Thanks for spotting this.
examples/vae_renyi.py (Outdated)
[@li2016renyi]

#### Notes
This example is almost exactly similar to example/vae.py.
Sorry for the miscommunication. What I meant was that you can edit vae.py, comment out the 1-2 lines of code to use ed.RenyiDivergence, and add these notes there. This helps to compress the content in the examples; cf. https://github.com/blei-lab/edward/blob/master/examples/bayesian_logistic_regression.py#L51.
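As a hypothetical sketch of that edit to examples/vae.py (z, qz, x, and x_ph are assumed to be the model and variational variables already defined in that example):

```python
import edward as ed

# Bind p(x, z) and q(z | x) to the same TensorFlow placeholder for x.
inference = ed.KLqp({z: qz}, data={x: x_ph})
# Alternative: minimize the Renyi divergence [@li2016renyi] instead.
# inference = ed.RenyiDivergence({z: qz}, data={x: x_ph})
```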
I removed vae_renyi.py and modified vae.py instead.
The version of vae.py I had wasn't running, though, so I've modified it quite a bit.
Are you using the latest version of Edward? We updated a few details in vae.py so it actually runs better. For example, you should be using the observations library and a generator, which is far more transparent than the mnist_data class from that TensorFlow tutorial.
In addition, since vae.py is also our canonical VAE example, I prefer keeping ed.KLqp as the default, with the Renyi divergence option commented out; similarly, the top-level comments should be written inline near the Renyi divergence option instead.
If you have thoughts otherwise, happy to take alternative suggestions.

class test_renyi_divergence_class(tf.test.TestCase):

  def _test_normal_normal(self, Inference, *args, **kwargs):
Since RenyiDivergence is used across all tests, you don't need Inference as an arg to the test functions.
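For instance, a rough sketch of a test along those lines, mirroring the normal-normal setup of tests/inferences/test_klpq.py (the tolerances and iteration counts here are illustrative, not tuned):

```python
import edward as ed
import numpy as np
import tensorflow as tf

from edward.models import Normal


class test_renyi_divergence_class(tf.test.TestCase):

  def test_normal_normal(self):
    with self.test_session():
      x_data = np.array([0.0] * 50, dtype=np.float32)

      mu = Normal(loc=0.0, scale=1.0)
      x = Normal(loc=mu, scale=1.0, sample_shape=50)

      qmu_loc = tf.Variable(tf.random_normal([]))
      qmu_scale = tf.nn.softplus(tf.Variable(tf.random_normal([])))
      qmu = Normal(loc=qmu_loc, scale=qmu_scale)

      # RenyiDivergence is used directly; no Inference argument is needed.
      inference = ed.RenyiDivergence({mu: qmu}, data={x: x_data})
      inference.run(n_samples=25, n_iter=200)

      # The analytic posterior is N(0.0, sqrt(1/51)); check a rough fit.
      self.assertAllClose(qmu.mean().eval(), 0.0, atol=0.5)
```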
I've used the same template as test_klpq, where only KLpq is used during the tests and where Inference is still an argument to the test functions. But I think I have modified it to be closer to what you had in mind.
Merged example to VAE
Simplify tests
Modify examples and test calls
It keeps failing the Travis CI check for Python 2.7, but before getting to the proper testing of my code (it fails to install matplotlib and seaborn).
Looks like this is happening in Travis on any build. I'll look into it.
Here is an implementation of Renyi divergence variational inference. There's also an example with VAEs.
Here is a link to the Edward forum with some more info:
https://discourse.edwardlib.org/t/renyi-divergence-variational-inference/366/3
PS: Sorry for the quite messy commit history.