How do neural networks generalize? #7
emptymalei
started this conversation in
2.Journal Club (Machine Learning)
Some things about deep neural networks are quite hard to understand. One of them is how the network generalizes.
[Zhang2016] shows experimentally that deep networks have an amazing ability to fit even completely random datasets, yet they cannot generalize from them because the data is random. How, then, should we understand generalization? The authors discuss theories such as VC dimension, Rademacher complexity, and uniform stability, but none of them is good enough to explain it.
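The memorization effect can be reproduced on a toy scale. The snippet below is only a sketch and not the setup of [Zhang2016]: instead of a deep network it uses an over-parameterized random ReLU feature model fitted by minimum-norm least squares, and all sizes, the seed, and the variable names are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: far more features than training samples.
n_train, n_test, dim, n_features = 200, 1000, 20, 2000

# Inputs are Gaussian noise; labels are coin flips independent of the inputs.
X_train = rng.normal(size=(n_train, dim))
X_test = rng.normal(size=(n_test, dim))
y_train = rng.choice([-1.0, 1.0], size=n_train)
y_test = rng.choice([-1.0, 1.0], size=n_test)

# Fixed random ReLU features stand in for a wide network (an assumption).
W = rng.normal(size=(dim, n_features)) / np.sqrt(dim)


def features(X):
    return np.maximum(X @ W, 0.0)


# lstsq returns the minimum-norm solution of the underdetermined system,
# which interpolates the random training labels.
coef, *_ = np.linalg.lstsq(features(X_train), y_train, rcond=None)

train_acc = np.mean(np.sign(features(X_train) @ coef) == y_train)
test_acc = np.mean(np.sign(features(X_test) @ coef) == y_test)

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
```

The model fits the random training labels, while its accuracy on fresh random labels should stay near chance, which is the gap between fitting and generalizing that the paper highlights.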
Recently, I found the work by Simon et al. [Simon2021]. There is also a blog post about this paper [Simon2021Blog].
The idea is to simplify the problem of generalization by looking at how a neural network approximates a function f. This amounts to approximating a vector in a Hilbert space: we look at the similarity between the vector f and its neural network approximation f'. The similarity of these two vectors is related to the eigenvalues of the so-called “neural tangent kernel” (NTK).
Using the NTK, they derive an amazingly simple quantity, learnability, which measures how well the two Hilbert space vectors align, that is, how good the neural network approximation is.
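To make this concrete, here is a minimal numerical sketch. It relies on my reading of [Simon2021], which should be checked against the paper: for a kernel eigenmode with eigenvalue λ_i, the learnability is λ_i / (λ_i + C), where the constant C is fixed by requiring that the learnabilities sum to the number of training samples n (the case without ridge regularization). The eigenvalue spectrum, the sample count, and the function name below are hypothetical.

```python
import numpy as np


def eigenmode_learnabilities(eigenvalues, n_samples, n_iter=200):
    """Return (L, C) with L_i = lam_i / (lam_i + C) and sum(L) == n_samples.

    Assumes the ridgeless form of learnability as recalled from [Simon2021].
    """
    lam = np.asarray(eigenvalues, dtype=float)
    if np.any(lam < 0):
        raise ValueError("kernel eigenvalues must be non-negative")
    if not 0 < n_samples < np.count_nonzero(lam):
        raise ValueError("need 0 < n_samples < number of nonzero eigenvalues")

    def total(c):
        return np.sum(lam / (lam + c))

    # total(c) decreases in c; it exceeds n_samples as c -> 0 and is below
    # n_samples at c = sum(lam) / n_samples, so the root lies in between.
    lo, hi = 0.0, lam.sum() / n_samples
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if total(mid) > n_samples:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    return lam / (lam + c), c


# Hypothetical power-law spectrum with 50 eigenmodes and 10 training samples.
spectrum = 1.0 / np.arange(1, 51) ** 2
learnability, C = eigenmode_learnabilities(spectrum, n_samples=10)

print("C =", C)
print("first five learnabilities:", learnability[:5])
print("sum of learnabilities:", learnability.sum())
```

Under this assumption, eigenmodes with large eigenvalues have learnability close to 1 and are approximated well, while modes with eigenvalues far below C are barely learned at this sample size.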
[Zhang2016]: Zhang C, Bengio S, Hardt M, Recht B, Vinyals O. Understanding deep learning requires rethinking generalization. arXiv [cs.LG]. 2016. Available: http://arxiv.org/abs/1611.03530
[Simon2021Blog]: Simon J. A First-Principles Theory of Neural Network Generalization. In: The Berkeley Artificial Intelligence Research Blog [Internet]. [cited 26 Oct 2021]. Available: https://bair.berkeley.edu/blog/2021/10/25/eigenlearning/
[Simon2021]: Simon JB, Dickens M, DeWeese MR. Neural Tangent Kernel Eigenvalues Accurately Predict Generalization. arXiv [cs.LG]. 2021. Available: http://arxiv.org/abs/2110.03922
cover image:
https://www.offconvex.org/2018/02/17/generalization2/