
Forward Forward Algorithm Questions #352

Open
and-rewsmith opened this issue Aug 20, 2023 · 2 comments

Comments


and-rewsmith commented Aug 20, 2023

Hi @diegofiori, I am conducting research for the Allen Institute on the recurrent Forward-Forward model based on Hinton's approach. I am attempting to extend his work in the following ways:

  1. Inverting the objective function, to be more biologically plausible and to show more similarity with predictive coding.
  2. Hiding the label for the first few timesteps, in keeping with predictive coding (i.e. high activations initially, followed by low activations for successfully predicted samples).
  3. Supporting sparse connectivity between layers, in keeping with modularity / biological plausibility.
  4. Implementing the recurrent connections. It was unclear whether Hinton actually implemented these, as the network diagram he provided was copied from his GLOM paper, but I did implement them.
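To make item 1 concrete: in standard Forward-Forward, each layer's "goodness" (mean squared activity) is pushed above a threshold for positive data and below it for negative data; inverting the objective swaps that direction, so correctly predicted samples settle into low activity, as in predictive coding. A minimal sketch of this idea (hypothetical function names, not the repo's actual code):

```python
import numpy as np

def goodness(h):
    # Hinton's "goodness": mean squared activity per sample
    return (h ** 2).mean(axis=1)

def ff_loss(h, positive, theta=2.0, invert=False):
    """Per-layer Forward-Forward loss: logistic loss on the goodness margin.

    Standard FF drives goodness ABOVE theta for positive samples and
    below theta for negative samples. invert=True swaps the direction,
    so successfully predicted (positive) samples end up with LOW activity.
    """
    g = goodness(h)
    sign = -1.0 if (positive != invert) else 1.0
    # np.logaddexp(0, x) is a numerically stable softplus log(1 + e^x)
    return np.logaddexp(0.0, sign * (g - theta)).mean()
```

With `invert=False` the loss rewards high activity on positive data; with `invert=True` the same positive data is rewarded for low activity instead.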

My architecture performs on MNIST at about 94% test accuracy. Hinton reports that he got 99%+. I am curious, did you achieve SOTA performance on the recurrent model? If so I have some follow up questions.

My project is here:
https://github.com/and-rewsmith/RecurrentForwardForward

@and-rewsmith
Author

@valeriosofi Do you happen to know the answer to this?

> My architecture performs on MNIST at about 94% test accuracy. Hinton reports that he got 99%+. I am curious, did you achieve SOTA performance on the recurrent model? If so I have some follow up questions.

@and-rewsmith
Author

I took a closer look at the code, and it seems your architecture isn't like figure 3 from the paper.

[screenshot: figure 3 from the paper]

Instead, it seems your implementation tracks the last few activities, which are used in every layer's forward pass; the net is otherwise the same as the forward-only Forward-Forward network.
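For contrast, the figure-3 topology would update each hidden layer synchronously from the previous timestep's activities of both neighbors: bottom-up from the layer below and top-down from the layer above, with input and label clamped at the ends. A minimal sketch of that update rule (hypothetical layer sizes and weight names, not either repo's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 256, 256, 10]  # clamped input, two hidden layers, clamped label
# bottom-up weights between adjacent layers, top-down weights into each hidden layer
W_up = [rng.normal(0, 0.05, (sizes[i], sizes[i + 1])) for i in range(3)]
W_down = [rng.normal(0, 0.05, (sizes[i + 2], sizes[i + 1])) for i in range(2)]

def step(acts):
    """One synchronous recurrent update, figure-3 style: each hidden
    layer l reads layer l-1 (bottom-up) and layer l+1 (top-down),
    both taken from the PREVIOUS timestep's activities."""
    new = list(acts)
    for l in (1, 2):
        bottom_up = acts[l - 1] @ W_up[l - 1]
        top_down = acts[l + 1] @ W_down[l - 1]
        new[l] = np.maximum(bottom_up + top_down, 0.0)  # ReLU
    return new  # acts[0] (input) and acts[3] (label) stay clamped

# roll the network for a few timesteps from a zero hidden state
x, y = rng.random((1, 784)), np.eye(10)[[3]]
acts = [x, np.zeros((1, 256)), np.zeros((1, 256)), y]
for _ in range(5):
    acts = step(acts)
```

The key difference from a forward-only net with cached activities is that the top-down term feeds each hidden layer's input at every timestep, so later layers can influence earlier ones during settling.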

Still, I'm curious about the accuracy of this approach compared to your other variations.
