Loading Caffe models #52
Comments
This requires converting models from the NCHW layout (Caffe) to the NHWC layout (Brainstorm), so it's not straightforward, but it should still be possible.
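To make the layout issue concrete, here is a minimal NumPy sketch of the kind of reordering involved. It assumes Caffe's usual blob shapes, (N, C, H, W) for feature maps, (out, in, kH, kW) for convolution filters and (out, C*H*W) for a fully connected layer that follows a convolution. The channels-last target orders and all function names are illustrative assumptions, not Brainstorm's documented weight layout.

```python
import numpy as np

def nchw_to_nhwc(x):
    """Reorder feature maps from (N, C, H, W) to (N, H, W, C)."""
    return np.transpose(x, (0, 2, 3, 1))

def conv_filters_to_channels_last(w):
    """Reorder conv filters from (out, in, kH, kW) to (out, kH, kW, in).

    The target order is an assumption made for illustration only.
    """
    return np.transpose(w, (0, 2, 3, 1))

def fc_weights_after_conv(w_fc, c, h, w):
    """Permute the columns of a fully connected weight matrix.

    w_fc has shape (out, C*H*W) and expects inputs flattened in (C, H, W)
    order. The result expects inputs flattened in (H, W, C) order instead.
    """
    out = w_fc.shape[0]
    return w_fc.reshape(out, c, h, w).transpose(0, 2, 3, 1).reshape(out, h * w * c)
```

The fully connected case is the easy one to miss: the weights themselves have no spatial axes, but their columns are tied to the order in which the preceding feature map was flattened, so they need the same permutation.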
I have some experience with this - I started with this same goal for Keras that took many turns resulting in things like the Meanwhile, if I could hijack this issue, is there any design document that explains some of the design choices you made? Just to get a better understanding of your goals.
Cool, looking forward to it! NHWC layout makes things like this a bit trickier, but we think it's the better format for the long run. Plus, cuDNN v4 will fully support it soon :) We will indeed provide details about the design choices in Brainstorm soon (beginning next week). If you get curious in the meantime, you may ask us questions on the mailing list.
I could complete the code for Keras/Theano conversion [PR: #921 on keras repo]. Most of it can be reused here, though I would like to know what would be best. The code I wrote there is really generic: it takes in any Caffe network, converts it to an equivalent DAG, and then loads the weights. There are a lot of tiny details that have to be taken care of for this to happen, which makes the process really complicated and cumbersome to follow. Unlike my approach, the Chainer devs decided to just support the available BVLC models (it could work for OxfordNet though), which are simple sequential models. This reduces the size of the code by half, and makes it a lot easier and quicker for someone who wants to understand what is going on. What would you guys prefer to have?
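As a rough illustration of the generic route (not the code from that PR), here is a minimal sketch of turning a Caffe-style layer list into a DAG. It assumes each layer has already been read into a `(name, bottoms, tops)` tuple in definition order; the function name and the tuple format are hypothetical. One of the "tiny details" it handles is in-place layers, such as a ReLU whose top blob has the same name as its bottom blob.

```python
def build_layer_dag(layers):
    """Map each layer name to the list of layers it reads from.

    `layers` is a list of (name, bottoms, tops) tuples in definition order
    (a hypothetical pre-parsed form of a Caffe network definition).
    """
    producer = {}  # blob name -> layer that most recently wrote that blob
    parents = {}
    for name, bottoms, tops in layers:
        # Blobs with no known producer (e.g. the network input) add no edge.
        parents[name] = [producer[b] for b in bottoms if b in producer]
        # Register outputs after reading inputs, so an in-place layer
        # depends on the previous writer of the blob and then replaces it.
        for t in tops:
            producer[t] = name
    return parents


layers = [
    ("conv1", ["data"], ["conv1"]),
    ("relu1", ["conv1"], ["conv1"]),   # in-place: same blob in and out
    ("pool1", ["conv1"], ["pool1"]),
    ("fc", ["pool1"], ["fc"]),
    ("skip", ["conv1"], ["skip"]),
    ("sum", ["fc", "skip"], ["sum"]),  # two inputs, so not purely sequential
]
dag = build_layer_dag(layers)
```

A sequential-only importer can skip this step entirely and walk the layer list in order, which is where most of the code size difference comes from.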
Does Keras also use NHWC? We'd like to have a more general approach (full DAG). It's fine to start with handling simpler cases, with extensibility in mind. Brainstorm also works with DAGs. The difference in connecting layers (compared to Caffe) is that every layer in Brainstorm uses inputs and outputs with fixed names (except the Side note: We are working on explaining the design in the
Theano uses NCHW (bc01) layout. Thanks for the docs, things are making more sense now.
Since some papers have made available pre-trained Caffe convnets, it'd be nice to be able to use them in Brainstorm.