Multi-GPU support #152
base: master
Conversation
@begeekmyfriend I have not modified the relevant code in terms of the pattern.
@Rayhane-mamah Yes, I agree. In multi-GPU mode we can set
Yes, it seems like people are requesting that. :) Well, your multi-GPU attempt @MlWoo is certainly very helpful. Since the model content has changed since you made this implementation, I will need to make a few updates here and there, but yes, I will probably make a new branch for both Wavenet and Tacotron multi-GPU support, or add it directly to master as an optional feature. (I don't like 4 spaces though hahaha..) In the meantime, I am leaving this PR open so that people can quickly refer to a good multi-GPU implementation. :) Thanks for all your contributions @MlWoo and @begeekmyfriend ;)
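For readers arriving here looking for the general idea, the sketch below shows the usual TensorFlow 1.x data-parallel "tower" pattern that a multi-GPU branch like this typically follows: each GPU computes gradients on its own slice of the batch and the averaged gradients are applied once. This is an illustration of the standard technique, not the exact code in this PR; names such as `build_model` and `num_gpus` are placeholders, not identifiers from this repository.

```python
import tensorflow as tf

def average_gradients(tower_grads):
    """Average per-variable gradients computed on each GPU tower."""
    averaged = []
    for grads_and_vars in zip(*tower_grads):
        grads = [g for g, _ in grads_and_vars if g is not None]
        grad = tf.reduce_mean(tf.stack(grads, axis=0), axis=0)
        averaged.append((grad, grads_and_vars[0][1]))  # (avg_grad, variable)
    return averaged

def multi_gpu_train_op(inputs, targets, build_model, optimizer, num_gpus=2):
    # Split the batch so each GPU gets an equal share of the examples.
    input_shards = tf.split(inputs, num_gpus, axis=0)
    target_shards = tf.split(targets, num_gpus, axis=0)
    tower_grads = []
    for i in range(num_gpus):
        with tf.device('/gpu:%d' % i), tf.variable_scope('model', reuse=tf.AUTO_REUSE):
            loss = build_model(input_shards[i], target_shards[i])
            tower_grads.append(optimizer.compute_gradients(loss))
    # Apply the averaged gradients once, sharing variables across towers.
    return optimizer.apply_gradients(average_gradients(tower_grads))
```

With this pattern the effective batch size grows with the number of GPUs, which is usually why `batch_size` and `outputs_per_step` get retuned when enabling it.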
When I try to use this fork as is, I run into the following: ValueError: Cannot feed value of shape (48, 408, 1025) for Tensor 'datafeeder/linear_targets:0', which has shape '(?, ?, 513)'. What could be the cause of this? I preprocessed LJSpeech with the given hyperparameters, btw.
@tomse-h I have not modified the relevant code for the linear-spectrogram path. You can complete it the same way the mel features are handled.
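A plausible explanation for that shape mismatch, sketched below: the number of linear spectrogram bins is n_fft // 2 + 1, so data preprocessed with n_fft = 2048 produces 1025 bins, while a model whose placeholder expects 513 bins implies n_fft = 1024. The names `linear_bins` and `n_fft` here are just illustrative stand-ins for whatever the repo's hparams call them.

```python
# Linear spectrogram bin count as a function of FFT size.
def linear_bins(n_fft):
    return n_fft // 2 + 1

assert linear_bins(2048) == 1025   # what the preprocessed .npy files contain
assert linear_bins(1024) == 513    # what the datafeeder placeholder expects
```

Making sure preprocessing and training use the same hyperparameter value (and re-running preprocessing after changing it) usually resolves this kind of mismatch.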
I might be a bit late to this conversation, but did you also see a proportional increase in sec/step when using multiple GPUs? Here are my stats on V100 GPUs with outputs_per_step = 16.
@shaktikshri No, it increases but does not scale linearly. You'd better check the data-loading time and the imbalance in sequence lengths across devices.
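One simple way to reduce the length imbalance mentioned above is to sort the examples in a batch by sequence length and deal them out round-robin, so each GPU shard gets a similar mix of short and long utterances. The sketch below is an illustration only, not the feeder logic used in this PR, and `shard_by_length` is a hypothetical helper.

```python
def shard_by_length(batch, lengths, num_gpus):
    # Sort indices from longest to shortest, then deal them round-robin
    # so every shard receives a comparable total amount of work.
    order = sorted(range(len(batch)), key=lambda i: lengths[i], reverse=True)
    shards = [[] for _ in range(num_gpus)]
    for rank, idx in enumerate(order):
        shards[rank % num_gpus].append(batch[idx])
    return shards

# Example: 8 utterances with mixed lengths spread over 2 GPUs.
examples = ['u%d' % i for i in range(8)]
lengths = [400, 120, 380, 90, 350, 200, 60, 310]
print(shard_by_length(examples, lengths, num_gpus=2))
```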
Many people seem to be very interested in multi-GPU support when training the model. Maybe it is necessary to merge this branch into master.