Thanks for reporting this! Could you please try a couple of follow-up things?

1. Can you please try upgrading to the latest Composer version (0.16.1)?
2. Can you please check whether the same issue occurs when you run on only 1 GPU?

In torch, distributed samplers duplicate data so that every rank receives the same number of samples. We've added code to correct for this, but it is possible that code is buggy. I'd like to confirm that you're running on the latest version with these fixes, and that this is the same issue and not something else.
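For context, here is a minimal sketch of the padding behavior described above (the 10-sample dataset and 4-rank world size are made up for illustration, not taken from this issue):

```python
import torch
from torch.utils.data import DistributedSampler, TensorDataset

# 10 samples split across 4 ranks: 10 is not divisible by 4, so the
# sampler pads to 12 indices (3 per rank) by repeating early samples.
dataset = TensorDataset(torch.arange(10))

indices_per_rank = []
for rank in range(4):
    sampler = DistributedSampler(
        dataset, num_replicas=4, rank=rank, shuffle=False, drop_last=False
    )
    indices_per_rank.append(list(sampler))

print(indices_per_rank)
# [[0, 4, 8], [1, 5, 9], [2, 6, 0], [3, 7, 1]]
# Indices 0 and 1 appear twice, so any metric aggregated across ranks
# double-counts those samples unless the duplicates are corrected for.
```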
**Environment**

**To reproduce**

Steps to reproduce the behavior:

**Expected behavior**

Get the same result under different `batch_size` values.

**Additional context**