Using precision: 16-mixed in the trainer config results in the same issue described in #39 and #40.
File "/local_data/doserbd/miniconda3/envs/spherinator-2/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 166, in scale
105 assert outputs.is_cuda or outputs.device.type == 'xla'
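The assertion in the traceback only accepts tensors on a CUDA or XLA device, so one possible guard is to fall back to full precision whenever the run is not on such a device. The sketch below is an assumption about how that could be handled, not the fix tracked in #39 and #40: `pick_precision` is a hypothetical helper, and `"32-true"` is assumed to be an acceptable fallback for this project.

```python
def pick_precision(device_type: str, requested: str = "16-mixed") -> str:
    """Return a trainer precision string that is safe for the given device type.

    Hypothetical helper: `device_type` is expected to be a string such as
    "cuda", "xla" or "cpu" (for example the `.type` of a torch device).
    """
    # The gradient scaler asserts that its input lives on CUDA or XLA,
    # so "16-mixed" is only kept for those device types.
    if requested == "16-mixed" and device_type not in ("cuda", "xla"):
        # Assumed fallback: plain 32-bit training.
        return "32-true"
    return requested


if __name__ == "__main__":
    print(pick_precision("cpu"))   # falls back to full precision
    print(pick_precision("cuda"))  # keeps the requested mixed precision
```

The returned string could then be written into the `precision` field of the trainer config before the trainer is constructed.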
We close this issue and track it further in #39 and #40.
According to my TensorBoard visualization, we use tensor cores for about 6% of the whole training time. (I'll have to see if I still have the screenshot for this.)
Unfortunately, the TensorBoard PyTorch profiler plugin is deprecated and does not work as well as it used to with our PyTorch 2.... Currently, the tool https://hta.readthedocs.io/en/latest/index.html does not seem to cover the whole functionality.
Approach
Related links: