I encountered a weird crash. Not really sure what happened, because it ran fine for 29 epochs before that. I can't reproduce it, but I wanted to keep it here for future reference.
- - - - - - - - - - - - Epoch 30 - - - - - - - - - - - -
[====1====2====3=validation
total_loss : 122.73372146606445
Accuracy : 0.6446704
===4===PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
ERROR - diag_hutter - Failed after 2 days, 7:11:11!
/home/greff/Programming/sacred/sacred/observers/mongo.py:110: DeprecationWarning: save is deprecated. Use insert_one or replace_one instead
self.runs.save(self.run_entry)
Traceback (most recent calls WITHOUT Sacred internals):
File "diag_hutter.py", line 87, in run
trainer.train(network, getter_tr, valid_getter=getter_va)
File "/home/greff/Programming/brainstorm/brainstorm/training/trainer.py", line 99, in train
self.stepper.run()
File "/home/greff/Programming/brainstorm/brainstorm/training/steppers.py", line 103, in run
self.net.forward_pass(training_pass=True)
File "/home/greff/Programming/brainstorm/brainstorm/structure/network.py", line 431, in forward_pass
layer.forward_pass(self.buffer[layer_name], training_pass)
File "/home/greff/Programming/brainstorm/brainstorm/layers/loss_layer.py", line 46, in forward_pass
buffers.outputs.loss.reshape(tuple()))
File "/home/greff/Programming/brainstorm/brainstorm/handlers/pycuda_handler.py", line 378, in sum_t
self.copy_to(cumisc.sum(a), out)
File "/home/greff/Programming/brainstorm/brainstorm/handlers/pycuda_handler.py", line 84, in copy_to
pycuda.driver.memcpy_dtod(dest.gpudata, src.gpudata, dest.nbytes)
Traceback (most recent call last):
File "/home/greff/Programming/sacred/sacred/experiment.py", line 165, in run_commandline
args)
File "/home/greff/Programming/sacred/sacred/experiment.py", line 138, in run_command
run()
File "/home/greff/Programming/sacred/sacred/run.py", line 144, in __call__
self.result = self.main_function(*args)
File "/home/greff/Programming/sacred/sacred/config/captured_function.py", line 47, in captured_function
result = wrapped(*args, **kwargs)
File "diag_hutter.py", line 87, in run
trainer.train(network, getter_tr, valid_getter=getter_va)
File "/home/greff/Programming/brainstorm/brainstorm/training/trainer.py", line 99, in train
self.stepper.run()
File "/home/greff/Programming/brainstorm/brainstorm/training/steppers.py", line 103, in run
self.net.forward_pass(training_pass=True)
File "/home/greff/Programming/brainstorm/brainstorm/structure/network.py", line 431, in forward_pass
layer.forward_pass(self.buffer[layer_name], training_pass)
File "/home/greff/Programming/brainstorm/brainstorm/layers/loss_layer.py", line 46, in forward_pass
buffers.outputs.loss.reshape(tuple()))
File "/home/greff/Programming/brainstorm/brainstorm/handlers/pycuda_handler.py", line 378, in sum_t
self.copy_to(cumisc.sum(a), out)
File "/home/greff/Programming/brainstorm/brainstorm/handlers/pycuda_handler.py", line 84, in copy_to
pycuda.driver.memcpy_dtod(dest.gpudata, src.gpudata, dest.nbytes)
pycuda._driver.LogicError: cuMemcpyDtoD failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
sys:1: ResourceWarning: unclosed <socket.socket fd=23, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 35871), raddr=('127.0.0.1', 5006)>
/home/greff/venv/py3/lib/python3.4/importlib/_bootstrap.py:2150: ImportWarning: sys.meta_path is empty
Our usage of skcuda.misc.sum() was rather weird, due to a bug in scikit-cuda. The fix above will hopefully avoid such crashes. Let's keep this open for a while to see if the bug reappears.
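A note for anyone who hits this again: CUDA reports kernel faults asynchronously, so the `memcpy_dtod` call in the traceback may only be where the error surfaced, not where it was caused. A minimal sketch of a debugging aid, assuming a zero-argument `sync` callable that blocks until the GPU is idle and raises on a pending error (with PyCUDA that would presumably be `pycuda.driver.Context.synchronize`). The `sync_after` helper is hypothetical and not part of brainstorm:

```python
import functools


def sync_after(sync):
    """Return a decorator that calls `sync()` after each wrapped call.

    `sync` is any zero-argument callable that waits for the GPU and raises
    if an error is pending (assumption: with PyCUDA this would be
    pycuda.driver.Context.synchronize).
    """
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            try:
                sync()
            except Exception as exc:
                # Name the operation that was just issued, so the faulting
                # step is easier to locate than in a deferred traceback.
                raise RuntimeError(
                    "GPU error surfaced right after %s()" % fn.__name__
                ) from exc
            return result
        return wrapper
    return decorate
```

In a debugging run, one could wrap handler methods such as `sum_t` and `copy_to` with this decorator. Setting the environment variable `CUDA_LAUNCH_BLOCKING=1` before starting the process should have a similar effect without code changes, at the cost of slower training.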
flukeskywalker changed the title from "PyCudaHandler: Illegal memory access" to "PyCudaHandler: Illegal memory access (cuMemcpyDtoD failed)" on Nov 23, 2015