Hello,
I'm able to use this package with CUDA arrays, since `allowed_getindex` (used here, for example) does not raise the indexing error that CuArrays would otherwise throw. However, this slows down the computation, since each such access is transferred to and then executed on the CPU.
Is it possible, for example in the for loop I linked, to avoid using the `allowed_getindex` function? It would be a very good improvement, and not only for GPU calculations.
I mean, removing it would make it faster, but it would also make it incorrect. Do you have an idea for how to do it while keeping the correctness? You do have to somehow change the value by epsilon and put it back, and that will require a kernel call each time.
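To make the "change the value by epsilon and put it back" step concrete, here is a minimal sketch of a forward-difference gradient loop. It is not the package's actual code: `forward_gradient`, `eps_rel`, and the test function `f` are hypothetical names chosen for illustration, and it runs on a plain CPU `Vector`. On a GPU array, each `x[i]` read or write in the loop would be a scalar access of the kind discussed above.

```julia
# Forward-difference gradient that perturbs one entry at a time.
# Hypothetical helper for illustration only, not the package's internals.
function forward_gradient(f, x::AbstractVector{T};
                          eps_rel = sqrt(eps(T))) where {T<:AbstractFloat}
    g = similar(x)
    fx = f(x)                               # baseline value, computed once
    for i in eachindex(x)
        xi = x[i]                           # scalar read of entry i
        h = eps_rel * max(abs(xi), one(T))  # step size scaled to the entry
        x[i] = xi + h                       # scalar write: perturb entry i
        g[i] = (f(x) - fx) / h              # one function evaluation per entry
        x[i] = xi                           # scalar write: restore entry i
    end
    return g
end

f(x) = sum(abs2, x)        # example function; its exact gradient is 2x
x = [1.0, 2.0, 3.0]
g = forward_gradient(f, x) # approximately [2.0, 4.0, 6.0]
```

Dropping the read of `xi` or the restore step would leave `x` perturbed for later iterations, which is why the per-entry access cannot simply be removed without replacing it with something equivalent.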