Hi,
I ran cuda-memcheck on this project. It reported errors under initcheck while compressing data and warnings under racecheck while decompressing data. I tested on a Tesla V100-SXM2 (32GB) and an A100-SXM4 (40GB) with CUDA 11.2. I'm wondering whether these are false alarms. Below are examples of the errors and warnings, along with how I removed them.
The initcheck error can be removed by adding `CHECKED_CUDA_CALL(cudaMemset, _memory, 0, size * sizeof(T));` in cuda_bits.cuh (lines 196 and 216).
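To show the kind of change I mean in isolation, here is a minimal sketch. It is not the ndzip code: `alloc_zeroed` and `zeroed_buffer_reads_back_as_zero` are hypothetical names, and I am assuming the buffer comes from a plain `cudaMalloc`. The idea is only that the memory is zeroed right after allocation, so a later read of a not-yet-written element is no longer flagged.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper (not ndzip code): allocate `size` elements of T on the
// device and zero them, so no later read sees uninitialized memory.
template <typename T>
T *alloc_zeroed(std::size_t size) {
    void *memory = nullptr;
    if (cudaMalloc(&memory, size * sizeof(T)) != cudaSuccess) return nullptr;
    if (cudaMemset(memory, 0, size * sizeof(T)) != cudaSuccess) {
        cudaFree(memory);
        return nullptr;
    }
    return static_cast<T *>(memory);
}

// Copies the buffer back to the host and checks that every element is zero.
bool zeroed_buffer_reads_back_as_zero(std::size_t size) {
    unsigned *device = alloc_zeroed<unsigned>(size);
    if (device == nullptr) return false;
    // Fill the host copy with a non-zero pattern so a failed copy is detected.
    std::vector<unsigned> host(size, 0xffffffffu);
    bool ok = cudaMemcpy(host.data(), device, size * sizeof(unsigned),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(device);
    for (unsigned value : host) ok = ok && (value == 0u);
    return ok;
}
```

Calling `zeroed_buffer_reads_back_as_zero(1024)` from a host `main` should return `true` on a working device. Note that the extra `cudaMemset` only silences the report; I have not checked whether the uninitialized reads ever influence the compressed output.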
Here is an example of the initcheck error:
For racecheck, the warnings can be removed by adding `__syncwarp()` in cuda_encoder.inl after line 355. I think the warnings occur because some threads might already reach line 323 in the second iteration while other threads are still at line 345 in the first iteration, after the `__syncwarp()` on line 326. Is that correct?
Here is an example of the racecheck warning:
========= WARN: Race reported between Read access at 0x00000f20 in void ndzip::detail::gpu_cuda::decompress_block<ndzip::detail::profile<float, unsigned int=1>>(floatbits_type const *, ndzip::slice<ndzip::detail::gpu_cuda::decompress_block<ndzip::detail::profile<float, unsigned int=1>::data_type>, __scope__(dimensions)>)
========= and Write access at 0x00000b30 in void ndzip::detail::gpu_cuda::decompress_block<ndzip::detail::profile<float, unsigned int=1>>(floatbits_type const *, ndzip::slice<ndzip::detail::gpu_cuda::decompress_block<ndzip::detail::profile<float, unsigned int=1>::data_type>, __scope__(dimensions)>) [21864 hazards]
=========
Here is the code change:
--- ndzip/src/ndzip/cuda_encoder.inl
+++ ndzip/src/ndzip/cuda_encoder.inl
@@ -353,6 +353,7 @@
__builtin_memcpy(&row_bits, row, sizeof row_bits);
hc.store(item, row_bits);
}
+ __syncwarp();//<-------ADD
} else {
// TODO duplication of the `item` calculation above. The term can be simplified!
for (index_type w = 0; w < warps_per_col_chunk; ++w) {
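To make the pattern I suspect concrete, here is a minimal standalone sketch. It is not taken from ndzip: the kernel `rotate_sum` and the host check `rotate_sum_matches_host` are hypothetical. Each loop iteration has a write phase and a read phase on the same shared-memory buffer. The first `__syncwarp()` orders writes before reads within one iteration; the second one, at the end of the loop body, keeps a fast lane from starting the next write phase while a slower lane is still reading. This matters on Volta and newer GPUs, where lanes of a warp are not guaranteed to run in lockstep.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int warp_size = 32;

// Hypothetical kernel (not ndzip code), launched with a single warp.
__global__ void rotate_sum(const int *in, int *out, int iterations) {
    __shared__ int scratch[warp_size];
    const int lane = threadIdx.x;
    int acc = 0;
    for (int i = 0; i < iterations; ++i) {
        scratch[lane] = in[i * warp_size + lane];  // write phase
        __syncwarp();  // writes must be visible before any lane reads
        acc += scratch[(lane + 1) % warp_size];    // read phase
        __syncwarp();  // all reads must finish before the next write phase
    }
    out[lane] = acc;
}

// Runs the kernel on one warp and compares against a host-side reference.
bool rotate_sum_matches_host(int iterations) {
    const int n = iterations * warp_size;
    std::vector<int> h_in(n), h_out(warp_size, -1);
    for (int k = 0; k < n; ++k) h_in[k] = k;

    int *d_in = nullptr;
    int *d_out = nullptr;
    if (cudaMalloc(reinterpret_cast<void **>(&d_in), n * sizeof(int)) != cudaSuccess) return false;
    if (cudaMalloc(reinterpret_cast<void **>(&d_out), warp_size * sizeof(int)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    bool ok = cudaMemcpy(d_in, h_in.data(), n * sizeof(int),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    rotate_sum<<<1, warp_size>>>(d_in, d_out, iterations);
    // cudaMemcpy waits for the kernel on the default stream to finish.
    ok = ok && cudaMemcpy(h_out.data(), d_out, warp_size * sizeof(int),
                          cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_in);
    cudaFree(d_out);

    for (int lane = 0; lane < warp_size; ++lane) {
        int expected = 0;
        for (int i = 0; i < iterations; ++i) {
            expected += h_in[i * warp_size + (lane + 1) % warp_size];
        }
        ok = ok && (h_out[lane] == expected);
    }
    return ok;
}
```

If the second `__syncwarp()` is removed, I would expect racecheck to report a read/write hazard on `scratch` similar to the one above, although the result may still come out correct on most runs.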
Thanks!
annymao changed the title from "cuda-memcheck, initcheck and racecheck failed" to "cuda-memcheck: initcheck and racecheck failed" on Feb 15, 2022