diff --git a/content/2.01_DeviceQuery.rst b/content/2.01_DeviceQuery.rst
index 6f85647..2f5bdd0 100644
--- a/content/2.01_DeviceQuery.rst
+++ b/content/2.01_DeviceQuery.rst
@@ -14,9 +14,19 @@ First, we want to ask API how many CUDA+capable devices are available, which is
 
 .. signature:: |cudaGetDeviceCount|
 
-    .. code-block:: CUDA
-
-        __host__ ​__device__​ cudaError_t cudaGetDeviceCount(int* numDevices)
+    .. tabs::
+
+        .. tab:: CUDA
+
+            .. code-block:: c++
+
+                __host__ __device__ cudaError_t cudaGetDeviceCount(int* numDevices)
+
+        .. tab:: HIP
+
+            .. code-block:: c++
+
+                __host__ __device__ cudaError_t cudaGetDeviceCount(int* numDevices)
 
 The function calls the API and returns the number of the available devices in the address provided as a first argument.
 There are a couple of things to notice here.
@@ -36,19 +46,42 @@ To populate the |cudaDeviceProp| structure, CUDA has |cudaGetDeviceProperties| f
 
 .. signature:: |cudaGetDeviceProperties|
 
-    .. code-block:: c++
+    .. tabs::
+
+        .. tab:: CUDA
+
+            .. code-block:: c++
+
+                __host__ cudaError_t cudaGetDeviceProperties(cudaDeviceProp* prop, int deviceId)
+
+        .. tab:: HIP
+
+            .. code-block:: c++
+
+                __host__ cudaError_t cudaGetDeviceProperties(cudaDeviceProp* prop, int deviceId)
 
-        __host__​ cudaError_t cudaGetDeviceProperties(cudaDeviceProp* prop, int deviceId)
 
 The function has a |__host__| specifier, which means that one can not call it from the device code.
 It also returns |cudaError_t| structure, which can be |cudaErrorInvalidDevice| in case we are trying to get properties of a non-existing device (e.g. when ``deviceId`` is larger than ``numDevices`` above).
 The function takes a pointer to the |cudaDeviceProp| structure, to which the data is saved and an integer index of the device to get the information about.
 The following code should get you an information on the first device in the system (one with ``deviceId = 0``).
 
-.. code-block:: c++
 
-    cudaGetDeviceProp prop;
-    cudaGetDeviceProperties(&prop, 0);
+.. tabs::
+
+    .. tab:: CUDA
+
+        .. code-block:: c++
+
+            cudaGetDeviceProp prop;
+            cudaGetDeviceProperties(&prop, 0);
+
+    .. tab:: HIP
+
+        .. code-block:: c++
+
+            cudaGetDeviceProp prop;
+            cudaGetDeviceProperties(&prop, 0);
 
 Exercise
 --------
diff --git a/content/4.01_FromCUDAToHIP.rst b/content/4.01_FromCUDAToHIP.rst
new file mode 100644
index 0000000..1ef9343
--- /dev/null
+++ b/content/4.01_FromCUDAToHIP.rst
@@ -0,0 +1,12 @@
+.. _cuda_to_hip:
+
+From CUDA to HIP
+================
+
+1. Performance and performance portability
+------------------------------------------
+
+2. Tools for HIPification
+-------------------------
+
+
diff --git a/content/conf.py b/content/conf.py
index 2fe3a21..ff936dc 100644
--- a/content/conf.py
+++ b/content/conf.py
@@ -17,7 +17,7 @@
 
 # -- Project information -----------------------------------------------------
 
-project = "CUDA training materials"
+project = "Heterogeneous programming with CUDA/HIP"
 copyright = "2021, Artem Zhmurov and individual contributors."
 author = "Artem Zhmurov and individual contributors."
 github_user = "ENCCS"
diff --git a/content/index.rst b/content/index.rst
index bfc57b2..e300f9e 100644
--- a/content/index.rst
+++ b/content/index.rst
@@ -1,5 +1,5 @@
-CUDA training
-=============
+Heterogeneous programming with CUDA/HIP
+=======================================
 
 
 Intro
@@ -22,6 +22,7 @@ Intro
    2.04_HeatEquation
    3.01_ParallelReduction
    3.02_TaskParallelism
+   4.01_FromCUDAToHIP
 
 
 .. toctree::