[Issue]: HIPCC: wrong kernel launch symbol interpretation #149

Vomartins · 2024-08-23T13:01:13Z

Problem Description

I am a collaborator on the OPM project (https://opm-project.org/) and am currently working on offloading the well solver portion of the code to AMD GPUs. One of OPM's prerequisites is DUNE (https://dune-project.org/). During OPM compilation with HIPCC, I encountered the following error:

This appears to be a misinterpretation, where HIPCC is likely interpreting the output operator overload as the HIP kernel launch chevron symbol. I added a blank space between '<<' and '<>', and the compilation completed successfully. It would be ideal if HIPCC could distinguish that, in this context, it is not the chevron symbol.

Operating System

Debian GNU/Linux 12 (bookworm)

CPU

AMD Ryzen 9 7900 12-Core Processor

GPU

AMD Radeon Pro W7900

ROCm Version

ROCm 6.1.0

ROCm Component

HIPCC

Steps to Reproduce

Install ROCm;
Clone OPM repositories (https://opm-project.org/?page_id=231);
Following the instructions from the link above except for opm-simulators;
Make sure that ROCm path is present in the environment variables, in PATH or with CMAKE_PREFIX_PATH;
For opm-simulators configure the build with the option -DCMAKE_CXX_COMPILER=hipcc.

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

ROCk module version 6.7.0 is loaded

HSA System Attributes

Runtime Version: 1.13
Runtime Ext Version: 1.4
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: YES

==========
HSA Agents

Agent 1

Name: AMD Ryzen 9 7900 12-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 9 7900 12-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 5482
BDFID: 0
Internal Node ID: 0
Compute Unit: 24
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 65547084(0x3e82b4c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 65547084(0x3e82b4c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 65547084(0x3e82b4c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:

Agent 2

Name: gfx1100
Uuid: GPU-07071a9c2b9e8ff7
Marketing Name: AMD Radeon PRO W7900
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 6144(0x1800) KB
L3: 98304(0x18000) KB
Chip ID: 29768(0x7448)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1760
BDFID: 768
Internal Node ID: 1
Compute Unit: 96
SIMDs per CU: 2
Shader Engines: 6
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 202
SDMA engine uCode:: 20
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 47169536(0x2cfc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 47169536(0x2cfc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1100
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32

Additional Information

compile-log.txt
configuration-log.txt
hip-config.txt
config-h-cmake.txt

The text was updated successfully, but these errors were encountered:

Vomartins changed the title ~~[Issue]: HIPCC wrong kernel launch symbol interpretation~~ [Issue]: HIPCC: wrong kernel launch symbol interpretation Aug 23, 2024

lamb-j added the hipcc Related to HIPCC label Oct 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Issue]: HIPCC: wrong kernel launch symbol interpretation #149

[Issue]: HIPCC: wrong kernel launch symbol interpretation #149

Vomartins commented Aug 23, 2024

[Issue]: HIPCC: wrong kernel launch symbol interpretation #149

[Issue]: HIPCC: wrong kernel launch symbol interpretation #149

Comments

Vomartins commented Aug 23, 2024

Problem Description

Operating System

CPU

GPU

ROCm Version

ROCm Component

Steps to Reproduce

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

ROCk module version 6.7.0 is loaded

HSA System Attributes

========== HSA Agents

Additional Information

==========
HSA Agents