Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Issue]: Can't link to functions with vector types between HIP and VS projects #3659

Open
ET3D-ET3D opened this issue Nov 5, 2024 · 8 comments

Comments

@ET3D-ET3D
Copy link

ET3D-ET3D commented Nov 5, 2024

Problem Description

I'm using HIP 6.1.2 on Windows with Visual Studio 2019.

I have a HIP project (converted from CUDA) which creates a static library. The library contains the kernels and shallow functions to call them. I also have a normal executable Visual C++ project which calls the functions which runs the kernel. (It's actually more complex, but that's the gist of it.)

When the calling functions contain a vectory type, such as float2 or int3, linkage fails. It looks like these types are defined on one side as (for example) HIP_vector_type<float,2> and on the other side as union. I'm not sure where this comes from or how to deal with it.

Edit: Looking further into amd_hip_vector_types.h, it looks like both definitions come from there, but I'm not sure how to get one or the other.

Operating System

Windows 11 23H2

CPU

Ryzen 7435HS

GPU

Radeon 7600S

ROCm Version

ROCm 6.1.0

ROCm Component

HIP

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

Note that the ROCm version I selected is irrelevant. I installed HIP 6.1.2 in Windows.

@ppanchad-amd
Copy link

Hi @ET3D-ET3D. Internal ticket has been created to investigate your issue. Thanks!

@schung-amd
Copy link

Hi @ET3D-ET3D, can you provide the error(s) you're running into, and some code to reproduce this if possible? There are some known issues with HIP vector types, but not sure if your issue matches these.

@ET3D-ET3D
Copy link
Author

I wasn't able to easily reproduce it. I will play with it some more. It would be helpful to know how to select either of the two representations or what causes HIP to choose one or the other.

@ET3D-ET3D
Copy link
Author

ET3D-ET3D commented Nov 7, 2024

So far what I understand is that this is affected by __has_attribute being defined in clang but not in msvc (though again, I haven't managed to reproduce it in a simple sample yet). If I try to undef __has_attribute, it fails in math_fwd.h, because that assumes that a combination of __clang__ and __HIP__ means that HIP_vector_base exists, rather than checking on __has_attribute . ockl_image.h also assumes that float4::Native_vec_ exists without checking anything.

Edit: texture_fetch_functions.h also uses HIP_vector_type without checking anything.

@yxsamliu
Copy link
Contributor

yxsamliu commented Nov 7, 2024

Usually the kernels are expected to be launched by HIP programs that are compiled by clang or hipcc.

How do you launch the kernels in a C++ program compiled by MSVC?

@ET3D-ET3D
Copy link
Author

Each kernel has a caller function. These are compiled as part of the HIP project. Those functions are then called by the MSVC program. The linkage problem happens when these caller functions have vector arguments, as these are different in the HIP project and MSVC project.

@yxsamliu
Copy link
Contributor

It seems https://github.com/ROCm/clr/tree/amd-staging/hipamd/include/hip/amd_detail needs to be fixed to make the vector type mangling consistent across clang and MSVC. Similar issues also happen between clang and gcc.

Basically for clang float3 is defined as a specialization of template struct HIP_vector_type

template<typename T, unsigned int rank>
    struct HIP_vector_type {};
using float3 = HIP_vector_type<float, 3>;

whereas for MSVC float3 is defined as a union

typedef union {
  struct {
    float x;
    float y;
    float z;
  };
  float data[3];
} float3;

They have different mangling on Windows https://godbolt.org/z/hacGMovv6 https://godbolt.org/z/ehzM9dcEc

As a result, any functions with float3 etc as arguments will have different mangled names in objects generated by clang and MSVC, which causes them not callable across objects compiled by different compilers.

The fix is to make them have the same mangled name for clang, gcc and MSVC. Note using and typedef does not change the mangled name, so the real type has to be consistent. One possible fix might be to use clang's definition for all compilers, with minor adjustment for gcc and MSVC.

@ET3D-ET3D
Copy link
Author

I think that the idea of using HIP_vector_type for all compilers is a good one.

Though I want to mention that it did conflict with my use of CUDA's helper_math.h (part of its sample code which I found helpful to use). HIP_vector_type defines some of the operators defined in that header, which I had to make __host__ only, rather than __host__ __device__. Which likely means that I'll have to use different definitions for the NVIDIA and AMD sides. This is not a problem with the union defs. Still, not a big deal, HIP_vector_type feels more expressive and more along the lines of how I personally tend to code things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants