-
Notifications
You must be signed in to change notification settings - Fork 329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix RISC-V Linux build #212
Conversation
Cpuinfo was failing to build on RISC-V Linux distributions, e.g., Ubuntu 23.10, as it includes a header file sys/hwprobe.h that is not yet provided by glibc (although it is provided by bionic). We fix the issue by only including sys/hwprobe.h when building for Android, and invoking the hwprobe syscall directly on other Linux distributions. The Android specific check can be removed in the future once sys/hwprobe.h becomes available in glibc.
@prashanthswami Could I ask you to verify that this patch works correctly on Android and that it doesn't change the hwprobe behaviour on Android builds in any way. It shouldn't do but I don't have a way to test this. |
Confirming - I can build this for Android RISC-V and verified that cpuinfo and isainfo both return properly. I also built against a toolchain that uses an older kernel header and it successfully built. |
Great. Thanks. Could you also possibly confirm that we're still taking the bionic path on Android, e.g., that we're executing __riscv_hwprobe here? |
Yup - in the previous run, I had added some logs to ensure that Android was running through the hwprobe path to make sure we weren't falling back to the direct-syscall path. |
Thanks! |
@malfet, @prashanthswami what is the process for getting this PR reviewed and merged? I'm a little concerned that if we leave it much longer pytorch itself might stop building on standard Linux distributions of RISC-V. This could happen if pytorch were to update its submodule to include #190 but not this patch. Pytorch seems to have updated its cpuinfo submodule recently, but luckily it was updated to point to the patch before #4e5be9e. The next update will break the RISC-V builds unless this patch is merged. |
@markdryan - there's some context in #210 but TL;DR this project doesn't have an official maintainer at the moment and malfet doesn't have the bandwidth or hardware setup to review patches. I wasn't able to get into the Slack so I've started a conversation on the PyTorch discussion forums here, mentioning this patch as something that's eventually going to break a future roll: https://discuss.pytorch.org/t/maintenance-of-pytorch-cpuinfo/194722/1 I'll defer to Nikita to comment on what or if we can do anything in the meanwhile (though I'm happy to lend help to validate any patches) - if possible I'd like to see if we can land this just to avoid the breakage and then pause any future work until we figure out how this project gets resourced with the PyTorch group. |
@prashanthswami Can we start an email thread instead? Also, what error are you seeing when trying to join slack? |
Ah sorry - "I wasn't able to get in" was a poor choice of words, I meant "I have not yet gotten an invitation link". I have the Google form confirmation of the request from the beginning of December, but haven't gotten feedback on a yes/no. Of course, joining the Slack is only for the purpose of the discussion thread. I was redirected to post in https://dev-discuss.pytorch.org/t/maintenance-of-pytorch-cpuinfo/1767 - I'm happy to start an email thread if that's more appropriate though. What alias should be included from the PyTorch team? |
@prashanthswami feel free to just email me (nshulga at meta) and I'll add the respective folks
|
Unfortunately, skipping if the headers are not available is not really an option, as they are likely to be unavailable on build machines for the foreseeable future. The sys/hwprobe.h header file does not yet exist on any glibc based Linux distributions as the glibc patches for hwprobe have not yet been merged. This is why cpuinfo does not currently build on RISC-V Ubuntu and Fedora distributions. On these distributions we could try including the kernel header file asm/hwprobe.h instead (which is the source of the constants in this PR), but there's another problem. The asm/hwprobe.h file will only exist if the build machine is running a 6.4 kernel or greater. As none of the RISC-V Ubuntu LTS releases have a 6.4 kernel (Ubuntu 22.04 has a 5.19 kernel) and cpuinfo binaries are likely to be built on the oldest distribution possible (x86_64 pytorch builds seem to be done on manylinux_2014 which is based on Centos 7 (which as far as I can tell has a 3.10 kernel)) it's unlikely that asm/hwprobe.h is going to be available at build time for a while, which would mean cpuinfo binaries would not be able to take advantage of hwprobe even when running on devices with the latest kernel, that support hwprobe. There's also another complication, which admittedly is more of a problem for a future patch that uses hwprobe to detect extensions, but is worth mentioning in the context of this discussion. The hwprobe API and headers files are being updated with each new kernel release. Support for Zba, Zbb, Zbs an V was added in the 6.5 kernel. Support for Zicboz was added in 6.7 and patches are pending to add support for many other extensions. Each new kernel version adds new constants to the asm/hwprobe.h. So let's say you build cpuinfo on a distribution with a 6.4 kernel. The asm/hwprobe.h will be available but it won't contain the constants for Zba, Zbb, Zbs, V or Zicboz. So if you want to use hwprobe to detect these extensions and you want your code to be buildable on a wide range of distributions or just older distributions, you have no option but to include the definitions for the constants in your code. And it's worth pointing out here that if you want to detect things like Zba, Zbb, Zbs and Zicboz and other multi-letter extensions, of which there are many, you have no option but to use hwprobe. You cannot simply fall back to hwcap if the hwprobe header files are not available, as hwcap cannot detect multi-letter extensions. For all these reasons other projects that use hwprobe tend to duplicate the kernel constants in their own code. Also note that there is a precedent for including constants from kernel header files in the cpuinfo code. See
In summary then
Right, I wasn't sure about this bit. Using __has_include seemed to be the simplest solution. I'll update the patch to use CMake to check for the hwprobe headers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it unblock you (and affects only RiscV), let's land it as is and fix forward
Cpuinfo was failing to build on RISC-V Linux distributions, e.g., Ubuntu 23.10, as it includes a header file sys/hwprobe.h that is not yet provided by glibc (although it is provided by bionic). We fix the issue by only including sys/hwprobe.h when building for Android, and invoking the hwprobe syscall directly on other Linux distributions. The Android specific check can be removed in the future once sys/hwprobe.h becomes available in glibc.