-
Notifications
You must be signed in to change notification settings - Fork 329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve support for RISC-V architecture on Linux #148
base: main
Are you sure you want to change the base?
Conversation
Hi @GlassOfWhiskey! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at [email protected]. Thanks! |
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks! |
Hi, |
ping. any update on when this will land? cc: @Maratyszcza |
src/riscv/linux/cpuinfo.c
Outdated
processor->features |= CPUINFO_RISCV_LINUX_FEATURE_F; | ||
#endif | ||
} else if(*feature == 'i') { | ||
#if CPUINFO_ARCH_RISCV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this line not have CPUINFO_ARCH_RISCV64
?
torvalds/linux@16252e0 maybe helpful in revising this patch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reformat consistently with the rest of cpuinfo. Use tabs for indentation.
f0b2e44
to
876be35
Compare
f0f4bda
to
375112e
Compare
@malfet I've seen that a recent PR (#190) on RISC-V support has been merged into the main branch. However, I think this PR can still be useful for some reasons:
It is still not clear to me why @prashanthswami chose to open a sibling PR instead of converging the efforts on a single one. However, I tried to manually merge #190 into this PR, applying some minor modifications when deemed necessary. |
@GlassOfWhiskey can you please amend the title and expand description to nightlight what this PR is trying to accomplish. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Please stick to the same formatting rules
- Refactor
cpuinfo_smallfile_callback
into the separate PR
@@ -88,7 +88,7 @@ inline static bool is_whitespace(char c) { | |||
static const uint32_t default_max_processors_count = CPU_SETSIZE; | |||
#endif | |||
|
|||
static bool uint32_parser(const char* text_start, const char* text_end, void* context) { | |||
static bool uint32_parser(const char* filename, const char* text_start, const char* text_end, void* context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense, but do you mind submitting a separate PR to accomplish that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created #211 with these improvements. I'll rebase as soon as it will be merged.
@GlassOfWhiskey - I sincerely apologize, this was poor GitHub practice on my part. I had written my PR a few months ago on an internal fork and needed it up streamed somewhat quickly, so I pushed it without analyzing the open PRs. When I looked at this issue last, it hadn't been active for a few months, so I was worried it would be inappropriate to ping here and ask for an accelerated landing, just to satisfy my own project timelines. In hindsight, I see that it would have been courteous to message here and coordinate, or CC you on the other PR. I'd like to make sure I don't do this again - should I CC you on any open PRs I have for RISC-V contributions to this repo? |
#else | ||
return false; | ||
#endif | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this removed because it's not something folks need to switch on? If so, I'm also curious if we want to take this opportunity to remove everything that's not optional. I was a bit over-zealous when adding all the extensions, but when I was chatting about this with some folks they pointed out that the likelihood of anyone showing up with a chipset that doesn't support IMAFD is basically zero. Should we just limit this whole structure to 'c' and 'v' for now?
I'm not too strongly opinionated about this, but more surface area = more maintenance in the long-term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this because it was not used in any place, and in particular it was never captured by the cpuinfo_riscv_linux_decode_isa_from_hwcap
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see everything removed that is not immediately useful. Attempting to run without 'base isa' is not useful.
src/riscv/linux/init.c
Outdated
/* Populate cache information structures in l2 */ | ||
uint32_t l2_index = UINT32_MAX; | ||
for (uint32_t processor = 0; processor < valid_processors_count; processor++) { | ||
struct cpuinfo_cache dummy_l1i, dummy_l1d, temp_l2 = { 0 }; | ||
cpuinfo_riscv_decode_cache( | ||
riscv_linux_processors[processor].core.uarch, | ||
&dummy_l1i, &dummy_l1d, &temp_l2); | ||
|
||
if (temp_l2.size != 0) { | ||
if (riscv_linux_processors[processor].package_leader_id == riscv_linux_processors[processor].processor.linux_id) { | ||
l2_index += 1; | ||
l2[l2_index] = (struct cpuinfo_cache) { | ||
.size = temp_l2.size, | ||
.associativity = temp_l2.associativity, | ||
.sets = temp_l2.sets, | ||
.partitions = temp_l2.partitions, | ||
.line_size = temp_l2.line_size, | ||
.flags = temp_l2.flags, | ||
.processor_start = processor, | ||
.processor_count = riscv_linux_processors[processor].package.processor_count, | ||
}; | ||
} | ||
processors[processor].cache.l2 = l2 + l2_index; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't we do this in the loop above, right where we commit the L1 information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No because we don't know the value of l2_count
. THe program first computes how many l2
caches are present, then allocates the l2
data structure and populates it. We can change the logic but it will behave diffferently from what it does now.
/* In case of failure, report unknown. */ | ||
*vendor = cpuinfo_vendor_unknown; | ||
*uarch = cpuinfo_uarch_unknown; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to remove this, can we change this function to report a bool success or failure? The caller should know whether or not to trust the information in the structure.
#include <riscv/api.h> | ||
#include <riscv/linux/api.h> | ||
|
||
#include <linux/version.h> | ||
|
||
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6,4,0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So . . . .philosophical question to @malfet @GlassOfWhiskey . I know current hardware is not all at 6.4 (I guess, to be more fair, most of it isn't), but hwprobe is supposed to be the "proper" way of querying this information. I think it's fair to say that no one likes the prospect of trying to string-parse a proc-file to infer data.
This actually goes back to my comment on the other file about whether we should take this opportunity to just remove all the features that aren't useful to detect (e.g. why do we want to detect 'm', literally everyone will have 'm'). If we do it that way, then the question becomes "if you're only here to detect, say "V", well V isn't available till Linux 6.5, so you must have hwprobe". What do you both think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This #ifdef does not fix the RISC-V build on machines with a 6.4 kernel or greater. When compiling this branch on a RISC-V 64 Ubuntu 23.10 VM, which has a 6.5 kernel, I get the following errors
cpuinfo/src/riscv/linux/riscv-hw.c:11:10: fatal error: sys/hwprobe.h: No such file or directory
11 | #include <sys/hwprobe.h>
| ^~~~~~~~~~~~~~~
compilation terminated.
It's not clear to me where sys/hwprobe.h is supposed to come from. @prashanthswami I think you added this originally. Were you testing with your own build of glibc?
Also, IIUC, with this compile time check on the kernel version, hwprobe will only be used if both the build machine and the system running the binary have a 6.4 kernel or greater. If we build on Ubuntu 22.04 for example but run on 23.10 the hwprobe path would not be taken here even though 23.10 has a 6.5 kernel with hwprobe support.
Other projects that are using hwprobe right now get around this problem by making the syscall directly. OpenJDK duplicates the constants from the kernel headers needed for the sys call and then does a direct syscall. It doesn't include the kernel header files at all. A nicer solution would be to first check to see if the <asm/hwprobe.h> exists either using a build system check that defines a macro if a program that includes <asm/hwprobe.h> builds or perhaps using the trick the glibc patch uses
#ifdef __has_include
# if __has_include (<asm/hwprobe.h>)
# include <asm/hwprobe.h>
# endif
#endif
Regardless of the method used, if the include file is not found, manually defined constants for the hwprobe syscall number and the various bitmasks can be defined, so the syscall can still be issued, regardless of the kernel version on the build machine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My primary use-case was Android on RISC-V, so I likely overcorrected here by testing primarily against the Android NDK (note, there's no official NDK with RISC-V support yet, but the public canary builds work). This header is available in bionic, so building against the NDK does not complain.
The proposal to check for the header explicitly seems reasonable enough to me and there's precedent for it.
As a side note - we should indeed have a CI bot for this. I can't add one for Android just yet, because the common Github workflows don't support canary Android NDKs, but I've filed #206 to do so when it is. Separately, it would probably be valuable to run a linux-riscv64 build to catch stuff like this, so I'll figure out that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that explains it. It looks like you're pulling in this header which contains a definition of __riscv_hwprobe that appears to be compatible with the pending glibc patch. Ultimately, the thing to do would be to check for <sys/hwprobe.h> and, if it's not present, fall back to code that makes the syscall directly. However, perhaps until the glibc patch is merged, <sys/hwprobe.h> should only be included when building for Android. A manual riscv_hwprobe syscall could then be used for other Linux distributions for the time being.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@malfet, @GlassOfWhiskey , @prashanthswami The build issues on mainline Linux distributions are still not addressed by this patch. It hasn't been possible to build pytorch/cpuinfo on a RISC-V device running Ubuntu 23.10 for over 3 weeks and this patch in its current form won't help. Would there by any objections from anyone if I submit a PR to fix the build issues? The PR would only address the build issues to avoid introducing too many conflicts. Other issues with hwprobe, like the fact we're calling it far too often in most cases, could be addressed in a subsequent PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markdryan no problem from my side. Also because probably things will go a bit longer in approving this PR as many aspects are involved.
I missed what was the preferred solution among the ones proposed so I didn't do anything on that direction for now. Feel free to proceed as you prefer.
The __has_include
path is probably the most elegant one. It is not fully C standard, but it is now part of the C++ standard since C++17 so I'd vote for that one.
@prashanthswami no problem, it was just a bit of work to merge all the things, but the important thing is to provide community with RISC-V support, no matter how or who :) As a general comment @prashanthswami @malfet: if you care so much about the code style and coherence, why don't you or someone in the maintainers team put a |
In regards to style-adherence, I'll defer to @malfet here, I'm only along for the ride :) my format comments were largely just parroting earlier comments Marat / Nikita gave me. Supportive though of the idea, anything that lets my IDE auto-switch to the right format is desirable. Does PyTorch have top-level guidance about what is considered proper code format? |
#include <riscv/api.h> | ||
#include <riscv/linux/api.h> | ||
|
||
#include <linux/version.h> | ||
|
||
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6,4,0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This #ifdef does not fix the RISC-V build on machines with a 6.4 kernel or greater. When compiling this branch on a RISC-V 64 Ubuntu 23.10 VM, which has a 6.5 kernel, I get the following errors
cpuinfo/src/riscv/linux/riscv-hw.c:11:10: fatal error: sys/hwprobe.h: No such file or directory
11 | #include <sys/hwprobe.h>
| ^~~~~~~~~~~~~~~
compilation terminated.
It's not clear to me where sys/hwprobe.h is supposed to come from. @prashanthswami I think you added this originally. Were you testing with your own build of glibc?
Also, IIUC, with this compile time check on the kernel version, hwprobe will only be used if both the build machine and the system running the binary have a 6.4 kernel or greater. If we build on Ubuntu 22.04 for example but run on 23.10 the hwprobe path would not be taken here even though 23.10 has a 6.5 kernel with hwprobe support.
Other projects that are using hwprobe right now get around this problem by making the syscall directly. OpenJDK duplicates the constants from the kernel headers needed for the sys call and then does a direct syscall. It doesn't include the kernel header files at all. A nicer solution would be to first check to see if the <asm/hwprobe.h> exists either using a build system check that defines a macro if a program that includes <asm/hwprobe.h> builds or perhaps using the trick the glibc patch uses
#ifdef __has_include
# if __has_include (<asm/hwprobe.h>)
# include <asm/hwprobe.h>
# endif
#endif
Regardless of the method used, if the include file is not found, manually defined constants for the hwprobe syscall number and the various bitmasks can be defined, so the syscall can still be issued, regardless of the kernel version on the build machine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple more comments about the current use of riscv_hwprobe. The use of hwprobe I'm commenting on here was introduced in an earlier PR but as we're having the conversation about hwprobe here I thought it worth mentioning. Happy to copy these comments to an issue if this is preferred.
- We're calling hwprobe once for each processor to retrieve the vendor id and uarch. If we have lots of processors this will result in lots of syscalls (and lots of failures on kernels older than 6.4). There's a good chance that all the processors will have the same vendor and uarch, so it's probably worth trying to retrieve the vendor and uarch for all the processors we're interested in a single syscall, outside of a loop. If -1 is returned for either value, different processors do have different values for uarch and vendor id and so they would need to be queried on a processor by processor basis as is currently done.
- We don't use hwprobe to detect the ISA extensions. We're only using hwcap, which cannot detect multi-letter extensions. It would be better to check for extensions with hwprobe first, falling back to hwcap if hwprobe is not available. This could be done in the same call used to retrieve the vendor and uarch for all processors as recommend in point 1 above. Using riscv_hwprobe in this way would allow us to detect Zba, Zbb and Zbs (and more extensions in the future).
087dfdb
to
39dc29d
Compare
No description provided.