arm_float_to_q31: why not to use VCVT.F32.S32 on Cortex-M4F #212

AlanCui4080 · 2024-10-21T06:20:35Z

Hi,

As what we know, Cortex-M4F implemented a little set of FPU instructions including VCVT.F32.S32, and GCC did have a builtin intrinsic for it.
But the question is why the intrinsic is only enabled when NEON available in GCC, and also, why not to use it in arm_float_to_q31.

Alan.

christophe0606 · 2024-10-21T08:34:51Z

@AlanCui4080 I don't see any reason. Perhaps the function was first developped for M0 and was not upgraded to support all other architectures.

AlanCui4080 · 2024-10-21T08:55:03Z

@christophe0606

Sorry for my mistake, it's ok to include arm_neon.h on M4F, even there is only a subset of NEON implemented on M4F, but the vcvt_s32_f32 will call a SIMD instruction "FCVTZS Vd.2S,Vn.2S" which is invalid on M4F. The only vaild one is "VCVT.F32.S32 Sd,Sm #fbits" included in FPv4-SP.
I'm testing it on my STM32G4, if that be ok, i will put a pull request.

christophe0606 · 2024-10-21T09:04:41Z

@AlanCui4080 I won't include arm_neon.h for M4F.

There are thus two possibilities : this intrinsics is supported by the Arm C language extensions (ACLE). Unfortunately, there are still too many compilers that are not fully implementing all of the ACLE.

That's why most of the intrinsics used by CMSIS-DSP are coming from CMSIS-Core (part of CMSIS-6). And probably, you'll need to open an issue on CMSIS-Core if you want this new intrinsic to be supported

AlanCui4080 · 2024-10-21T09:29:48Z

@christophe0606
I figure out it, ACLE have no single precision version (FPv4-SP) for vcvt_s32_f32, it's either d-register or q-register in ACLE.
So I do believe someone forget to add it into ACLE, because this instruction is unpopular and not wide used.

And the CMSIS-Core seems to be not include any part of FPU instructions, where should i put it in.

Note: following inline asm is proved usable as a replacement of arm_float_to_q31.

asm("vcvt.s32.f32 %0, %0, #31": "+t"(thetaf_divpi.f)::);

AlanCui4080 · 2024-10-31T09:35:20Z

@christophe0606
Hi,
Is that ok to add a inline assembly specilized for M4? If so, i will open a pull request.

christophe0606 · 2024-10-31T09:41:00Z

@AlanCui4080 asm definitions are provided through CMSIS-Core by defining new intrinsics macros in the compiler headers

So you can try to open a github issue on the CMSIS repository.

JonatanAntoni · 2024-11-04T11:11:11Z

@christophe0606, @AlanCui4080, where does this new instrinsic belong to?
Is this one the correct location? https://github.com/ARM-software/CMSIS_6/blob/8c4dc58928b3347f6aa98b6fb2bf6770f32a72b7/CMSIS/Core/Include/cmsis_gcc.h#L877-L992

Please feel free to raise a PR adding the required define. Please bare in mind that this is location not specific to Cortex-M4F. Nor is is specialized for Arm-v7EM but is pulled for A- and R-class as well. If this is strictly for Arm-v7EM, it would need to go into https://github.com/ARM-software/CMSIS_6/blob/main/CMSIS/Core/Include/m-profile/cmsis_gcc_m.h et al.

christophe0606 · 2024-11-04T11:44:02Z

@JonatanAntoni @AlanCui4080 It must be added to each compiler and would be for Cortex-M only. (Since the intrinsics is defined with Neon header).

I don't have compiler specific implementation (except in case of major compiler bug) in CMSIS-DSP so this new intrinsics should be made available for all compiler so that CMSIS-DSP can still be built with all of them.

Otherwise, it is now possible to define functions as WEAK for linker in CMSIS-DSP. So, another possibility is for a user to replace a specific function with a different implementation if for some reason we can't do it on our side.

AlanCui4080 · 2024-11-04T19:50:22Z

@christophe0606, @AlanCui4080, where does this new instrinsic belong to? Is this one the correct location? https://github.com/ARM-software/CMSIS_6/blob/8c4dc58928b3347f6aa98b6fb2bf6770f32a72b7/CMSIS/Core/Include/cmsis_gcc.h#L877-L992

@JonatanAntoni
No, that is not a DSP instruction, it's belong to FPU. I have never seen intrinsics for FPUs in CMSIS-Core.
I think that's because CMSIS-DSP is designed for fixed-point at beginning, and FPU instructions are duty of compliers.

Meanwhile, there is no complier include intrinsics for FPU instuctions (But have MVE&NEON) to itself. Well, so i think people all focused on newer architecture such one have MVE. So armv7 old FPU just be forgot like 8087 insturctions.

See https://developer.arm.com/documentation/ddi0439/b/BEHJADED, since there is no absolute value statement, square root statement, or q31 to float statement in C language, compliers will never generate such instructions to accelerate program running. For newlib, they wrote a inline asm for sqrtf() see https://github.com/bminor/newlib/blob/master/newlib/libm/machine/arm/e_sqrt.c, this fact also pointed out that "there is no complier include intrinsics for FPU instuctions". By the way, i didn't see they did the same thing for "FABS.F32" or "FCVT.XX.XX", may because just no one cared.

These FPU instructions are orphans, i have no idea about where to add them.

christophe0606 added question Further information is requested review Under review enhancement New feature or request and removed review Under review labels Oct 21, 2024

AlanCui4080 mentioned this issue Oct 31, 2024

VCVT.F32.S32: missing in intrinsics ARM-software/CMSIS_6#197

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

arm_float_to_q31: why not to use VCVT.F32.S32 on Cortex-M4F #212

arm_float_to_q31: why not to use VCVT.F32.S32 on Cortex-M4F #212

AlanCui4080 commented Oct 21, 2024

christophe0606 commented Oct 21, 2024

AlanCui4080 commented Oct 21, 2024 •

edited

Loading

christophe0606 commented Oct 21, 2024 •

edited

Loading

AlanCui4080 commented Oct 21, 2024 •

edited

Loading

AlanCui4080 commented Oct 31, 2024

christophe0606 commented Oct 31, 2024

JonatanAntoni commented Nov 4, 2024

christophe0606 commented Nov 4, 2024

AlanCui4080 commented Nov 4, 2024

arm_float_to_q31: why not to use VCVT.F32.S32 on Cortex-M4F #212

arm_float_to_q31: why not to use VCVT.F32.S32 on Cortex-M4F #212

Comments

AlanCui4080 commented Oct 21, 2024

christophe0606 commented Oct 21, 2024

AlanCui4080 commented Oct 21, 2024 • edited Loading

christophe0606 commented Oct 21, 2024 • edited Loading

AlanCui4080 commented Oct 21, 2024 • edited Loading

AlanCui4080 commented Oct 31, 2024

christophe0606 commented Oct 31, 2024

JonatanAntoni commented Nov 4, 2024

christophe0606 commented Nov 4, 2024

AlanCui4080 commented Nov 4, 2024

AlanCui4080 commented Oct 21, 2024 •

edited

Loading

christophe0606 commented Oct 21, 2024 •

edited

Loading

AlanCui4080 commented Oct 21, 2024 •

edited

Loading