rdrand: Avoid inlining unrolled retry loops.

The rdrand implementation contains three calls to `rdrand()`:

1. One in the main loop, for full words of output.
2. One after the main loop, for the potential partial word of output.
3. One inside the self-test loop.

In the first two cases, each call is unrolled into:

```
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
```

In the third case, the self-test loop, the same unrolling happens, but then the self-test loop itself is also unrolled, so the result is a sequence of 160 instructions.

With this change, each call to `rdrand()` now looks like this:

```
rdrand <register>
jb <success>
call retry
test rax, rax
jne ...
jmp ...
```

The loop in `retry()` still gets unrolled, though. Since rdrand will basically never fail, the `jb <success>` in each call is going to be predicted as taken, so the number of instructions executed on the common path doesn't change. However, instruction cache pressure should be reduced, because the unrolled retries now exist once in `retry()` instead of at every call site.
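
For readers without RDRAND hardware at hand, the hot/cold split described above can be sketched in portable Rust. This is a minimal illustration, not the code from this commit: `get_word`, `flaky`, and the closure-based `step` source are hypothetical stand-ins for `rdrand()` and `rdrand_step`, and `#[inline(never)]` is an extra assumption added here to make the out-of-line intent explicit (the commit itself uses only `#[cold]`).

```rust
use std::cell::Cell;

/// Total attempts, counting the first inline one (the diff uses 10).
const RETRY_LIMIT: usize = 10;

/// Hypothetical stand-in for `rdrand()`: `step` plays the role of
/// `rdrand_step` and returns `Some(word)` on success.
#[inline]
fn get_word<F: FnMut() -> Option<u64>>(mut step: F) -> Option<u64> {
    // Slow path kept out of line so call sites only carry one attempt.
    #[cold]
    #[inline(never)] // assumption: stricter than the commit, which uses only #[cold]
    fn retry<F: FnMut() -> Option<u64>>(step: &mut F) -> Option<u64> {
        // Start at 1 because the caller already tried once.
        for _ in 1..RETRY_LIMIT {
            if let Some(v) = step() {
                return Some(v);
            }
        }
        None
    }

    // Fast path: a single attempt, expected to succeed almost always.
    match step() {
        Some(v) => Some(v),
        None => retry(&mut step),
    }
}

/// Hypothetical test source: fails `failures` times, then yields 42.
/// `calls` counts how many attempts were made.
fn flaky(failures: usize, calls: &Cell<usize>) -> impl FnMut() -> Option<u64> + '_ {
    move || {
        let n = calls.get();
        calls.set(n + 1);
        if n < failures { None } else { Some(42) }
    }
}

fn main() {
    // Common case: the first attempt succeeds and `retry` is never entered.
    let calls = Cell::new(0);
    assert_eq!(get_word(flaky(0, &calls)), Some(42));
    assert_eq!(calls.get(), 1);

    // Worst case: every attempt fails, for RETRY_LIMIT attempts in total.
    let calls = Cell::new(0);
    assert_eq!(get_word(flaky(RETRY_LIMIT, &calls)), None);
    assert_eq!(calls.get(), RETRY_LIMIT);
}
```

The total number of attempts stays at 10 in both shapes; only where the nine retries live in the generated code changes.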
rust-random · May 31, 2024 · e28dd9c
1 parent b2bca0f
Showing 1 changed file with 20 additions and 10 deletions.
diff --git a/src/rdrand.rs b/src/rdrand.rs
@@ -14,20 +14,30 @@ cfg_if! {
     }
 }
 
-// Recommendation from "Intel® Digital Random Number Generator (DRNG) Software
-// Implementation Guide" - Section 5.2.1 and "Intel® 64 and IA-32 Architectures
-// Software Developer’s Manual" - Volume 1 - Section 7.3.17.1.
-const RETRY_LIMIT: usize = 10;
-
 #[target_feature(enable = "rdrand")]
 unsafe fn rdrand() -> Option<Word> {
-    for _ in 0..RETRY_LIMIT {
-        let mut val = 0;
-        if rdrand_step(&mut val) == 1 {
-            return Some(val);
+    #[cold]
+    unsafe fn retry() -> Option<Word> {
+        // Recommendation from "Intel® Digital Random Number Generator (DRNG) Software
+        // Implementation Guide" - Section 5.2.1 and "Intel® 64 and IA-32 Architectures
+        // Software Developer’s Manual" - Volume 1 - Section 7.3.17.1.
+
+        // Start at 1 because the caller already tried once.
+        for _ in 1..10 {
+            let mut val = 0;
+            if rdrand_step(&mut val) == 1 {
+                return Some(val);
+            }
         }
+        None
+    }
+
+    let mut val = 0;
+    if rdrand_step(&mut val) == 1 {
+        Some(val)
+    } else {
+        retry()
     }
-    None
 }
 
 // "rdrand" target feature requires "+rdrand" flag, see https://github.com/rust-lang/rust/issues/49653.