Absolut stands for "Autogenerated Bytewise SIMD-Optimized Look-Up Tables". The following is a breakdown of this jargon:
- Bytewise Lookup Table: One-to-one mappings between sets of bytes.
- SIMD-Optimized: Said lookup tables are implemented using SIMD (Single Instruction Multiple Data) instructions, such as `PSHUFB` on x86_64 and `TBL` on AArch64.
- Autogenerated: This crate utilizes procedural macros to generate (if possible) SIMD lookup tables given a human-readable byte-to-byte mapping.

SIMD instructions allow for greater data parallelism when performing table lookups on bytes. This has proved incredibly useful for high-performance data processing.
Unfortunately, SIMD table lookup instructions (or byte shuffling instructions) operate on tables too small to cover the entire 8-bit integer space. These tables typically have a size of 16 on x86_64, while on AArch64 tables of up to 64 elements are supported.
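To make the size limit concrete, here is a minimal scalar model of what a single 128-bit `PSHUFB` computes. The function name `pshufb_model` is made up for this illustration and is not part of Absolut; it only mirrors the documented behaviour of the instruction, where each index byte selects one of 16 table entries.

```rust
/// Scalar model of one 128-bit `PSHUFB` (hypothetical helper, for illustration).
/// Each index byte selects a table entry using only its low 4 bits, so the
/// table can hold at most 16 entries. An index with its top bit set yields 0.
pub fn pshufb_model(table: [u8; 16], indices: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for (dst, &idx) in out.iter_mut().zip(indices.iter()) {
        *dst = if idx & 0x80 != 0 {
            0
        } else {
            table[(idx & 0x0F) as usize]
        };
    }
    out
}
```

Because only 4 index bits are used, indices `0x05` and `0x15` select the same entry. A full 256-entry byte table therefore cannot be expressed with one such lookup.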
This library facilitates the generation of SIMD lookup tables from high-level descriptions of byte-to-byte mappings. The goal is to avoid the need to hardcode manually-computed SIMD lookup tables, thus enabling a wider audience to utilize these techniques more easily.
Absolut is essentially a set of procedural macros that accept byte-to-byte mapping descriptions in the form of Rust enums:
```rust
#[absolut::one_hot]
pub enum JsonTable {
    #[matches(b',')]
    Comma,
    #[matches(b':')]
    Colon,
    #[matches(b'[', b']', b'{', b'}')]
    Brackets,
    #[matches(b'\r', b'\n', b'\t')]
    Control,
    #[matches(b' ')]
    Space,
    #[wildcard]
    Other,
}
```
The above `JsonTable` enum encodes the following one-to-one mapping:

| Input | Output |
|---|---|
| 0x2C | Comma |
| 0x3A | Colon |
| 0x5B, 0x5D, 0x7B, 0x7D | Brackets |
| 0x0D, 0x0A, 0x09 | Control |
| 0x20 | Space |
| * | Other |

Where `*` denotes all other bytes not explicitly mapped.
Mapping results needn't be explicitly defined as Absolut will solve for them automatically.
In the previous code snippet, the expression `JsonTable::Space as u8` evaluates to the output byte when performing a table lookup on `0x20`.
Absolut supports multiple techniques, called algorithms, for constructing SIMD lookup tables.
Each algorithm is implemented as a procedural macro that accepts byte-to-byte mappings described using enums with attribute-annotated variants, as illustrated above with the `absolut::one_hot` algorithm.
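As a rough illustration of how a one-hot table can work, the sketch below hand-builds a pair of 16-entry tables for the JSON mapping. It assumes that "one-hot" refers to the common nibble-split scheme: one bit per class, one table indexed by the low nibble, one by the high nibble, and a bitwise AND of the two results. All names here (`CLASSES`, `build_tables`, `lookup`) and the bit assignments are hypothetical and are not Absolut's generated output. In particular, bytes outside every class yield 0 in this sketch, which may differ from the value Absolut assigns to a wildcard variant.

```rust
// One bit per class (assumed assignment, chosen by hand for this sketch).
pub const COMMA: u8 = 1 << 0;
pub const COLON: u8 = 1 << 1;
pub const BRACKETS: u8 = 1 << 2;
pub const CONTROL: u8 = 1 << 3;
pub const SPACE: u8 = 1 << 4;

/// The mapping from the `JsonTable` example, as (class bit, member bytes).
pub const CLASSES: [(u8, &[u8]); 5] = [
    (COMMA, b","),
    (COLON, b":"),
    (BRACKETS, b"[]{}"),
    (CONTROL, b"\r\n\t"),
    (SPACE, b" "),
];

/// Build two 16-entry tables, indexed by the low and high nibble of a byte.
/// A class bit is set in both entries touched by each of its member bytes.
pub fn build_tables(classes: &[(u8, &[u8])]) -> ([u8; 16], [u8; 16]) {
    let (mut low, mut high) = ([0u8; 16], [0u8; 16]);
    for &(bit, bytes) in classes {
        for &b in bytes {
            low[(b & 0x0F) as usize] |= bit;
            high[(b >> 4) as usize] |= bit;
        }
    }
    (low, high)
}

/// Scalar equivalent of two 16-entry shuffles followed by a bitwise AND.
pub fn lookup(low: &[u8; 16], high: &[u8; 16], byte: u8) -> u8 {
    low[(byte & 0x0F) as usize] & high[(byte >> 4) as usize]
}
```

With tables built from `CLASSES`, `lookup` returns `COMMA` for `b','`, `BRACKETS` for `b'{'`, and 0 for a byte such as `b'a'` that belongs to no class. Each table fits in a single 16-entry shuffle, which is what makes the scheme usable with `PSHUFB` or `TBL`.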
If a byte-to-byte mapping cannot be implemented using a given Absolut algorithm (i.e. the table is unsatisfiable), the resulting error messages won't help in understanding why the algorithm failed to solve for the table. Unless the user is at least vaguely familiar with how the algorithm in question works, it will be difficult for them to figure out how to change the mapping so that it becomes satisfiable while staying useful for their purposes.
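One plausible way a mapping can become unsatisfiable is sketched below. It assumes a simple nibble-split scheme in which a class gets its own bit, set in a low-nibble table and a high-nibble table, and a byte is classified by ANDing the two entries. Under that assumption a class also matches every byte that combines a low nibble of one member with a high nibble of another. The helper `false_positives` is hypothetical and only demonstrates the idea; Absolut's actual solver may accept or reject mappings for different reasons.

```rust
/// Returns the bytes that would be misclassified as members of `class` if the
/// class were given its own bit in a low-nibble/high-nibble table pair.
/// (Hypothetical helper for illustration, not part of Absolut.)
pub fn false_positives(class: &[u8]) -> Vec<u8> {
    let mut found = Vec::new();
    for hi in 0u8..16 {
        for lo in 0u8..16 {
            let byte = (hi << 4) | lo;
            // The class bit is set in an entry if any member has that nibble.
            let lo_seen = class.iter().any(|&b| (b & 0x0F) == lo);
            let hi_seen = class.iter().any(|&b| (b >> 4) == hi);
            if lo_seen && hi_seen && !class.contains(&byte) {
                found.push(byte);
            }
        }
    }
    found
}
```

For the bracket class `b"[]{}"` the result is empty, because the four bytes are exactly the combinations of low nibbles `B`, `D` and high nibbles `5`, `7`. A class containing only `0x11` and `0x22` would also match `0x12` and `0x21`, so under this scheme it would have to be split into two classes or the mapping changed.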
Absolut currently does not provide SIMD implementations of lookup routines for the generated lookup tables. However, the library tests contain lookup routines for SSSE3 and NEON.
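For readers who want a starting point, the following is a minimal sketch of a 16-byte lookup routine. It is not taken from Absolut's tests. It assumes a nibble-split table pair (`low` and `high`, both hypothetical parameters supplied by the caller) whose entries are combined with a bitwise AND, uses `PSHUFB` through `std::arch` when SSSE3 is detected at runtime, and falls back to a scalar loop otherwise.

```rust
/// Classify 16 bytes with a pair of 16-entry nibble tables.
/// Uses `PSHUFB` when SSSE3 is available and a scalar loop otherwise.
pub fn lookup16(low: &[u8; 16], high: &[u8; 16], input: &[u8; 16]) -> [u8; 16] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("ssse3") {
            // SAFETY: SSSE3 support was just checked at runtime.
            return unsafe { lookup16_ssse3(low, high, input) };
        }
    }
    lookup16_scalar(low, high, input)
}

/// Portable fallback: one table access per nibble, then AND.
fn lookup16_scalar(low: &[u8; 16], high: &[u8; 16], input: &[u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for (dst, &b) in out.iter_mut().zip(input) {
        *dst = low[(b & 0x0F) as usize] & high[(b >> 4) as usize];
    }
    out
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "ssse3")]
unsafe fn lookup16_ssse3(low: &[u8; 16], high: &[u8; 16], input: &[u8; 16]) -> [u8; 16] {
    use std::arch::x86_64::*;
    let low_tbl = _mm_loadu_si128(low.as_ptr() as *const __m128i);
    let high_tbl = _mm_loadu_si128(high.as_ptr() as *const __m128i);
    let bytes = _mm_loadu_si128(input.as_ptr() as *const __m128i);
    let nibble_mask = _mm_set1_epi8(0x0F);
    // Low nibble of every byte.
    let lo_idx = _mm_and_si128(bytes, nibble_mask);
    // High nibble: there is no 8-bit shift, so shift 16-bit lanes and mask
    // away the bits that leaked in from the neighbouring byte.
    let hi_idx = _mm_and_si128(_mm_srli_epi16::<4>(bytes), nibble_mask);
    // Two 16-entry shuffles, combined with AND.
    let classes = _mm_and_si128(
        _mm_shuffle_epi8(low_tbl, lo_idx),
        _mm_shuffle_epi8(high_tbl, hi_idx),
    );
    let mut out = [0u8; 16];
    _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, classes);
    out
}
```

A NEON version would follow the same shape with a `TBL`-based intrinsic in place of `_mm_shuffle_epi8`. A different table layout would need a different way of combining the two shuffle results.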
Absolut is open-source software licensed under the terms of the MIT License.