diff --git a/README.md b/README.md
index fa5be4a..da8f2aa 100644
--- a/README.md
+++ b/README.md
@@ -8,13 +8,13 @@ crc64fast-nvme
 SIMD-accelerated carryless-multiplication [CRC-64/NVME](https://reveng.sourceforge.io/crc-catalogue/all.htm#crc.cat.crc-64-nvme) checksum computation
 (similar to [crc32fast](https://crates.io/crates/crc32fast) and forked from [crc64fast](https://github.com/tikv/crc64fast) which calculates [CRC-64/XZ](https://reveng.sourceforge.io/crc-catalogue/all.htm#crc.cat.crc-64-xz) [a.k.a `CRC-64/GO-ECMA`]).

-`CRC-64/NVME` comes from the [NVM Express® NVM Command Set Specification](https://nvmexpress.org/wp-content/uploads/NVM-Express-NVM-Command-Set-Specification-1.0d-2023.12.28-Ratified.pdf) (Revision 1.0d, December 2023) and has also been implemented in the [Linux kernel](https://github.com/torvalds/linux/blob/786c8248dbd33a5a7a07f7c6e55a7bfc68d2ca48/lib/crc64.c#L66-L73) (where it's called `CRC-64/Rocksoft`). Note that the Check value in the spec is incorrect (Section 5.2.1.3.4, Figure 120, page 83).
+`CRC-64/NVME` comes from the [NVM Express® NVM Command Set Specification](https://nvmexpress.org/wp-content/uploads/NVM-Express-NVM-Command-Set-Specification-1.0d-2023.12.28-Ratified.pdf) (Revision 1.0d, December 2023) and has also been implemented in the [Linux kernel](https://github.com/torvalds/linux/blob/786c8248dbd33a5a7a07f7c6e55a7bfc68d2ca48/lib/crc64.c#L66-L73) (where it's called `CRC-64/Rocksoft`). Note that the Check value in the spec uses incorrect endianness (Section 5.2.1.3.4, Figure 120, page 83).

 SIMD-accelerated carryless-multiplication is based on the Intel [Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction](https://web.archive.org/web/20131224125630/https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf) paper.
 ## Changes from [crc64fast](https://github.com/tikv/crc64fast)

-Primarily changes the `CRC-64/XZ` (aka `CRC-64/GO-ECMA`) polynomial from [crc64fast](https://github.com/tikv/crc64fast) (which uses the `ECMA-182` polynomial [`0x42F0E1EBA9EA3693`]) to use the `NVME` polynomial (`0xAD93D23594C93659`), plus re-calculates the input parameters (tables, keys, mu, and reciprocal polynomial) for carryless-multiplication.
+Primarily changes the `CRC-64/XZ` (aka `CRC-64/GO-ECMA`) polynomial from [crc64fast](https://github.com/tikv/crc64fast) (which uses the `ECMA-182` polynomial [`0x42F0E1EBA9EA3693`]) to use the `NVME` polynomial (`0xAD93D23594C93659`), plus re-calculates the input parameters (tables, keys, mu, and reciprocal polynomial) for fast operations.

 ## Usage

@@ -29,14 +29,14 @@ assert_eq!(checksum, 0xd9160d1fa8e418e3);
 ```

 ## CLI example
-A simple CLI implementation can be found in [crc_64_nvme_checksum.rs](src\bin\crc_64_nvme_checksum.rs), which will calculate the CRC-64/NVME checksum for a file on disk.
+A simple CLI implementation can be found in [crc_64_nvme_checksum.rs](src\bin\crc_64_nvme_checksum.rs), which will calculate the `CRC-64/NVME` checksum for a file on disk.

 ## Other CRC-64 implementations
-Tooling to re-calculate input parameters for other CRC-64 implementations/polynomials is supplied in [src\bin](src\bin).
+Tooling to re-calculate input parameters for other `CRC-64` implementations/polynomials is supplied in [src\bin](src\bin).

 ## Performance

-`crc64fast` provides two fast implementations, and the most performance one will
+`crc64fast-nvme` provides two fast implementations, and the most performance one will
 be chosen based on CPU feature at runtime.

 * a fast, platform-agnostic table-based implementation, processing 16 bytes at a time.
@@ -44,27 +44,30 @@ be chosen based on CPU feature at runtime.
 * using PCLMULQDQ + SSE 4.1 on x86/x86_64
 * using PMULL + NEON on AArch64 (64-bit ARM)

-| Algorithm | Throughput (x86_64) | Throughput (aarch64) |
-|:------------------|--------------------:|---------------------:|
-| [crc 3.0.1] | 0.5 GiB/s | 0.3 GiB/s |
-| crc64fast (table) | 2.3 GiB/s | 1.8 GiB/s |
-| crc64fast (simd) | 28.2 GiB/s | 20.0 GiB/s |
+| Algorithm | Throughput (x86_64) | Throughput (aarch64) |
+|:-----------------------|--------------------:|---------------------:|
+| [crc 3.0.1] | 0.5 GiB/s | 0.3 GiB/s |
+| crc64fast-nvme (table) | 2.3 GiB/s | 1.8 GiB/s |
+| crc64fast-nvme (simd) | 28.2 GiB/s | 20.0 GiB/s |

 [crc 3.0.1]: https://docs.rs/crc/3.0.1/crc/index.html

-## TODO
+## References

-This crate is mainly intended for use in TiKV only.
-Features beyond AArch64 are unlikely to be implemented.
-
-* [x] AArch64 support based on PMULL
-* [ ] `no_std` support
-* [x] Fuzz test
-* [ ] Custom polynomial
+* [crc32-fast](https://crates.io/crates/crc32fast) - Original `crc32` implementation in Rust.
+* [crc64-fast](https://github.com/tikv/crc64fast) - Original `CRC-64/XZ` implementation in Rust (from which this project was forked).
+* [Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction](https://web.archive.org/web/20131224125630/https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf) - Intel's paper.
+* [NVM Express® NVM Command Set Specification](https://nvmexpress.org/wp-content/uploads/NVM-Express-NVM-Command-Set-Specification-1.0d-2023.12.28-Ratified.pdf) - The NVMe spec, including `CRC-64-NVME` (with incorrect endian Check value).
+* [CRC-64/NVME](https://reveng.sourceforge.io/crc-catalogue/all.htm#crc.cat.crc-64-nvme) - The `CRC-64/NVME` quick definition.
+* [Linux implementation](https://github.com/torvalds/linux/blob/786c8248dbd33a5a7a07f7c6e55a7bfc68d2ca48/lib/crc64.c) - Linux implementation of `CRC-64/NVME`.
+* [C++ artifacts implementation](https://github.com/jeffareid/crc/blob/master/crc64r/crc64rg.cpp) - Inspiration C++ for the Rust code in [calculate_pclmulqdq_artifacts.rs](src\bin\calculate_pclmulqdq_artifacts.rs).
+* [Intel isa-l GH issue #88](https://github.com/intel/isa-l/issues/88) - Additional insight into generating artifacts.
+* [StackOverflow PCLMULQDQ CRC32 answer](https://stackoverflow.com/questions/71328336/fast-crc-with-pclmulqdq-not-reflected/71329114#71329114) - Insightful answer to implementation details for CRC32.
+* [StackOverflow PCLMULQDQ CRC32 question](https://stackoverflow.com/questions/21171733/calculating-constants-for-crc32-using-pclmulqdq) - Insightful question & answer to CRC32 implementation details.

 ## License

-crc64fast is dual-licensed under
+`crc64fast-nvme` is dual-licensed under

 * Apache 2.0 license ([LICENSE-Apache](./LICENSE-Apache) or )
 * MIT license ([LICENSE-MIT](./LICENSE-MIT) or )
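
Review note: the README change above swaps in the `NVME` polynomial (`0xAD93D23594C93659`) and flags the spec's Check value as an endianness problem. A slow bit-at-a-time reference can help when cross-checking such values. The sketch below is not the crate's API and not the table or SIMD code path; the names `NVME_POLY` and `crc64_nvme_bitwise` are hypothetical. It assumes the parameters listed in the CRC catalogue entry for `CRC-64/NVME` (reflected input and output, all-ones initial value and final XOR).

```rust
/// CRC-64/NVME polynomial in normal (MSB-first) form, as quoted in the README.
/// The constant name is hypothetical, chosen for this sketch only.
const NVME_POLY: u64 = 0xAD93_D235_94C9_3659;

/// Bit-at-a-time reference CRC-64/NVME (hypothetical helper, not the crate's API).
/// Assumes the catalogue parameters: refin/refout = true, init = xorout = all ones.
fn crc64_nvme_bitwise(data: &[u8]) -> u64 {
    // A reflected CRC shifts right, so the polynomial is used bit-reversed.
    let poly = NVME_POLY.reverse_bits();
    let mut crc = u64::MAX; // initial value: all ones
    for &byte in data {
        crc ^= u64::from(byte);
        for _ in 0..8 {
            // Shift out one bit; fold in the polynomial when that bit was set.
            crc = if crc & 1 == 1 { (crc >> 1) ^ poly } else { crc >> 1 };
        }
    }
    !crc // final XOR with all ones
}
```

For the catalogue's standard input `123456789`, this should return `0xae8b14860a799888`, the Check value given in the CRC catalogue entry linked in the README. `u64::swap_bytes` on that result gives the opposite byte order, which is one way to compare against a value printed with different endianness.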