Status: November 2024
The 5.1.0 release is an optimization release, giving moderate performance improvements on all platforms. There are no image quality differences.
- General:
  - Feature: Added a new CMake build option to control use of native gathers, as they can be slower than scalar loads on some common x86 microarchitectures. Build with `-DASTCENC_X86_GATHERS=OFF` to disable use of native gathers in AVX2 builds.
  - Optimization: Added new `gather()` abstraction for gathers using byte indices, allowing implementations without gather hardware to skip the byte-to-int index conversion.
  - Optimization: Optimized `compute_lowest_and_highest_weight()` to pre-compute min/max outside of the main loop.
  - Optimization: Added improved intrinsics sequence for SSE and AVX2 integer `hmin()` and `hmax()`.
  - Optimization: Added improved intrinsics sequence for `vint4(uint8_t*)` on systems implementing Arm SVE.
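
The byte-index gather change can be illustrated with a minimal scalar sketch. This is not the astcenc implementation: the `Vec4f` type and the `gather_int_indices()` and `gather_byte_indices()` names are hypothetical and chosen only for illustration. It assumes a 4-lane vector and shows why a fallback without gather hardware has no need to widen byte indices into an integer vector first.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4-lane float vector for illustration; not the astcenc vfloat4 type.
struct Vec4f
{
    std::array<float, 4> lane;
};

// Gather using 32-bit indices, the form a hardware gather instruction expects.
// Callers holding byte indices must widen them to int32_t before calling this.
inline Vec4f gather_int_indices(const float* base, const std::array<int32_t, 4>& idx)
{
    return Vec4f{{ base[idx[0]], base[idx[1]], base[idx[2]], base[idx[3]] }};
}

// Gather using byte indices read straight from memory.
// A scalar fallback indexes the table with each byte directly, so no
// widened index vector is ever built.
inline Vec4f gather_byte_indices(const float* base, const uint8_t* idx)
{
    return Vec4f{{ base[idx[0]], base[idx[1]], base[idx[2]], base[idx[3]] }};
}
```

Both functions return the same lanes for the same indices. In this sketch the only difference is that the byte-index form leaves the index conversion to the implementation, which can skip it when it loads each element with a scalar read.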
Binary release sha256 checksums
a9f61f954f8c6f75675b9b8187454a3b1328c172cb62747715a30937bb8fe7bb astcenc-5.1.0-linux-x64.zip
4ecb330104e05a35febfc0d83b0c128ee2f51b671577e9952ca4278b5455197b astcenc-5.1.0-macos-universal.zip
ff01a6f988aaf5d7f1a139d2cff4c1988f70252b6f53fa3ba72888cc732ba154 astcenc-5.1.0-windows-arm64.zip
c3e5954b9f263e4e13d92892a9e9ef4728abf4acffbf013bb466dd97cc2a0a15 astcenc-5.1.0-windows-x64.zip