Releases: ARM-software/astc-encoder

5.1.0

15 Nov 11:44

Status: November 2024

The 5.1.0 release is an optimization release, giving moderate performance improvements on all platforms. There are no image quality differences.

  • General:
    • Feature: Added a new CMake build option to control use of native gathers, as they can be slower than scalar loads on some common x86 microarchitectures. Build with -DASTCENC_X86_GATHERS=OFF to disable use of native gathers in AVX2 builds.
    • Optimization: Added new gather() abstraction for gathers using byte indices, allowing implementations without gather hardware to skip the byte-to-int index conversion.
    • Optimization: Optimized compute_lowest_and_highest_weight() to pre-compute min/max outside of the main loop.
    • Optimization: Added improved intrinsics sequence for SSE and AVX2 integer hmin() and hmax().
    • Optimization: Added improved intrinsics sequence for vint4(uint8_t*) on systems implementing Arm SVE.
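
As a rough illustration of the byte-index gather idea, the following C++ sketch shows a scalar fallback. The function name gather_from_bytes and the fixed four-lane layout are assumptions made for this example, not the library's actual vector API.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar fallback for a 4-lane gather using byte indices.
// Each uint8_t index is used directly as an array subscript, so there is no
// separate step that widens the indices into a vector of 32-bit integers.
inline std::array<float, 4> gather_from_bytes(const float* base, const std::uint8_t* indices)
{
    return {{
        base[indices[0]],
        base[indices[1]],
        base[indices[2]],
        base[indices[3]]
    }};
}
```

A hardware gather typically needs the indices in 32-bit lanes, so only that path would pay for the conversion.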

Binary release sha256 checksums

a9f61f954f8c6f75675b9b8187454a3b1328c172cb62747715a30937bb8fe7bb  astcenc-5.1.0-linux-x64.zip
4ecb330104e05a35febfc0d83b0c128ee2f51b671577e9952ca4278b5455197b  astcenc-5.1.0-macos-universal.zip
ff01a6f988aaf5d7f1a139d2cff4c1988f70252b6f53fa3ba72888cc732ba154  astcenc-5.1.0-windows-arm64.zip
c3e5954b9f263e4e13d92892a9e9ef4728abf4acffbf013bb466dd97cc2a0a15  astcenc-5.1.0-windows-x64.zip

5.0.0

31 Oct 21:59
2ff200e

Status: November 2024

The 5.0.0 release is the first stable release in the 5.x series. The main new feature over the 4.x series is support for the Arm Scalable Vector Extensions (SVE) SIMD instruction set, with both 128-bit and 256-bit backends provided. This gives up to 60% performance improvement on Neoverse V2 with its 256-bit SVE implementation.

  • General:
    • Bug fix: Fixed incorrect return type in "None" vector library reference implementation.
    • Bug fix: Fixed sincos table index under/overflow.
    • Feature: Changed ASTCENC_ISA_NATIVE builds to use -march=native and -mcpu=native.
    • Feature: Added backend for Arm SVE fixed-width 256-bit builds. These can only run on hardware implementing 256-bit SVE.
    • Feature: Added backend for Arm SVE 128-bit builds. These are portable builds and can run on hardware implementing any SVE vector length, but the explicit SVE use only augments the NEON code and will only use the bottom 128 bits of each SVE vector.
    • Feature: Optimized NEON mask any() and all() functions.
    • Feature: Migrated build and test to GitHub Actions pipelines.

Binary release sha256 checksums

70183c4346f9fc0f55cd8e3ca5b326cbb4675233b026c02df60657e344c44cc0  astcenc-5.0.0-linux-x64.zip
b8e450250932f07765d868318709964da2a36d41be0538f5c018d8fc72a41e70  astcenc-5.0.0-macos-universal.zip
dbaf1a1329f6fd909457b43e7872092e979ba8ebd9bdf9dbbd950069fa533124  astcenc-5.0.0-windows-arm64.zip
6c22b89f3d437d457c45036c8297ce9cec400f8bdfa019eed71e3b8d343f5ebb  astcenc-5.0.0-windows-x64.zip

4.8.0

07 May 14:09

Status: May 2024

The 4.8.0 release is a minor maintenance release.

Reminder - the codec library API is not designed to be binary compatible across versions. We always recommend rebuilding your client-side code using the updated astcenc.h header.

  • General:
    • Bug fix: Native builds on macOS will now correctly build for arm64 when run outside of Rosetta on an Apple silicon device.
    • Bug fix: Multiple small improvements to remove use of undefined language behavior, to improve support for deployment using Emscripten.
    • Feature: Builds using Clang can now build with undefined behavior sanitizer by setting -DASTCENC_UBSAN=ON on the CMake configure line.
    • Feature: Updated to Wuffs library 0.3.4, which ignores tRNS alpha chunks for type 4 (LA) and 6 (RGBA) PNGs, to improve compatibility with libpng.

Binary release sha256 checksums

dfd7fb8056aeb1c9fa19d8c101f9a6a710ffa2000e62d5a536d4a7ca06c81c8d  astcenc-4.8.0-linux-x64.zip
26b82cac7a41c99c4a5789df5025d1bc42a02067be4670a5f4639e176083cee2  astcenc-4.8.0-macos-universal.zip
cdf073b42c86e766959e0c363b84b16ca4908b0d1a6304eafa6ae8d32b2dd393  astcenc-4.8.0-windows-arm64.zip
a821d4c58fa5bb5ecf421aabbdb514f0083baea0c0bc0e6d10fba65fe3ceb7b2  astcenc-4.8.0-windows-x64.zip

4.7.0

11 Jan 21:54

Status: January 2024

The 4.7.0 release is a major maintenance release, fixing rounding behavior in the decompressor to match the Khronos specification. This fix includes the addition of explicit support for optimizing for decode_unorm8 rounding.

Reminder - the codec library API is not designed to be binary compatible across versions. We always recommend rebuilding your client-side code using the updated astcenc.h header.

  • General:
    • Bug fix: sRGB LDR decompression now uses the correct endpoint expansion method to create the 16-bit RGB endpoint colors, and removes the previous correction code from the interpolation function. This bug could result in LSB bit flips relative to the standard specification.
    • Bug fix: Decompressing to an 8-bit per component output image now matches the decode_unorm8 extension rounding rules. This bug could result in LSB bit flips relative to the standard specification.
    • Bug fix: Code now avoids using alignas() in the reference C implementation, as the default alignas(16) is narrower than the native minimum alignment requirement on some CPU architectures.
    • Feature: Library configuration supports a new flag, ASTCENC_FLG_USE_DECODE_UNORM8. This flag indicates that the image will be used with the decode_unorm8 decode mode. When set during compression this allows the compressor to use the correct rounding when determining the best encoding.
    • Feature: Command line tool supports a new option, -decode_unorm8. This option indicates that the image will be used with the decode_unorm8 decode mode. This option will automatically be set for decompression (-d*) and trial (-t*) tool operation if the decompressed output image is stored to an 8-bit per component file format. This option must be set manually for compression (-c*) tool operation, as the desired decode mode cannot be reliably determined.
    • Feature: Library configuration supports a new optional progress reporting callback. This is called during compression to allow interactive tooling use cases to display incremental progress. The command line tool uses this feature to show compression progress unless -silent is used.
    • Feature: Prebuilt Linux binary releases updated to use Clang 14 (previously Clang 9) which gives a small performance improvement.
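
The LSB differences mentioned in the bug fixes above come down to how a 16-bit intermediate value is reduced to 8 bits. The C++ sketch below compares two reductions, assuming that decode_unorm8 keeps the top 8 bits of the 16-bit result while a generic UNORM16 to UNORM8 conversion rounds to nearest. The function names are hypothetical and this is not the codec's implementation.

```cpp
#include <cstdint>

// Assumed decode_unorm8-style reduction: keep the top 8 bits of the 16-bit value.
inline std::uint8_t unorm16_to_unorm8_top_bits(std::uint16_t v)
{
    return static_cast<std::uint8_t>(v >> 8);
}

// Generic conversion: rescale from [0, 65535] to [0, 255], rounding to nearest.
inline std::uint8_t unorm16_to_unorm8_rounded(std::uint16_t v)
{
    return static_cast<std::uint8_t>((static_cast<std::uint32_t>(v) * 255u + 32767u) / 65535u);
}
```

For the input 0x00FF the first function returns 0 and the second returns 1, a one-LSB disagreement. A compressor that scores candidate encodings with the wrong rule can therefore pick an encoding that is not the best one for the decode mode actually in use.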

Binary release sha256 checksums

4f596546808c58f2b7e0271302d05df40159cdc7c47645cc96f331229d22ab66  astcenc-4.7.0-linux-x64.zip
176b2c0bf2673d15eb1ba11c1e2691ccb041b79e4076fccd6cd0ece3f75aa2bc  astcenc-4.7.0-macos-universal.zip
092bc9023195c9ccef811c97f1e0746e0f803a5397e5a18773566615b077914d  astcenc-4.7.0-windows-arm64.zip
deb20ea0cb4ef522ca1eecee82358bfc9cdafd8d4101bc22f28502a0165180bd  astcenc-4.7.0-windows-x64.zip

4.6.1

24 Nov 20:04
aeece2f

Status: November 2023

The 4.6.1 release is a minor maintenance release to fix a performance scaling issue on large core count Windows systems. No other performance or image changes are expected for this release.

  • General:
    • Optimization: Windows builds of the astcenc command line tool can now use more than 64 cores on large core count systems. This change doubles command line performance for -exhaustive compression when testing on a 96 core/192 thread system.
    • Feature: Windows Arm64 native builds of the astcenc command line tool are now included in the prebuilt release binaries.

Binary release sha256 checksums

e360aeabf3b5aeda6a7cfabddc49af8b204e28befa04ab8e8942c85620ba071a  astcenc-4.6.1-linux-x64.zip
40f19df27799f6f2ad6890c147165f8e077ff6547be57b02d7949677d3f1ea9e  astcenc-4.6.1-macos-universal.zip
92cd085b6a2f8f748fd384f2dc8e3c977756ffbe13e263729695cd18233b55a8  astcenc-4.6.1-windows-arm64.zip
0c4ba7af8b5ec22e9bd4f4173866985d2d10b5be137753171481b9629054e38e  astcenc-4.6.1-windows-x64.zip

4.6.0

07 Nov 12:30

Status: November 2023

The 4.6.0 release is a minor release with a few code quality improvements, and a small performance boost.

Reminder - the codec library API is not designed to be binary compatible across versions. We always recommend rebuilding your client-side code using the updated astcenc.h header.

  • General:
    • Bug-fix: Fixed context allocation for contexts allocated with the ASTCENC_FLG_DECOMPRESS_ONLY flag.
    • Bug-fix: Reduced use of reinterpret_cast in the core codec to avoid strict aliasing violations.
    • Optimization: -medium search quality no longer tests 4 partition encodings for block sizes between 25 and 83 texels (inclusive). This improves performance for a tiny drop in image quality.
    • Optimization: -thorough and higher search qualities no longer test the mode0 first search for block sizes between 25 and 83 texels (inclusive). This improves performance for a tiny drop in image quality.
    • Optimization: TUNE_MAX_PARTITIONING_CANDIDATES reduced from 32 to 8 to reduce the size of stack allocated data structures. This causes a tiny drop in image quality for the -verythorough and -exhaustive presets.
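
To make the "between 25 and 83 texels (inclusive)" range concrete, the C++ sketch below checks the 2D ASTC block footprints against it. The type and function names are hypothetical, and 3D block sizes are left out for brevity.

```cpp
#include <vector>

// Width and height of a 2D block footprint, in texels.
struct BlockSize { int x; int y; };

// The 2D block footprints defined by the ASTC format.
inline std::vector<BlockSize> astc_2d_block_sizes()
{
    return {{4, 4}, {5, 4}, {5, 5}, {6, 5}, {6, 6}, {8, 5}, {8, 6},
            {8, 8}, {10, 5}, {10, 6}, {10, 8}, {10, 10}, {12, 10}, {12, 12}};
}

// True if the block's texel count lies in the inclusive 25..83 range.
inline bool in_reduced_search_range(BlockSize b)
{
    const int texels = b.x * b.y;
    return texels >= 25 && texels <= 83;
}
```

Under this check the range covers 5x5 (25 texels) up to 10x8 (80 texels). It excludes 4x4 and 5x4 at the small end, and 10x10, 12x10, and 12x12 at the large end.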

Binary release sha256 checksums

321229025183e9f8f1cdb766b1a036da33a192d56e46bdf8b44295759c36fc9e  astcenc-4.6.0-linux-x64.zip
19adae19a7a46b05739fe9285dbac6c960e08504780890bbef4f8eca6663e0d7  astcenc-4.6.0-macos-universal.zip
a140454aa6c2dee29e85a1dc162430a6914123aea62546bfad9885ca336bff24  astcenc-4.6.0-windows-x64.zip

4.5.0

20 Jun 12:58

Status: June 2023

The 4.5.0 release is a minor release with minor image quality improvements, and multiple build system quality-of-life improvements.

  • General:
    • Bug-fix: Improved handling of compiler arguments in CMake, including consistent use of MSVC-style command line arguments for ClangCL.
    • Bug-fix: Invariant Clang builds now use -ffp-model=precise with -ffp-contract=off which is needed to restore invariance due to recent changes in compiler defaults.
    • Change: macOS binary releases are now distributed as a single universal binary for all platforms.
    • Change: Windows binary releases are now compiled with VS2022.
    • Change: Invariant MSVC builds for VS2022 now use /fp:precise instead of /fp:strict, which is now possible because precise no longer implies contraction. This should improve performance for MSVC builds.
    • Change: Non-invariant Clang builds now use -ffp-model=precise with -ffp-contract=on. This should improve performance on older Clang versions which defaulted to no contraction.
    • Change: Non-invariant MSVC builds for VS2022 now use /fp:precise with /fp:contract. This should improve performance for MSVC builds.
    • Change: CMake config variables now use an ASTCENC_ prefix to add a namespace and group options when the library is used in a larger project.
    • Change: CMake config ASTCENC_UNIVERSAL_BUILD for building macOS universal binaries has been improved to include the x86_64h slice for AVX2 builds. Universal builds are now on by default for macOS, and always include NEON (arm64), SSE4.1 (x86_64), and AVX2 (x86_64h) variants.
    • Change: CMake config ASTCENC_NO_INVARIANCE has been inverted to remove the negated option, and is now ASTCENC_INVARIANCE with a default of ON. Disabling this option can substantially improve performance, but images can differ across platforms and compilers.
    • Optimization: Color quantization and packing for LDR RGB and RGBA has been vectorized to improve performance.
    • Change: Color quantization for LDR RGB and RGBA endpoints will now try multiple quantization packing methods, and pick the one with the lowest endpoint encoding error. This gives a minor image quality improvement, for no significant performance impact when combined with the vectorization optimizations.
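
The link between floating-point contraction and invariance can be shown with a small C++ sketch. It compares a multiply followed by an add (two roundings) with a fused multiply-add (one rounding). The function names are hypothetical and the code is only an illustration, not taken from the codec.

```cpp
#include <cmath>

// Unfused: the product is rounded to float before the add. The volatile store
// keeps the compiler from contracting the two operations into a single FMA.
inline float mul_add_unfused(float a, float b, float c)
{
    volatile float product = a * b;
    return product + c;
}

// Fused: std::fma rounds once, after the exact product has been added to c.
inline float mul_add_fused(float a, float b, float c)
{
    return std::fma(a, b, c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the unfused version returns 0 while the fused version returns 2^-24. If one compiler contracts an expression and another does not, the same source can produce different results, which is why the invariant builds disable contraction.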

Binary release sha256 checksums

fe2a1e5c8e57fc77175c6e9d0e1a10e583816507c82c748c568694bf39ae9f57  astcenc-4.5.0-linux-x64.zip
bc8895222820106135575b7dd1bef4d9f184be7ef7de6e684ab563b64c22d163  astcenc-4.5.0-macos-universal.zip
7c63f167558c65e607f72a4dc30b86d631ae3b561cffaee9d2afb5d5149d72e0  astcenc-4.5.0-windows-x64.zip

4.4.0

31 Mar 16:50

Status: March 2023

The 4.4.0 release is a minor release with image quality improvements, a small performance boost, a few new quality-of-life features, and a few minor fixes for uncommon build configurations.

  • General:
    • Change: Core library no longer checks availability of required instruction set extensions, such as SSE4.1 or AVX2. Checking compatibility is now the responsibility of the caller. See astcenccli_entry.cpp for an example of code performing this check.
    • Change: Core library can be built as a shared object by setting the -DSHAREDLIB=ON CMake option, resulting in e.g. libastcenc-avx2-shared.so. Note that the command line tool is always statically linked.
    • Change: Decompressed 3D images will now write one output file per slice, if the target format is a 2D image format, rather than just writing slice zero.
    • Change: Command line tool errors print to stderr instead of stdout.
    • Change: Color encoding uses new quantization tables that factor in floating-point rounding if a distance tie is found when using the integer quant256 value. This improves image quality for 4x4 and 5x5 block sizes.
    • Optimization: Partition selection uses a simplified line calculation with a faster approximation. This improves performance for all block sizes.
    • Bug-fix: Fixed missing symbol error in dead code for decompressor-only builds.
    • Bug-fix: Fixed infinity handling in debug trace JSON files.
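
Because the core library no longer checks instruction set availability, a caller may want a run-time check before using an AVX2 build. The C++ sketch below shows one possible approach using the GCC/Clang builtin __builtin_cpu_supports. The helper name is hypothetical, and this is not the code in astcenccli_entry.cpp.

```cpp
// Hypothetical helper: report whether the running CPU supports AVX2.
inline bool cpu_supports_avx2()
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    // GCC and Clang builtin that queries the CPU feature set at run time.
    return __builtin_cpu_supports("avx2") != 0;
#else
    // Other compilers or non-x86 targets: conservatively report no support.
    return false;
#endif
}
```

A caller could run this check once at startup and fall back to a build with a lower instruction set requirement, or exit with an error, if it returns false.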

Binary release sha256 checksums

98267b6e23f188658de1e275816bad6bf1e9fe3ae113a1bac4109bf7ee75d579  astcenc-4.4.0-linux-x64.zip
d88c90c82f0e5cf15b9fae7582da394e5855a7cdeee776887351b07f5b6d21ea  astcenc-4.4.0-macos-aarch64.zip
8f97f78dd9bedc8a21cfdafdd712ee1f8ec29414804c4e38bd630f4c8894517f  astcenc-4.4.0-macos-x64.zip
e5407a8d7c4a0355aa1f0614aee21f3407c7f10cac2349bf3e57e8aaa4a21d53  astcenc-4.4.0-windows-x64.zip

4.3.1

30 Jan 22:58

Status: January 2023

The 4.3.1 release is a minor maintenance release. No performance or image quality changes are expected.

  • General:
    • Bug-fix: Fixed typo in -2/3/4partitioncandidatelimit CLI options.
    • Bug-fix: Fixed handling for -3/4partitionindexlimit CLI options.
    • Bug-fix: Updated to stb_image.h v2.28, which includes multiple fixes for image loading.

Binary release sha256 checksums

93ccb1c96493a9066487f641c2bc5eca87c7ec58b2270f7645af95db795dda7f  astcenc-4.3.1-linux-x64.zip
ddd548181b5beda535171c6eda6280ff99e175868cc75e22efc65bbcccd37fa9  astcenc-4.3.1-macos-aarch64.zip
eaa425ad34a0455960bb660271fe853ba374543adf6c22823dbaa96d8aff7593  astcenc-4.3.1-macos-x64.zip
95bbd32112fcf21ef17927dcc4d446b24bfc5838efd9c4d8bb34edbfb85849e8  astcenc-4.3.1-windows-x64.zip

4.3.0

18 Jan 08:44
ec83dda

Status: January 2023

The 4.3.0 release is an optimization release. There are minor performance improvements, minor image quality improvements, significant memory footprint improvements, and library interface changes in this release.

Reminder - the codec library API is not designed to be binary compatible across versions. We always recommend rebuilding your client-side code using the updated astcenc.h header.

  • General:
    • Bug-fix: Use lower case windows.h include for MinGW compatibility.
    • Change: The -mask command line option, ASTCENC_FLG_MAP_MASK in the library API, has been removed.
    • Optimization: Always skip blue-contraction for QUANT_256 encodings. This gives a small image quality improvement for images using the 4x4 block size.
    • Optimization: Always skip RGBO vector calculation for LDR encodings.
    • Optimization: Defer color packing and scrambling to physical layer.
    • Optimization: Remove folded decimation_info lookup tables. This significantly reduces compressor memory footprint and improves context creation time. Impact increases with block size.
    • Optimization: Increased trial and refinement pruning by using stricter target errors when determining whether to skip iterations.

Binary release sha256 checksums

7d74df06ec5e186f1ae2c3718df4eac649f80b0d3e38bcd1378a8803574e5459  astcenc-4.3.0-linux-x64.zip
1e3ad1c07f234c97d2668639910fc032677b5a7716425e478cba219708cc6a65  astcenc-4.3.0-macos-aarch64.zip
fa6d1322c59c8b1cdbfbf607bdfe546d6325c6fe067c740f704e026efdacbe43  astcenc-4.3.0-macos-x64.zip
e3b648bcbbc016f67091aef5563983cafeffc8c3feb3ef6a8d4e988fabce152a  astcenc-4.3.0-windows-x64.zip