The Neuron 2.21.1 release pins the Transformers NeuronX dependency to transformers<4.48 and fixes DMA abort errors on Trn2.
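To see whether an existing environment already satisfies the new pin, a check along the following lines may help. It is a minimal sketch, not part of the Neuron SDK: the helper names (`release_tuple`, `satisfies_pin`, `check_installed`) are hypothetical, and the version comparison is a simplified stand-in for full PEP 440 handling that only looks at the leading numeric components.

```python
import re
from importlib import metadata

# Upper bound taken from the release note: transformers<4.48
PIN_UPPER_BOUND = (4, 48)


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric components of a version string as integers."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies_pin(version: str, upper: tuple[int, ...] = PIN_UPPER_BOUND) -> bool:
    """True if `version` is strictly below `upper` (simplified, numeric-only comparison)."""
    return release_tuple(version)[: len(upper)] < upper


def check_installed(package: str = "transformers"):
    """Return True/False for the installed package, or None if it is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return satisfies_pin(installed)


if __name__ == "__main__":
    result = check_installed()
    if result is None:
        print("transformers is not installed in this environment")
    elif result:
        print("installed transformers satisfies transformers<4.48")
    else:
        print("installed transformers violates transformers<4.48")
```

Comparing integer tuples rather than raw strings matters here: a plain string comparison would wrongly rank 4.9 above 4.48.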
This release also brings NxD Core and NxD Training improvements, including fixes for sequence parallel support in quantized models and a new flag for dtype control in Llama3/3.1 70B configurations. See NxD Training Release Notes (neuronx-distributed-training) for details.
The NxD Inference update includes minor bug fixes for sampling parameters. See NxD Inference Release Notes.
Neuron-supported DLAMIs and DLCs have been updated to the Neuron 2.21.1 SDK. Users should be aware of an incompatibility between Tensorflow-Neuron 2.10 (Inf1) and Neuron Runtime 2.21 in DLAMIs, which will be addressed in the next minor release. See Neuron DLAMI Release Notes.
The Neuron Compiler includes bug fixes and performance enhancements specifically targeting the Trn2 platform.