Neuron SDK Release 2.20.2
Release notes for Neuron SDK Release 2.20.2

---------

Co-authored-by: Jeffrey Huynh <[email protected]>
Co-authored-by: Nathan Mailhot <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>
Co-authored-by: Mounik Chinthapanti <[email protected]>
Co-authored-by: Nathan Mailhot <[email protected]>
Co-authored-by: Ryan King <[email protected]>
Co-authored-by: musunita <[email protected]>
Co-authored-by: Vikas Paliwal <[email protected]>
Co-authored-by: Pradeep Roy <[email protected]>
Co-authored-by: Esha Lakhotia <[email protected]>
Co-authored-by: Nicholas Waldron <[email protected]>
Co-authored-by: Roopnath <[email protected]>
13 people committed Nov 21, 2024
1 parent 79a71b5 commit a574acb
Showing 13 changed files with 164 additions and 16 deletions.
2 changes: 1 addition & 1 deletion conf.py
@@ -157,7 +157,7 @@

#top_banner_message="<span>&#9888;</span><a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/setup-troubleshooting.html#gpg-key-update'> Neuron repository GPG key for Ubuntu installation has expired, see instructions how to update! </a>"

top_banner_message="Neuron 2.20.1 is released! check <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release'> What's New </a> and <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/announcements/index.html'> Announcements </a>"
top_banner_message="Neuron 2.20.2 is released! check <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release'> What's New </a> and <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/announcements/index.html'> Announcements </a>"

html_theme = "sphinx_book_theme"
html_theme_options = {
7 changes: 7 additions & 0 deletions release-notes/containers/neuron-dlc.rst
@@ -7,11 +7,18 @@ Neuron DLC Release Notes
:local:
:depth: 1

Neuron 2.20.2
-------------
Date: 11/20/2024

- Neuron 2.20.2 DLC fixes a dependency bug for the NxDT use case by pinning the correct torch version.


Neuron 2.20.1
-------------

Date: 10/25/2024

- Neuron 2.20.1 DLC includes prerequisites for :ref:`nxdt_installation_guide`. Customers can expect to use NxDT out of the box.


10 changes: 10 additions & 0 deletions release-notes/containers/neuron-k8.rst
@@ -35,6 +35,16 @@ To Pull the Images from ECR:
docker pull public.ecr.aws/neuron/neuron-scheduler:2.x.y.z


Neuron K8 release [2.22.20.0]
=============================

Date: 11/20/2024

Bug fixes
---------

- This release addresses a stability issue in the Neuron Scheduler Extension that previously caused crashes shortly after installation.

Neuron K8 release [2.22.4.0]
============================

27 changes: 22 additions & 5 deletions release-notes/index.rst
@@ -11,6 +11,23 @@ What's New
.. _neuron-2.20.0-whatsnew:


Neuron 2.20.2 (11/20/2024)
---------------------------

The Neuron 2.20.2 release fixes a stability issue in the Neuron Scheduler Extension that previously caused crashes in Kubernetes (K8) deployments. See :ref:`neuron-k8-rn`.

This release also includes a security patch for the Neuron Driver that fixes a kernel address leak issue.
See :ref:`neuron-driver-release-notes` and :ref:`neuron-runtime-rn` for details.

Additionally, the Neuron 2.20.2 release updates the ``torch-neuronx`` and ``libneuronxla`` packages to add support for the ``torch-xla`` 2.1.5 package,
which fixes checkpoint loading issues with the Zero Redundancy Optimizer (ZeRO-1). See :ref:`torch-neuronx-rn` and :ref:`libneuronxla-rn`.

Neuron-supported DLAMIs and DLCs are updated with this release (Neuron 2.20.2 SDK). The Training DLC is also updated to address the
version dependency issues in the NxD Training library. See :ref:`neuron-dlc-release-notes`.

The NxD Training library in the Neuron 2.20.2 release is updated to the transformers 4.36.0 package. See :ref:`neuronx-distributed-training-rn`.


Neuron 2.20.1 (10/25/2024)
---------------------------

@@ -399,27 +416,27 @@ Release Artifacts
Trn1 packages
^^^^^^^^^^^^^^

.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=trn1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=trn1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.2

Inf2 packages
^^^^^^^^^^^^^^

.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf2 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf2 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.2

Inf1 packages
^^^^^^^^^^^^^^

.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.2

Supported Python Versions for Inf1 packages
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. program-output:: python3 src/helperscripts/n2-helper.py --list=pyversions --instance=inf1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
.. program-output:: python3 src/helperscripts/n2-helper.py --list=pyversions --instance=inf1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.2

Supported Python Versions for Inf2/Trn1 packages
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. program-output:: python3 src/helperscripts/n2-helper.py --list=pyversions --instance=inf2 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
.. program-output:: python3 src/helperscripts/n2-helper.py --list=pyversions --instance=inf2 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.2

Supported Numpy Versions
^^^^^^^^^^^^^^^^^^^^^^^^
9 changes: 9 additions & 0 deletions release-notes/libneuronxla/index.rst
@@ -15,6 +15,15 @@ the `PJRT <https://openxla.org/xla/pjrt_integration>`__ runtime, built using
the `PJRT C-API plugin <https://github.com/openxla/xla/blob/5564a9220af230c6c194e37b37938fb40692cfc7/xla/pjrt/c/docs/pjrt_integration_guide.md>`__
mechanism.

Release [2.0.5347.0]
--------------------
Date: 11/20/2024

Summary
~~~~~~~

Adds support for torch-xla 2.1.5, which fixes the "list index out of range" error that occurs when loading Zero Redundancy Optimizer (ZeRO-1) checkpoints.

Release [2.0.4986.0]
--------------------
Date: 10/25/2024
@@ -10,14 +10,25 @@ NxD Training Release Notes (``neuronx-distributed-training``)

This document lists the release notes for Neuronx Distributed Training library.

.. _neuronx-distributed-rn-1-0-0:
.. _neuronx-distributed-training-rn-1-0-1:

Neuronx Distributed Training [1.0.1]

Date: 11/20/2024

Features in this release
------------------------

* Added support for transformers 4.36.0

.. _neuronx-distributed-training-rn-1-0-0:

Neuronx Distributed Training [1.0.0]

Date: 09/16/2024

Features this release
---------------------
Features in this release
------------------------

This is the first release of NxD Training (NxDT). NxDT is a PyTorch-based library that adds support for a user-friendly distributed training experience through a YAML configuration file compatible with NeMo, allowing users to easily set up their training workflows. At the same time, NxDT maintains flexibility, enabling users to choose between using the YAML configuration file, PyTorch Lightning Trainer, or writing their own custom training script using the NxD Core.
The library supports PyTorch model classes including Hugging Face and Megatron-LM. Additionally, it leverages NeMo's data engineering and data science modules, enabling end-to-end training workflows on NxDT and providing compatibility with NeMo through minimal changes to the YAML configuration file for models that are already supported in NxDT. Furthermore, the functionality of the Neuron NeMo Megatron (NNM) library is now part of NxDT, ensuring a smooth migration path from NNM to NxDT.
15 changes: 15 additions & 0 deletions release-notes/prev/content.rst
@@ -7,6 +7,21 @@ Previous Releases Artifacts (Neuron 2.x)
:local:
:depth: 1

Neuron 2.20.1 (10/25/2024)
---------------------------

Trn1 packages
^^^^^^^^^^^^^
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=trn1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1

Inf2 packages
^^^^^^^^^^^^^
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf2 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1

Inf1 packages
^^^^^^^^^^^^^
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1

Neuron 2.20.0 (09/16/2024)
---------------------------

9 changes: 9 additions & 0 deletions release-notes/runtime/aws-neuronx-dkms/index.rst
@@ -15,6 +15,15 @@ Updated : 04/29/2022

- In rare cases of multi-process applications running under heavy stress, a model load failure may occur. This may require reloading the Neuron Driver as a workaround.


Neuron Driver release [2.18.20.0]
---------------------------------
Date: 11/20/2024

Bug Fixes
^^^^^^^^^
* This release addresses an issue with Neuron Driver that can lead to a user-space application either gaining access to kernel addresses or providing the driver with spoofed memory handles (kernel addresses) that can be potentially used to gain elevated privileges. We would like to thank `Cossack9989 <https://github.com/Cossack9989>`_ for reporting and collaborating on this issue.

Neuron Driver release [2.18.12.0]
--------------------------------

8 changes: 8 additions & 0 deletions release-notes/runtime/aws-neuronx-runtime-lib/index.rst
@@ -31,6 +31,14 @@ NEFF Version Runtime Version Range Notes
2.0 >= 1.6.5.0 Starting support for 2.0 NEFFs
============ ===================== ===================================

Neuron Runtime Library [2.22.19.0]
----------------------------------
Date: 11/20/2024

New in this release
^^^^^^^^^^^^^^^^^^^
* Minor improvements and bug fixes

Neuron Runtime Library [2.22.14.0]
---------------------------------
Date: 09/16/2024
10 changes: 10 additions & 0 deletions release-notes/torch/torch-neuronx/index.rst
@@ -14,6 +14,16 @@ PyTorch Neuron for |Trn1|/|Inf2| is a software package that enables PyTorch
users to train, evaluate, and perform inference on second-generation Neuron
hardware (See: :ref:`NeuronCore-v2 <neuroncores-v2-arch>`).

Release [2.1.2.2.3.2]
----------------------
Date: 11/20/2024

Summary
~~~~~~~

This patch narrows the range of dependent libneuronxla versions to support minor version bumps
and fixes the "list index out of range" error that occurs when loading Zero Redundancy Optimizer (ZeRO-1) checkpoints.

Release [2.1.2.2.3.1]
----------------------
Date: 10/25/2024
@@ -52,7 +52,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade transformers==4.31.0 optimum-neuron==0.0.8"
"!pip install --upgrade transformers==4.31.0 optimum-neuron==0.0.8 sentencepiece"
]
},
{
62 changes: 56 additions & 6 deletions src/helperscripts/n2-manifest.json
@@ -4,13 +4,13 @@
{"repo_type":"rpm", "repo_url":"https://yum.repos.neuron.amazonaws.com/"},
{"repo_type":"deb", "repo_url":"https://apt.repos.neuron.amazonaws.com/"}
],
"manifest_date": "10/25/2024",
"manifest_version": "2.20.1",
"manifest_date": "11/20/2024",
"manifest_version": "2.20.2",
"latest_release": [
{"instance":"inf1", "version":"2.20.1"},
{"instance":"trn1", "version":"2.20.1"},
{"instance":"inf2", "version":"2.20.1"},
{"instance":"trn1n", "version":"2.20.1"}
{"instance":"inf1", "version":"2.20.2"},
{"instance":"trn1", "version":"2.20.2"},
{"instance":"inf2", "version":"2.20.2"},
{"instance":"trn1n", "version":"2.20.2"}
],
"os_properties": [
{"os":"ubuntu18", "default_python_version":"3.7"},
@@ -90,6 +90,56 @@
{"name":"jax_neuronx","component":"Jax","category":"jax","package_type":"pip","use_cases":["inference"],"pin_major":"true"}
],
"neuron_releases": [
{"neuron_version":"2.20.2", "packages": [
{"name":"aws-neuronx-collectives","version":"2.22.33.0","supported_instances":["trn1","inf2"],"supported_python_versions":[]},
{"name":"aws-neuronx-collectives","version":"2.12.35.0","supported_instances":["inf1"],"supported_python_versions":[]},
{"name":"aws-neuronx-dkms","version":"2.18.20.0","supported_instances":["inf1","trn1","inf2"],"supported_python_versions":[]},
{"name":"aws-neuronx-gpsimd-customop-lib","version":"0.12.2.0","supported_instances":["trn1","inf2"],"supported_python_versions":[]},
{"name":"aws-neuronx-gpsimd-tools","version":"0.12.1.0","supported_instances":["trn1","inf2"],"supported_python_versions":[]},
{"name":"aws-neuronx-k8-plugin","version":"2.22.20.0","supported_instances":["inf1","trn1","inf2"],"supported_python_versions":[]},
{"name":"aws-neuronx-k8-scheduler","version":"2.22.20.0","supported_instances":["inf1","trn1","inf2"],"supported_python_versions":[]},
{"name":"aws-neuronx-oci-hook","version":"2.5.8.0","supported_instances":["inf1","trn1","inf2"],"supported_python_versions":[]},
{"name":"aws-neuronx-runtime-discovery","version":"2.9","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10","3.11"]},
{"name":"aws-neuronx-runtime-lib","version":"2.12.23.0","supported_instances":["inf1"],"supported_python_versions":[]},
{"name":"aws-neuronx-runtime-lib","version":"2.22.19.0","supported_instances":["trn1","inf2"],"supported_python_versions":[]},
{"name":"aws-neuronx-tools","version":"2.19.0.0","supported_instances":["inf1","trn1","inf2"],"supported_python_versions":[]},
{"name":"dmlc_nnvm","version":"1.19.6.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"dmlc_topi","version":"1.19.6.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"dmlc_tvm","version":"1.19.6.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"inferentia_hwm","version":"1.17.6.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"jax_neuronx","version":"0.1.1","supported_instances":["trn1","inf2"],"supported_python_versions":["3.9"]},
{"name":"libneuronxla","version":"2.0.5347.0","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10","3.11"]},
{"name":"libneuronxla","version":"0.5.3278","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"mx_neuron","version":"1.8.0.2.4.147.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"mxnet_neuron","version":"1.5.1.1.10.0.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"neuron-cc","version":"1.24.0.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"neuronperf","version":"1.8.93.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"neuronx-cc","version":"2.15.143.0","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10","3.11"]},
{"name":"neuronx-cc-stubs","version":"2.15.143.0","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"neuronx_distributed","version":"0.9.0","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10","3.11"]},
{"name":"neuronx_distributed_training","version":"1.0.1","supported_instances":["trn1"],"supported_python_versions":["3.8","3.9","3.10","3.11"]},
{"name":"tensorboard-plugin-neuronx","version":"2.6.63.0","supported_instances":["trn1","inf2"],"supported_python_versions":[]},
{"name":"tensorflow-model-server-neuronx","version":"2.10.1.2.12.2.0","supported_instances":["inf1","trn1","inf2"],"supported_python_versions":[]},
{"name":"tensorflow-model-server-neuronx","version":"2.9.3.2.12.2.0","supported_instances":["inf1","trn1","inf2"],"supported_python_versions":[]},
{"name":"tensorflow-model-server-neuronx","version":"2.8.4.2.12.2.0","supported_instances":["inf1","trn1","inf2"],"supported_python_versions":[]},
{"name":"tensorflow-neuron","version":"2.10.1.2.12.2.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"tensorflow-neuron","version":"2.8.4.2.12.2.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"tensorflow-neuron","version":"2.9.3.2.12.2.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"tensorflow-neuronx","version":"2.10.1.2.1.0","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"tensorflow-neuronx","version":"2.8.4.2.1.0","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"tensorflow-neuronx","version":"2.9.3.2.1.0","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"torch-neuron","version":"1.10.2.2.11.13.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"torch-neuron","version":"1.11.0.2.11.13.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"torch-neuron","version":"1.12.1.2.11.13.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"torch-neuron","version":"1.13.1.2.11.13.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"torch-neuron","version":"1.9.1.2.11.13.0","supported_instances":["inf1"],"supported_python_versions":["3.8","3.9","3.10"]},
{"name":"torch-neuronx","version":"1.13.1.1.16.0","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10","3.11"]},
{"name":"torch-neuronx","version":"2.1.2.2.3.2","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10","3.11"]},
{"name":"torch_xla","version":"1.13.1+torchneurong","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10","3.11"]},
{"name":"torch_xla","version":"2.1.5","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10","3.11"]},
{"name":"transformers-neuronx","version":"0.12.313","supported_instances":["trn1","inf2"],"supported_python_versions":["3.8","3.9","3.10","3.11"]},
{"name":"efa-installer","version":"na","supported_instances":["trn1"],"supported_python_versions":[]}
]},
{"neuron_version":"2.20.1", "packages": [
{"name":"aws-neuronx-collectives","version":"2.22.26.0","supported_instances":["trn1","inf2"],"supported_python_versions":[]},
{"name":"aws-neuronx-collectives","version":"2.12.35.0","supported_instances":["inf1"],"supported_python_versions":[]},
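The manifest entries above drive the ``--list=packages`` output used in the release notes. As a rough illustration of how such a manifest can be queried, here is a minimal Python sketch. It is not the actual ``n2-helper.py`` implementation, which is not shown in this commit; the ``list_packages`` function and the trimmed ``MANIFEST_JSON`` sample are hypothetical and only mirror the field names (``neuron_releases``, ``neuron_version``, ``packages``, ``supported_instances``) visible in the diff.

```python
import json

# Trimmed, illustrative copy of a few manifest entries from the diff above.
# The real n2-manifest.json contains many more packages and top-level keys.
MANIFEST_JSON = """
{
  "manifest_version": "2.20.2",
  "neuron_releases": [
    {"neuron_version": "2.20.2", "packages": [
      {"name": "aws-neuronx-dkms", "version": "2.18.20.0",
       "supported_instances": ["inf1", "trn1", "inf2"], "supported_python_versions": []},
      {"name": "aws-neuronx-runtime-lib", "version": "2.12.23.0",
       "supported_instances": ["inf1"], "supported_python_versions": []},
      {"name": "aws-neuronx-runtime-lib", "version": "2.22.19.0",
       "supported_instances": ["trn1", "inf2"], "supported_python_versions": []},
      {"name": "torch-neuronx", "version": "2.1.2.2.3.2",
       "supported_instances": ["trn1", "inf2"],
       "supported_python_versions": ["3.8", "3.9", "3.10", "3.11"]}
    ]}
  ]
}
"""


def list_packages(manifest, neuron_version, instance):
    """Return (name, version) pairs for one release and one instance type.

    Hypothetical helper: it only assumes the manifest layout shown in the diff.
    """
    for release in manifest["neuron_releases"]:
        if release["neuron_version"] == neuron_version:
            return [
                (pkg["name"], pkg["version"])
                for pkg in release["packages"]
                if instance in pkg["supported_instances"]
            ]
    raise KeyError(f"no release {neuron_version!r} in manifest")


manifest = json.loads(MANIFEST_JSON)
for name, version in list_packages(manifest, "2.20.2", "trn1"):
    print(f"{name}=={version}")
```

Because the same package name can appear more than once with different instance lists (for example ``aws-neuronx-runtime-lib``), the sketch filters on ``supported_instances`` rather than on the package name.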
2 changes: 2 additions & 0 deletions static/robots.txt
@@ -1,5 +1,7 @@
User-agent: *

Disallow: /en/v2.20.2/

Disallow: /en/v2.20.1/

Disallow: /en/v2.20.0/