NVIDIA Neural Modules 1.7.0

@ericharper released this 02 Mar 00:57 · commit 256236f · 3284 commits to main since this release

Known Issues

  • Megatron GPT training with O2 and FP16 is broken. FP16 with O1 still works.
  • find_unused_parameters should be set to False when training GPT, see #3837
  • FastPitch training may result in stalled GPUs. Users will have to manually kill their runs and continue training from the latest checkpoint.
  • mT5 issue with whole word masking, see #3776
  • T5 finetuning config issue, see #3776

Container

NOTE: From NeMo 1.7.0 onwards, NeMo containers follow the YY.MM convention for naming, where the YY.MM value is based on the base container. For additional information about NeMo containers, see https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.01
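As an illustration of the YY.MM convention, here is a minimal Python sketch that splits an image reference such as the one above into a year and a month. It assumes the tag is exactly two digits, a dot, and two digits; real tags may carry extra suffixes that this does not handle. The names `ContainerTag` and `parse_container_tag` are hypothetical helpers and are not part of NeMo.

```python
import re
from typing import NamedTuple


class ContainerTag(NamedTuple):
    """Year and month decoded from a YY.MM container tag (hypothetical helper)."""
    year: int
    month: int


# Assumption: a tag is exactly "YY.MM", e.g. "22.01".
_TAG_RE = re.compile(r"^(\d{2})\.(\d{2})$")


def parse_container_tag(image: str) -> ContainerTag:
    """Decode the YY.MM tag of an image reference like 'nvcr.io/nvidia/nemo:22.01'."""
    # Everything after the last ':' is treated as the tag.
    _, sep, tag = image.rpartition(":")
    if not sep:
        raise ValueError(f"no tag found in {image!r}")
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"tag {tag!r} does not look like YY.MM")
    year = 2000 + int(match.group(1))
    month = int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in tag {tag!r}")
    return ContainerTag(year, month)


if __name__ == "__main__":
    print(parse_container_tag("nvcr.io/nvidia/nemo:22.01"))
```

A version-style tag such as `1.7.0` is rejected with a `ValueError`, so a helper like this can tell tags that follow the YY.MM scheme from those that do not.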

ASR

TTS

  • Port UnivNet to NeMo TTS collection by @L0SG :: PR: #3186
  • E2E TTS fixes by @redoctopus :: PR: #3508
  • New structure for TTS datasets in scripts/dataset_processing, VocoderDataset, update TTSDataset by @Oktai15 :: PR: #3484
  • Deprecate some TTS models and TTS datasets by @Oktai15 :: PR: #3576
  • Fix bugs in HiFi-GAN (scheduler, optimizers) and add input_example() in Mixer-TTS/Mixer-TTS-X by @Oktai15 :: PR: #3564
  • Update UnivNet, HiFi-GAN and WaveGlow, small fixes in Mixer-TTS, FastPitch and Exportable by @Oktai15 :: PR: #3585
  • Fix typo in FastPitch config (pitch_avg -> pitch_mean) by @eyentei :: PR: #3593
  • Fix incorrect usage of TTSDataset in some files and fix one-line bug in NVIDIA's CMUDict by @Oktai15 :: PR: #3594
  • Convert entry from UTF-16 to UTF-8 by @redoctopus :: PR: #3597
  • Remove CheckInstall by @blisc :: PR: #3577
  • Fix UnivNet LibriTTS pretrained location by @m-toman :: PR: #3615
  • FastPitch training tutorial by @subhankar-ghosh :: PR: #3631
  • Update Aligner, add new methods to AlignmentEncoder by @Oktai15 :: PR: #3641
  • Add Mixed Representation Training by @blisc :: PR: #3473
  • Add speakerID to libritts/get_data.py by @subhankar-ghosh :: PR: #3662
  • Update TTS tutorials, Simplification of testing Mixer-TTS and FastPitch by @Oktai15 :: PR: #3680
  • Clean FastPitch_Finetuning.ipynb notebook by @Oktai15 :: PR: #3698
  • Add cache_size to BetaBinomialInterpolator, fix bugs in TTS tutorials and FastPitch by @Oktai15 :: PR: #3706
  • Fix bugs in VocoderDataset and TTSDataset by @Oktai15 :: PR: #3713
  • Fix bugs in E2E TTS, Mixer-TTS and FastPitch by @Oktai15 :: PR: #3740

NLP / NMT

Text Normalization / Inverse Text Normalization

Export

Bugfixes

  • Text normalization takes too much time for a string which contains a lot of dates by @PeganovAnton :: PR: #3451
  • Dialogue state tracking refactor/ SGDGEN patch 2 by @Zhilin123 :: PR: #3674
  • Lower bound PTL to 1.5.10 and remove last ckpt patch fix by @nithinraok :: PR: #3690

Improvements