Releases: dmlc/gluon-nlp
v0.10.0 Maintenance Release
This release includes the following fixes:
- [BUGFIX] remove wd from squad (#1223)
- Fix deprecation warnings due to invalid escape sequences. (#1219)
- Fix layer_norm_eps in BERTEncoder (#1214)
- [BUGFIX] Fix vocab determinism in py35 (#1166) (#1167)
As we prepare for the NumPy-based GluonNLP development, we are making the following adjustments to the branch usage:
- master (old) -> v0.x: this branch will be used for maintenance of GluonNLP 0.x versions.
- numpy -> master: the new master branch will be used for GluonNLP 1.0 onward with NumPy-compatible interface, based on the upcoming MXNet 2.0.
v0.9.2: Bug Fix
v0.9.1: Bug Fix
v0.9.0: BERT Inference Time Cut by Half and 90% Scaling Efficiency for Distributed Training
News
- GluonNLP was featured in EMNLP 2019 Hong Kong! Check out the code accompanying the tutorial.
- "GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing" has been published in the Journal of Machine Learning Research.
Models and Scripts in v0.9
BERT
INT8 Quantization for BERT Sentence Classification and Question Answering (#1080)! Also check out the blog post.
Enhancements to the pretraining script (#1121, #1099) and faster tokenizer for
BERT (#921, #1024) as well as multi-GPU support for SQuAD fine-tuning (#1079).
Make BERT a HybridBlock (#877).
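Since BERT is now a HybridBlock, the model can be hybridized and exported for symbolic deployment. A minimal sketch under the 0.9 model zoo API (model and dataset names are standard zoo entries; the exact keyword arguments reflect my reading of the API and may differ slightly):

```python
import mxnet as mx
import gluonnlp as nlp

# Load a pre-trained BERT base encoder from the model zoo, dropping the
# masked-LM decoder and next-sentence classifier heads.
bert, vocab = nlp.model.get_model(
    'bert_12_768_12',
    dataset_name='book_corpus_wiki_en_uncased',
    pretrained=True,
    use_pooler=True,
    use_decoder=False,
    use_classifier=False)

bert.hybridize(static_alloc=True)

# One forward pass builds the cached symbolic graph; export then writes
# bert-symbol.json / bert-0000.params usable from other MXNet bindings.
tokens = mx.nd.array([[vocab['[CLS]'], vocab['hello'], vocab['[SEP]']]])
token_types = mx.nd.zeros_like(tokens)
valid_length = mx.nd.array([3])
seq_encoding, pooled_out = bert(tokens, token_types, valid_length)
bert.export('bert')
```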
XLNet
The XLNet model introduced by Yang, Zhilin, et al. in
"XLNet: Generalized Autoregressive Pretraining for Language Understanding".
The model was converted from the original repository (#866).
GluonNLP further provides scripts for finetuning XLNet on the GLUE (#995) and
SQuAD datasets (#1130) that reproduce the authors' results. Check out the usage.
DistilBERT
The DistilBERT model introduced by Sanh, Victor, et al. in
"DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter" (#922).
Transformer
Add a separate Transformer inference script to make inference easy and to make it
convenient to analyze the performance of Transformer inference (#852).
Korean BERT
Pre-trained Korean BERT is available as part of GluonNLP (#1057)
RoBERTa
GluonNLP now provides scripts for finetuning RoBERTa (#931).
GPT2
GPT-2 is now a HybridBlock, so the model can be exported for running from other MXNet
language bindings (#1010).
New Features
- Add NamedTuple + Dict batchify (#959) (see the sketch after this list)
- Add even_size option to split sampler (#1028)
- Add length normalized metrics for machine translation tasks (#1095)
- Add raw attention scores to the AttentionCell #951 (#964)
- Add round_to feature to BERT & XLNet finetuning scripts (#1133)
- Add stratified train_valid_split similar to sklearn.model_selection.train_test_split (#933)
- Add SuperGlue dataset API (#858)
- Add Multi Model Server deployment code example for developers (#1140)
- Allow custom dropout, number of layers/units for BERT (#950)
- Avoid race condition when downloading vocab (#1078)
- Deprecate specifying Vocab padding, bos and eos_token as positional arguments (#945)
- Fast multitensor adam optimizer (#1111)
- Faster grad_global_norm for clipping (#1115)
- Hybridizable AWDRNN/StandardRNN (#911)
- Padding seq length to multiple of 8 in BERT model (#909)
- Scripts for producing the figures that explain the bucketing strategy (#908)
- Split up Seq2SeqDecoder in Seq2SeqDecoder and Seq2SeqOneStepDecoder (#976)
- Switch CI to Python 3.5 and declare Python 3.5 support (#1009)
- Try to use the new None feature in MXNet + Drop support for MXNet 1.5 (#967)
- Use fused gelu operator (#1082)
- Use softmax with length, and interleaved matmul for BERT (#1136)
- Documentation of Model Conversion Scripts at https://gluon-nlp.mxnet.io/v0.9.x/model_zoo/conversion_tools/index.html (#922)
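As an illustration of the new NamedTuple/Dict batchify (#959), here is a minimal sketch; the constructor arguments follow my reading of the 0.9 batchify API and may differ slightly, and the record type is hypothetical:

```python
import collections
from gluonnlp.data import batchify as bf

# A hypothetical record type for this sketch.
Record = collections.namedtuple('Record', ['data', 'label'])

samples = [Record(data=[1, 2, 3, 4], label=0),
           Record(data=[5, 6], label=1)]

# Pad the variable-length 'data' field and stack the 'label' field,
# keeping the named-tuple structure in the resulting batch.
batchify_fn = bf.NamedTuple(Record, {'data': bf.Pad(pad_val=0),
                                     'label': bf.Stack()})
batch = batchify_fn(samples)
print(batch.data.shape, batch.label)
```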
Bug Fixes and code cleanup
- Add version checker to all scripts (#930)
- Add version checker to all tutorials (#934)
- Add 'packaging' to requirements (#1143)
- Adjust code owner (#923)
- Avoid using dict for attention cell parameter creation (#1050)
- Bump version in preparation for 0.9 release (#987)
- Change SimVerb3500 URL to aclweb hosted version (#979)
- Correct propagation of error codes in GluonNLP-py3-master-gpu-doc (#971)
- Corrected np.random.randint upper limit in data.stream.py (#935)
- Declare Python version requirement in setup.py (#927)
- Declare more optional dependencies (#958)
- Declare pytest seed marker in pytest.ini (#940)
- Disable HybridBeamSearch (#1021)
- Drop LAMB optimizer from GluonNLP in favor of MXNet version (#1116)
- Drop unused compatibility helpers and fix doc (#928)
- Fix #905 (#906)
- Fix a SQuAD 2.0 evaluation bug (#907)
- Fix argument analogy-max-vocab-size (#904)
- Fix broken multi-head attention cell (#878)
- Fix bugs in BERT export script (#944)
- Fix chnsenticorp dataset download link (#873)
- Fix file sampler for BERT (#977)
- Fix index.rst and gpu flag in machine translation (#952)
- Fix log in finetune_squad.py (#1001)
- Fix parameter sharing of WeightDropParameter (#1083)
- Fix scripts/question_answering/data_pipeline.py requiring optional package (#1013)
- Fix the weight tie and weight sharing for AWDRNN (#1087)
- Fix training command in Language Modeling index.rst (#1100)
- Fix version check in train_gnmt.py and train_transformer.py (#1003)
- Fix standard rnn weight sharing error (#1122)
- Glue data preprocessing pipeline and bert & xlnet scripts (#1031)
- Improve Vocab.repr if reserved_tokens or unknown_token is None (#989)
- Improve readability (#975)
- Improve test robustness (#960)
- Improve the readability of the training script. This fix replaces magic numbers with named constants (#1006)
- Make EmbeddingCenterContextBatchify returned dtype robust to empty sentences (#954)
- Modify the log average loss (#1103)
- Move ICSL script out of BERT folder (#1131)
- Move NER script out of bert folder (#1090)
- Move ParallelBigRNN into nlp.model namespace (#1118)
- Move get_rnn_cell out of seq2seq_encoder_decoder (#1073)
- MXNet version check (#1063)
- Refactor BERT with new data preprocessing (#1124)
- Remove NLTKMosesTokenizer in favor of SacreMosesTokenizer (#942)
- Remove extra dropout in BERT/RoBERTa (#1022)
- Remove outdated comment (#943)
- Remove padding warning (#916)
- Replace unicode comma with ascii comma (#1056)
- Split up inheritance structure of TransformerEncoder and BERTEncoder (#988)
- Support int32 for sampled blocks (#1106)
- Switch batch jobs to use G4dn.2x instance (#1041)
- TransformerXL LayerNorm eps and XLNet pretrained model config (#1005)
- Unify BERT horovod and kvstore pre-training script (#889)
- Update README.rst (#884)
- Update data_api.rst (#893)
- Update embedding script (#1046)
- Update fp16_utils.py (#1037)
- Update index.rst (#876)
- Update index.rst (#891)
- Update navbar install (#983)
- Update numba dependency in setup.py (#941)
- Update outdated contributor list (#963)
- Update prepare_clean_env.sh (#998)
Documentation
- Add comment to BERT notebook (#1026)
- Add missing docs for nlp.utils (#936)
- Add more documentation to XLNet scripts (#985)
- Add section for "Clone the master branch for development" (#1075)
- Add to toc tree depth to enable multiple level menu (#1108)
- Cite source of pretrained parameters for bert_12_768_12 (#915)
- Doc fix for vocab.subwords (#885)
- Enhance vocab not found err msg (#917)
- Fix command line examples for text classification (#874)
- Fix math formula in docs (#920)
- More detailed doc for CorpusBPTTBatchify (#888)
- Release checklist (#890)
- Remove non-existent arguments for BERT and Transformer (#946)
- Remove py3 usage from the doc (#1077)
- Update installation guide with selectors (#966)
- Update mxnet version in installation doc (#1072)
- Update pre-trained model link (#1117)
- Update Installation instructions for source (#1146)
Continuous Integration
- Disable SimVerb test for 14 days (#953)
- Disable horovod test temporarily (#1030)
- Disable known bad mxnet nightly version (#997)
- Enable integration tests on CPU (#957)
- Enable testing warnings with pytest and update deprecated API invocations (#980)
- Enable timestamp in CI (#925)
- Enable type checks and inference with pytype (#1018)
- Fix CI (#875)
- Preserve stderr and stdout streams in doc CI stage for Cloudwatch (#882)
- Remove skip_master feature (#1017)
- Switch source of MXNet nightly build (#1058)
- Test MXNet 1.6 pre-release as part of CI pipeline (#1023)
- Update MXNet master version tested on CI (#1113)
- Update numba (#1096)
- Use Cuda 10.0 MXNet build (#991)
v0.8.3: Minor Bug Fixes
- Add int32 support for importance sampling (model.ISDense) and noise contrastive estimation (model.NCEDense).
v0.8.2: Bug Fixes
This release covers a few fixes for the bugs reported:
- Fixed argument passing in the bert/embedding.py script
- Updated SimVerb3500 dataset URL to the aclweb hosted version
- Removed multi-processing in DataLoader in bert/pretraining_utils.py, which could potentially cause a crash when Horovod MPI is used for training
- Before MXNet 1.6.0, the Gluon Trainer assumes a deterministic parameter creation order for distributed training. The attention cell for BERT and Transformer had a non-deterministic parameter creation order in v0.8.1 and v0.8.0, which would cause divergence during distributed training. It is now fixed.
Note that since v0.8.2, the default branch of the gluon-nlp GitHub repository is switched to the latest stable branch instead of the master branch under development.
v0.8.1
News
- GluonNLP was featured in KDD 2019 Alaska! Check out our tutorial: From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond.
- GluonNLP 0.8.1 will no longer support Python 2. (#721, #838)
- Interested in BERT int8 quantization for deployment? Check out the blog post here.
Models and Scripts
RoBERTa
- The RoBERTa model introduced by Yinhan Liu, et al. in "RoBERTa: A Robustly Optimized BERT Pretraining Approach". The model checkpoints are converted from the original repository. Check out the usage here. (#870)
Transformer-XL
- The Transformer-XL model introduced by Zihang Dai, et al. in "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context". (#846)
Bug Fixes
- Fixed hybridization for the BERT model (#877)
- Change the variable model to bert_classifier (#828) thank you @LindenLiu
- Revert "Add axis argument to squeeze()" (#857)
- [BUGFIX] Remove incorrect vocab.padding_token requirement in CorpusBPTTBatchify
- [BUGFIX] Fix Vocab with unknown_token remapped to != 0 via token_to_idx arg (#862)
- [BUGFIX] Fix AMP in finetune_classifier.py (#848)
- [BUGFIX] fix broken multi-head attention cell (#878) @ZiyueHuang
- [FIX] fix chnsenticorp dataset download link (#873)
- fix the usage of pad in bert (#850)
Documentation
- Clarify Bert does not require MXNet nightly anymore (#860)
- [DOC] fix broken links (#833)
- [DOC] Update BERT index.rst (#844)
- [DOC] Add GluonCV/NLP archive (#823)
- [DOC] add missing dataset document (#832)
- [DOC] remove wrong tutorial header level (#826)
- [DOC] Fix a typo in attention_cell's docstring (#841) thank you @shenfei
- [DOC] Upgrade mxnet dependency to 1.5.0 and use Cuda 10.1 on CI (#842)
- Remove Py2 icon from Readme. Add 3.7 (#856)
- [DOC] Improve help message (#855) thank you @apeforest
- Update index.rst (#853)
- [DOC] Fix Machine Translation with Transformers example (#865)
- update button style (#869)
- [DOC] doc fix for vocab.subwords (#885) thank you @liusy182
v0.8.0
News
- GluonNLP is featured in KDD 2019 Alaska! Check out our tutorial: From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond.
- GluonNLP 0.8.0 will no longer support Python 2. #721
Models
RoBERTa
Transformer-XL
- Transformer-XL is now available in GluonNLP language model zoo. #846
v0.7.1
News
- GluonNLP will be featured in KDD 2019 Alaska! Check out our tutorial: From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond.
- GluonNLP was featured in JSALT 2019 in Montreal, 2019-6-14! Check out https://jsalt19.mxnet.io.
- This is the last release in GluonNLP that will officially support Python 2. #721
Models and Scripts
BERT
- a BERT BASE model pre-trained on a large corpus including OpenWebText Corpus, BooksCorpus, and English Wikipedia, which has comparable performance with the BERT large model from Google. The test score on GLUE Benchmark is reported below. Also improved usability of the BERT pre-training script: on-the-fly training data generation, sentencepiece, horovod, etc. (#799, #687, #806, #669, #665). Thank you @davisliang @vanyacohen @Skylion007
Source | GluonNLP | google-research/bert | google-research/bert |
---|---|---|---|
Model | bert_12_768_12 | bert_12_768_12 | bert_24_1024_16 |
Dataset | openwebtext_book_corpus_wiki_en_uncased | book_corpus_wiki_en_uncased | book_corpus_wiki_en_uncased |
SST-2 | 95.3 | 93.5 | 94.9 |
RTE | 73.6 | 66.4 | 70.1 |
QQP | 72.3 | 71.2 | 72.1 |
SQuAD 1.1 | 91.0/84.4 | 88.5/80.8 | 90.9/84.1 |
STS-B | 87.5 | 85.8 | 86.5 |
MNLI-m/mm | 85.3/84.9 | 84.6/83.4 | 86.7/85.9 |
- The SciBERT model introduced by Iz Beltagy and Arman Cohan and Kyle Lo in "SciBERT: Pretrained Contextualized Embeddings for Scientific Text". The model checkpoints are converted from the original repository from AllenAI with the following datasets (#735):
  - scibert_scivocab_uncased
  - scibert_scivocab_cased
  - scibert_basevocab_uncased
  - scibert_basevocab_cased
- The BioBERT model introduced by Lee, Jinhyuk, et al. in "BioBERT: a pre-trained biomedical language representation model for biomedical text mining". The model checkpoints are converted from the original repository with the following datasets (#735):
  - biobert_v1.0_pmc_cased
  - biobert_v1.0_pubmed_cased
  - biobert_v1.0_pubmed_pmc_cased
  - biobert_v1.1_pubmed_cased
- The ClinicalBERT model introduced by Kexin Huang and Jaan Altosaar and Rajesh Ranganath in "ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission". The model checkpoints are converted from the original repository with the clinicalbert_uncased dataset (#735).
- The ERNIE model introduced by Sun, Yu, et al. in "ERNIE: Enhanced Representation through Knowledge Integration". You can get the model checkpoints converted from the original repository with model.get_model("ernie_12_768_12", "baidu_ernie_uncased") (#759), thanks @paperplanet. A usage sketch follows this list.
- BERT fine-tuning script for named entity recognition on CoNLL2003 with test F1 92.2 (#612).
- BERT fine-tuning script for the Chinese XNLI dataset with 78.3% validation accuracy (#759), thanks @paperplanet.
- BERT fine-tuning script for intent classification and slot labelling on ATIS (95.9 F1) and SNIPS (95.9 F1). (#817)
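As referenced in the ERNIE bullet above, converted checkpoints are loaded through the model zoo API. A minimal sketch (the positional call is quoted from the notes; the extra keyword arguments are assumptions carried over from the BERT zoo API):

```python
import gluonnlp as nlp

# Load the converted ERNIE checkpoint from the model zoo; dropping the
# decoder/classifier heads is assumed to work as it does for BERT.
model, vocab = nlp.model.get_model('ernie_12_768_12',
                                   dataset_name='baidu_ernie_uncased',
                                   pretrained=True,
                                   use_decoder=False,
                                   use_classifier=False)
print(model)
print(len(vocab))
```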
GPT-2
- The GPT-2 language model introduced by Radford, Alec, et al. in "Language Models are Unsupervised Multitask Learners". The model checkpoints are converted from the original repository, with a script to generate text from the GPT-2 model (gpt2_117m, gpt2_345m) trained on the openai_webtext dataset (#761).
ESIM
- The ESIM model for text matching introduced by Chen, Qian, et al. in "Enhanced LSTM for Natural Language Inference". (#689)
Data
- Natural language understanding with datasets from the GLUE benchmark: CoLA, SST-2, MRPC, STS-B, MNLI, QQP, QNLI, WNLI, RTE (#682) (see the sketch after this list)
- Sentiment analysis datasets: CR, MPQA (#663)
- Intent classification and slot labeling datasets: ATIS and SNIPS (#816)
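A short sketch of the GLUE dataset API listed above (class and argument names follow the nlp.data conventions added in #682; treat the exact names as assumptions):

```python
import gluonnlp as nlp

# Load the SST-2 training and development splits from the GLUE benchmark.
train = nlp.data.GlueSST2(segment='train')
dev = nlp.data.GlueSST2(segment='dev')

# Each sample is a (sentence, label) pair.
print(len(train), train[0])
```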
New Features
- [Feature] support save model / trainer states to S3 (#700)
- [Feature] support load model/trainer states from s3 (#702)
- [Feature] Add SentencePieceTokenizer for BERT (#669)
- [FEATURE] Flexible vocabulary (#732)
- [API] Moving MaskedSoftmaxCELoss and LabelSmoothing to model API (#754) thanks @ThomasDelteil
- [Feature] add the List batchify function (#812) thanks @ThomasDelteil
- [FEATURE] Add LAMB optimizer (#733)
Bug Fixes
- [BUGFIX] Fixes for BERT embedding, pretraining scripts (#640) thanks @Deseaus
- [BUGFIX] Update hash of wiki_cn_cased and wiki_multilingual_cased vocab (#655)
- fix bert forward call parameter mismatch (#695) thanks @paperplanet
- [BUGFIX] Fix mlm_loss reporting for eval dataset (#696)
- Fix _get_rnn_cell (#648) thanks @MarisaKirisame
- [BUGFIX] fix mrpc dataset idx (#708)
- [bugfix] fix hybrid beam search sampler(#710)
- [BUGFIX] [DOC] Update nlp.model.get_model documentation and get_model API (#734)
- [BUGFIX] Fix handling of duplicate special tokens in Vocabulary (#749)
- [BUGFIX] Fix TokenEmbedding serialization with emb[emb.unknown_token] != 0 (#763)
- [BUGFIX] Fix glue test result serialization (#773)
- [BUGFIX] Fix init bug for multilevel BiLMEncoder (#783) thanks @Ishitori
API Changes
- [API] Dropping support for wiki_multilingual and wiki_cn (#764)
- [API] Remove get_bert_model from the public API list (#767)
Enhancements
- [FEATURE] offer load_w2v_binary method to load w2v binary file (#620)
- [Script] Add inference function for BERT classification (#639) thanks @TaoLv
- [SCRIPT] - Add static BERT base export script (for use with MXNet Module API) (#672)
- [Enhancement] One script to export bert for classification/regression/QA (#705)
- [enhancement] refactor bert finetuning script (#692)
- [Enhancement] only use the best model for inference for bert classification (#716)
- [Dataset] redistribute conll2004 (#719)
- [Enhancement] add periodic evaluation for BERT pre-training (#720)
- [FEATURE]add XNLI task (#717)
- [refactor] Refactor BERT script folder (#744)
- [Enhancement] BERT pre-training data generation from sentencepiece vocab (#743)
- [REFACTOR] Refactor TokenEmbedding to reduce number of places that initialize internals (#750)
- [Refactor] Refactor BERT SQuAD inference code (#758)
- [Enhancement] Fix dtype conversion, add sentencepiece support for SQuAD (#766)
- [Dataset] Move MRPC dataset to API (#780)
- [BiDAF-QANet] Common data processing logic for BiDAF and QANet (#739) thanks @Ishitori
- [DATASET] add LCQMC, ChnSentiCorp dataset (#774) thanks @paperplanet
- [Improvement] Implement parser evaluation in Python (#772)
- [Enhancement] Add whole word masking for BERT (#770) thanks @basicv8vc
- [Enhancement] Mix precision support for BERT finetuning (#793)
- Generate BERT training samples in compressed format (#651)
Minor Fixes
- Various documentation fixes: #635, #637, #647, #656, #664, #667, #670, #676, #678, #681, #698, #704, #731, #745, #762, #771, #746, #778, #800, #810, #807 #814 thanks @rongruosong @crcrpar @mrchypark @xwind-h
- Fix BERT multiprocessing data creation bug which causes unnecessary dispatching to single worker (#649)
- [BUGFIX] Update BERT test and pre-train script (#661)
- update url for ws353 (#701)
- bump up version (#742)
- [DOC] Update textCNN results (#737)
- padding value warning (#747)
- [TUTORIAL][DOC] Tutorial Updates (#802) thanks @faramarzmunshi
Continuous Integration
- skip failing tests in mxnet master (#685)
- [CI] update nodes for CI (#686)
- [CI] CI refactoring to speed up tests (#566)
- [CI] fix codecov (#693)
- use fixture for squad dataset tests (#699)
- [CI] create zipped notebooks for link check (#712)
- Fix test infrastructure for pytest > 4 and bump CI pytest version (#728)
- [CI] set root in BERT tests (#738)
- Fix conftest.py function_scope_seed (#748)
- [CI] Fix links in contribute.rst (#752)
- [CI] Update CI dependencies (#756)
- Revert "[CI] Update CI dependencies (#756)" (#769)
- [CI] AWS Batch serverless CI Pipeline for parallel notebook execution during website build step (#791)
- [CI] Don't exit pipeline before displaying AWS Batch logfiles (#801)
- [CI] Fix for "Don't exit pipeline before displaying AWS Batch logfile (#803)
- add license checker (#804)
- enable timeout (#813)
- Fix website build on master branch (#819)
v0.7.0
News
- GluonNLP will be featured in KDD 2019 Alaska! Check out our tutorial: From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond.
- GluonNLP was featured in JSALT 2019 in Montreal, 2019-6-14! Check out https://jsalt19.mxnet.io.
Models and Scripts
BERT
- BERT model pre-trained on OpenWebText Corpus, BooksCorpus, and English Wikipedia. The test score on GLUE Benchmark is reported below. Also improved usability of the BERT pre-training script: on-the-fly training data generation, sentencepiece, horovod, etc. (#799, #687, #806, #669, #665). Thank you @davisliang
Source | GluonNLP | google-research/bert | google-research/bert |
---|---|---|---|
Model | bert_12_768_12 | bert_12_768_12 | bert_24_1024_16 |
Dataset | openwebtext_book_corpus_wiki_en_uncased | book_corpus_wiki_en_uncased | book_corpus_wiki_en_uncased |
SST-2 | 95.3 | 93.5 | 94.9 |
RTE | 73.6 | 66.4 | 70.1 |
QQP | 72.3 | 71.2 | 72.1 |
SQuAD 1.1 | 91.0/84.4 | 88.5/80.8 | 90.9/84.1 |
STS-B | 87.5 | 85.8 | 86.5 |
MNLI-m/mm | 85.3/84.9 | 84.6/83.4 | 86.7/85.9 |
- The SciBERT model introduced by Iz Beltagy and Arman Cohan and Kyle Lo in "SciBERT: Pretrained Contextualized Embeddings for Scientific Text". The model checkpoints are converted from the original repository from AllenAI with the following datasets (#735):
  - scibert_scivocab_uncased
  - scibert_scivocab_cased
  - scibert_basevocab_uncased
  - scibert_basevocab_cased
- The BioBERT model introduced by Lee, Jinhyuk, et al. in "BioBERT: a pre-trained biomedical language representation model for biomedical text mining". The model checkpoints are converted from the original repository with the following datasets (#735):
  - biobert_v1.0_pmc_cased
  - biobert_v1.0_pubmed_cased
  - biobert_v1.0_pubmed_pmc_cased
  - biobert_v1.1_pubmed_cased
- The ClinicalBERT model introduced by Kexin Huang and Jaan Altosaar and Rajesh Ranganath in "ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission". The model checkpoints are converted from the original repository with the clinicalbert_uncased dataset (#735).
- The ERNIE model introduced by Sun, Yu, et al. in "ERNIE: Enhanced Representation through Knowledge Integration". You can get the model checkpoints converted from the original repository with model.get_model("ernie_12_768_12", "baidu_ernie_uncased") (#759), thanks @paperplanet.
- BERT fine-tuning script for named entity recognition on CoNLL2003 with test F1 92.2 (#612).
- BERT fine-tuning script for the Chinese XNLI dataset with 78.3% validation accuracy (#759), thanks @paperplanet.
- BERT fine-tuning script for intent classification and slot labelling on ATIS (95.9 F1) and SNIPS (95.9 F1). (#817)
GPT-2
- The GPT-2 language model introduced by Radford, Alec, et al. in "Language Models are Unsupervised Multitask Learners". The model checkpoints are converted from the original repository, with a script to generate text from the GPT-2 model (gpt2_117m, gpt2_345m) trained on the openai_webtext dataset (#761).
ESIM
- The ESIM model for text matching introduced by Chen, Qian, et al. in "Enhanced LSTM for Natural Language Inference". (#689)
Data
- Natural language understanding with datasets from the GLUE benchmark: CoLA, SST-2, MRPC, STS-B, MNLI, QQP, QNLI, WNLI, RTE (#682)
- Sentiment analysis datasets: CR, MPQA (#663)
- Intent classification and slot labeling datasets: ATIS and SNIPS (#816)
New Features
- [Feature] support save model / trainer states to S3 (#700)
- [Feature] support load model/trainer states from s3 (#702)
- [Feature] Add SentencePieceTokenizer for BERT (#669)
- [FEATURE] Flexible vocabulary (#732)
- [API] Moving MaskedSoftmaxCELoss and LabelSmoothing to model API (#754) thanks @ThomasDelteil
- [Feature] add the List batchify function (#812) thanks @ThomasDelteil
- [FEATURE] Add LAMB optimizer (#733)
Bug Fixes
- [BUGFIX] Fixes for BERT embedding, pretraining scripts (#640) thanks @Deseaus
- [BUGFIX] Update hash of wiki_cn_cased and wiki_multilingual_cased vocab (#655)
- fix bert forward call parameter mismatch (#695) thanks @paperplanet
- [BUGFIX] Fix mlm_loss reporting for eval dataset (#696)
- Fix _get_rnn_cell (#648) thanks @MarisaKirisame
- [BUGFIX] fix mrpc dataset idx (#708)
- [bugfix] fix hybrid beam search sampler(#710)
- [BUGFIX] [DOC] Update nlp.model.get_model documentation and get_model API (#734)
- [BUGFIX] Fix handling of duplicate special tokens in Vocabulary (#749)
- [BUGFIX] Fix TokenEmbedding serialization with emb[emb.unknown_token] != 0 (#763)
- [BUGFIX] Fix glue test result serialization (#773)
- [BUGFIX] Fix init bug for multilevel BiLMEncoder (#783) thanks @Ishitori
API Changes
- [API] Dropping support for wiki_multilingual and wiki_cn (#764)
- [API] Remove get_bert_model from the public API list (#767)
Enhancements
- [FEATURE] offer load_w2v_binary method to load w2v binary file (#620)
- [Script] Add inference function for BERT classification (#639) thanks @TaoLv
- [SCRIPT] - Add static BERT base export script (for use with MXNet Module API) (#672)
- [Enhancement] One script to export bert for classification/regression/QA (#705)
- [enhancement] refactor bert finetuning script (#692)
- [Enhancement] only use the best model for inference for bert classification (#716)
- [Dataset] redistribute conll2004 (#719)
- [Enhancement] add periodic evaluation for BERT pre-training (#720)
- [FEATURE]add XNLI task (#717)
- [refactor] Refactor BERT script folder (#744)
- [Enhancement] BERT pre-training data generation from sentencepiece vocab (#743)
- [REFACTOR] Refactor TokenEmbedding to reduce number of places that initialize internals (#750)
- [Refactor] Refactor BERT SQuAD inference code (#758)
- [Enhancement] Fix dtype conversion, add sentencepiece support for SQuAD (#766)
- [Dataset] Move MRPC dataset to API (#780)
- [BiDAF-QANet] Common data processing logic for BiDAF and QANet (#739) thanks @Ishitori
- [DATASET] add LCQMC, ChnSentiCorp dataset (#774) thanks @paperplanet
- [Improvement] Implement parser evaluation in Python (#772)
- [Enhancement] Add whole word masking for BERT (#770) thanks @basicv8vc
- [Enhancement] Mix precision support for BERT finetuning (#793)
- Generate BERT training samples in compressed format (#651)
Minor Fixes
- Various documentation fixes: #635, #637, #647, #656, #664, #667, #670, #676, #678, #681, #698, #704, #731, #745, #762, #771, #746, #778, #800, #810, #807 #814 thanks @rongruosong @crcrpar @mrchypark @xwind-h
- Fix BERT multiprocessing data creation bug which causes unnecessary dispatching to single worker (#649)
- [BUGFIX] Update BERT test and pre-train script (#661)
- update url for ws353 (#701)
- bump up version (#742)
- [DOC] Update textCNN results (#737)
- padding value warning (#747)
- [TUTORIAL][DOC] Tutorial Updates (#802) thanks @faramarzmunshi
Continuous Integration
- skip failing tests in mxnet master (#685)
- [CI] update nodes for CI (#686)
- [CI] CI refactoring to speed up tests (#566)
- [CI] fix codecov (#693)
- use fixture for squad dataset tests (#699)
- [CI] create zipped notebooks for link check (#712)
- Fix test infrastructure for pytest > 4 and bump CI pytest version (#728)
- [CI] set root in BERT tests (#738)
- Fix conftest.py function_scope_seed (#748)
- [CI] Fix links in contribute.rst (#752)
- [CI] Update CI dependencies (#756)
- Revert "[CI] Update CI dependencies (#756)" (#769)
- [CI] AWS Batch serverless CI Pipeline for parallel notebook execution during website build step (#791)
- [CI] Don't exit pipeline before displaying AWS Batch logfiles (#801)
- [CI] Fix for "Don't exit pipeline before displaying AWS Batch logfile (#803)
- add license checker (#804)
- enable timeout (#813)
- Fix website build on master branch (#819)