Pull requests: NVIDIA/TransformerEngine
Update list of CI users
testing
Improvements to tests or testing infrastructure
#1340
opened Nov 15, 2024 by
timmoon10
8 of 14 tasks
[Common] Moved framework agnostic THD kernels to common.
#1339
opened Nov 15, 2024 by
mgoldfarb-nvidia
8 of 13 tasks
Debug nightly docs
documentation
Improvements or additions to documentation
testing
Improvements to tests or testing infrastructure
#1338
opened Nov 15, 2024 by
timmoon10
4 of 13 tasks
[PyTorch] Store module extra state in tensor
bug
Something isn't working
#1335
opened Nov 15, 2024 by
timmoon10
8 of 13 tasks
[PyTorch] Integration test for Megatron-LM
1.13.0
bug
Something isn't working
#1329
opened Nov 13, 2024 by
timmoon10
9 of 14 tasks
[COMMON/JAX] Support sliding window on THD format
#1327
opened Nov 11, 2024 by
zlsh80826
6 of 13 tasks
TP communication overlap: enable the overlap between GEMM chunk at Ho…
#1311
opened Nov 4, 2024 by
erhoo82
1 of 13 tasks
Improving communication overlap for the case of multi kernel queue usage
#1308
opened Nov 2, 2024 by
youngeunkwon0405
13 tasks
[PyTorch] Add heuristics for initializing FP8 params
enhancement
New feature or request
#1300
opened Oct 30, 2024 by
timmoon10
8 of 13 tasks
[PyTorch] Fix get_swa_mask() for padding masks
#1281
opened Oct 21, 2024 by
cyanguwa
6 of 13 tasks
attention_mask fill with -inf for UnfusedDotProductAttention
#1268
opened Oct 18, 2024 by
Agoniii
1 of 13 tasks
Draft: reduce cudagraph mem via preallocations
#1253
opened Oct 15, 2024 by
JimmyZhang12
13 tasks
Save CUDA Graph memory by reusing input and output tensors
#1234
opened Oct 9, 2024 by
buptzyb
5 of 13 tasks