Megatron-LM
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
16,553 stars
4,023 forks
Python