Megatron-LM
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
16,615 stars
4,043 forks
Python