Ongoing research on training transformer models at scale

16,615 stars
4,043 forks
Python