ray-train

Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from laptop to 1000s of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, elastic scaling. Use when training massive models across multiple machines or running distributed hyperparameter sweeps.

$ インストール

git clone https://github.com/zechenzhangAGI/AI-research-SKILLs /tmp/AI-research-SKILLs && cp -r /tmp/AI-research-SKILLs/08-distributed-training/ray-train ~/.claude/skills/AI-research-SKILLs

// tip: Run this command in your terminal to install the skill