grail-miner

synapz-org/grail-miner-claude-skill

This skill should be used when setting up, managing, or optimizing Grail miners on Bittensor Subnet 81. Use it for GRAIL protocol tasks including miner setup, R2 storage configuration, model checkpoint management, GRPO rollout generation, performance optimization, competitive monitoring, and troubleshooting common issues like CUDA errors, upload failures, or low scores. Essential for miners working with verifiable post-training, SAT/GSM8K environments, or understanding the GRAIL incentive mechanism to improve competitiveness.

SKILL.md

name: grail-miner
description: This skill should be used when setting up, managing, or optimizing Grail miners on Bittensor Subnet 81. Use it for GRAIL protocol tasks including miner setup, R2 storage configuration, model checkpoint management, GRPO rollout generation, performance optimization, competitive monitoring, and troubleshooting common issues like CUDA errors, upload failures, or low scores. Essential for miners working with verifiable post-training, SAT/GSM8K environments, or understanding the GRAIL incentive mechanism to improve competitiveness.

Grail Miner Skill

Overview

Set up and operate Grail miners to participate in verifiable post-training for language models on Bittensor Subnet 81. Grail implements the GRAIL protocol (Guaranteed Rollout Authenticity via Inference Ledger) for cryptographically verifiable GRPO rollouts on SAT and GSM8K problems, with automatic model evolution through distributed training.

Key Innovation: Grail uses cryptographic proofs to bind rollouts to specific models and inputs, enabling decentralized post-training at internet scale with verifiable contributions and on-chain incentives.

Core Capabilities

1. MINER SETUP WORKFLOW

Prerequisites Check before starting:

OS-agnostic: Any platform (Linux/macOS/Windows) whose floating-point results stay within the verification tolerance
Python 3.11+ with uv package manager
Accelerators recommended (NVIDIA GPU for best throughput, but not required)
Bittensor wallet registered to subnet 81 (mainnet) or 429 (testnet)
Cloudflare R2 bucket (name must match account ID, region ENAM)
Dual R2 credentials: read-only (public, committed on-chain) + write (private, local only)
Optional: WandB account for monitoring

Quick Start (6-Phase Setup):

Clone and Install

git clone https://github.com/one-covenant/grail
cd grail
uv venv && source .venv/bin/activate
uv sync  # Reproducible install with lockfile

Generate Environment Configuration
```
./scripts/setup_miner_env.sh
```
- Interactive wizard for .env generation
- Collects network, wallet, R2 credentials
- Validates bucket configuration
- Creates production-ready .env file

Verify Setup
```
python scripts/check_miner_health.py
```
- Comprehensive health checks
- Validates R2 connectivity (read/write)
- Tests wallet registration
- Checks GPU availability
- Verifies drand beacon access

First Run (Test Mode)
```
grail -vv mine  # Verbose mode for debugging
```
- Commits read credentials on-chain (first run only)
- Downloads latest model checkpoint from R2
- Starts generating rollouts for current window

Monitor Performance
- View logs in terminal for immediate feedback
- Check W&B dashboard: https://wandb.ai/tplr/grail (if enabled)
- Monitor Grafana: https://grail-grafana.tplr.ai/

Production Deployment (Systemd)

sudo tee /etc/systemd/system/grail-miner.service > /dev/null << 'EOF'
[Unit]
Description=Grail Miner
After=network-online.target

[Service]
Type=simple
User=miner
WorkingDirectory=/home/miner/grail
Environment="PATH=/home/miner/grail/.venv/bin:/usr/bin:/bin"
ExecStart=/home/miner/grail/.venv/bin/grail mine
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable grail-miner
sudo systemctl start grail-miner
sudo journalctl -u grail-miner -f

2. R2 STORAGE CONFIGURATION (CRITICAL FOR SUCCESS)

The #1 Issue: Most miner problems trace back to R2 bucket setup and the dual-credential configuration.

Dual-Credential Architecture:

WRITE CREDENTIALS (Private)      READ CREDENTIALS (Public)
     ↓                                 ↓
Local .env only              Committed on-chain
Used for uploads             Allows validator fetches
Full read/write              Read-only access

Step-by-Step R2 Setup:

Create Cloudflare R2 Bucket
- Go to https://dash.cloudflare.com → R2
- Click "Create Bucket"
- CRITICAL: Bucket name MUST equal your Account ID
- Set region to ENAM (required)
- Get Account ID: Dashboard → Overview → Copy "Account ID"

Generate Write Credentials (Private)
- Go to R2 → "Manage R2 API Tokens"
- Click "Create API Token"
- Name: "grail-write-access"
- Permissions: Edit (full read/write)
- Scope: Select your bucket
- Copy both Access Key ID and Secret Access Key

Generate Read Credentials (Public)
- Create another API Token
- Name: "grail-read-only"
- Permissions: Read (read-only)
- Scope: Same bucket
- Copy both keys

Configure .env:

# Account & Bucket
R2_ACCOUNT_ID=abc123def456  # Your Cloudflare account ID
R2_BUCKET_ID=abc123def456   # MUST match account ID

# Write credentials (private, never shared)
R2_WRITE_ACCESS_KEY_ID=AKIAXXXXXXXXXXXXXXXX
R2_WRITE_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

# Read credentials (public, posted on-chain)
R2_READ_ACCESS_KEY_ID=AKIAXXXXXXXXXXXXXXXX
R2_READ_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
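
The two constraints above (bucket name equal to the account ID, and separate read and write key pairs) can be checked before the miner starts. The following is a minimal sketch, assuming the variable names from the .env example; `check_r2_env` and `r2_endpoint` are hypothetical helpers for illustration and are not part of Grail.

```python
# Variable names taken from the .env example above.
REQUIRED_KEYS = [
    "R2_ACCOUNT_ID",
    "R2_BUCKET_ID",
    "R2_WRITE_ACCESS_KEY_ID",
    "R2_WRITE_SECRET_ACCESS_KEY",
    "R2_READ_ACCESS_KEY_ID",
    "R2_READ_SECRET_ACCESS_KEY",
]


def check_r2_env(env):
    """Return a list of problems found in env; an empty list means it looks OK."""
    problems = [f"{key} is missing or empty" for key in REQUIRED_KEYS if not env.get(key)]

    account, bucket = env.get("R2_ACCOUNT_ID"), env.get("R2_BUCKET_ID")
    if account and bucket and account != bucket:
        problems.append("R2_BUCKET_ID must equal R2_ACCOUNT_ID")

    write_key = env.get("R2_WRITE_ACCESS_KEY_ID")
    read_key = env.get("R2_READ_ACCESS_KEY_ID")
    if write_key and write_key == read_key:
        problems.append("read and write credentials must be different key pairs")
    return problems


def r2_endpoint(account_id):
    """Build the S3-compatible endpoint URL used in the aws CLI examples."""
    return f"https://{account_id}.r2.cloudflarestorage.com"
```

Calling `check_r2_env(os.environ)` after loading the .env file gives a quick local sanity check. It does not replace `scripts/check_miner_health.py`, which tests real connectivity.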

Verify Connectivity

python scripts/check_miner_health.py
# Should show: ✅ R2 write access verified
#              ✅ R2 read access verified

How Validators Access Miner Data:

Miner commits read credentials to chain on first run
Validators fetch read credentials from metagraph
Validators download miner's window files from R2
Validators verify GRAIL proofs and score rollouts
Validators set weights based on successful rollouts

Common R2 Issues → See Troubleshooting section

3. MODEL CHECKPOINT MANAGEMENT

How Model Evolution Works:

Grail uses a hybrid approach where models start from a base and evolve through training:

Base Model: Qwen/Qwen2.5-7B-Instruct (initial checkpoint)
Window Checkpoints: Trainer uploads new checkpoint after each window
Automatic Loading: Miners download latest checkpoint at window start
R2 Storage: Checkpoints stored in R2 with retention policy
Milestone Checkpoints: Every 100 windows preserved permanently

Miner Checkpoint Workflow (grail/cli/mine.py:156-165):

# At start of each window
window_start = (current_block // WINDOW_LENGTH) * WINDOW_LENGTH
previous_window = window_start - WINDOW_LENGTH

# Download checkpoint from previous window
checkpoint_path = download_checkpoint(previous_window)
model = load_model(checkpoint_path)

# Generate rollouts with this checkpoint
# Upload rollouts to R2
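
The window arithmetic in the workflow above can be run on its own. This sketch assumes WINDOW_LENGTH = 50, the value quoted from grail/shared/constants.py later in this document; `window_bounds` is a hypothetical helper name.

```python
WINDOW_LENGTH = 50  # blocks per window, as listed under Constants in this document


def window_bounds(current_block, window_length=WINDOW_LENGTH):
    """Return (window_start, previous_window) for a given block height."""
    # Integer division floors the block height to the start of its window.
    window_start = (current_block // window_length) * window_length
    previous_window = window_start - window_length
    return window_start, previous_window


print(window_bounds(71_987))  # (71950, 71900)
```

A miner at block 71,987 is therefore working in window 71950 and loads the checkpoint produced for window 71900.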

Checkpoint Naming Convention:

checkpoints/
├── window-71950/           # Recent checkpoint
│   ├── model.safetensors
│   ├── config.json
│   └── tokenizer/
├── window-71900/           # Previous window
└── milestone-71800/        # Milestone (every 100)

Configuration (.env):

# Checkpoint retention (default: 10)
GRAIL_CHECKPOINT_RETENTION_LIMIT=10

# Milestone interval (default: 100 windows)
GRAIL_CHECKPOINT_MILESTONE_INTERVAL=100

# Local cache directory
GRAIL_CACHE_DIR=~/.cache/grail

Manual Checkpoint Operations:

# List available checkpoints
aws s3 ls s3://${R2_BUCKET_ID}/checkpoints/ \
  --endpoint-url https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com

# Download specific checkpoint
python -c "
from grail.infrastructure.comms import download_checkpoint
path = download_checkpoint(window=71950)
print(f'Downloaded to: {path}')
"

# Clear local cache
rm -rf ~/.cache/grail/checkpoints/*

Key Files:

Checkpoint download: grail/infrastructure/comms.py:download_checkpoint()
Model loading: grail/cli/mine.py:156-165
Trainer upload: grail/cli/train.py:upload_checkpoint()

4. GRPO ROLLOUT GENERATION & OPTIMIZATION

What is GRPO?

Group Relative Policy Optimization - a reinforcement learning algorithm that:

Generates multiple rollouts per problem (16 rollouts fixed)
Computes advantages relative to group mean
Optimizes policy using advantage-weighted gradients
Maintains KL divergence from reference model
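
The "advantages relative to group mean" step can be sketched as follows. This shows only the mean-baseline form described here (reward minus group mean); some GRPO implementations also divide by the group standard deviation, which is not shown because this document does not mention it. `group_advantages` is a hypothetical helper name.

```python
from statistics import mean


def group_advantages(rewards):
    """Advantage of each rollout = its reward minus the mean reward of its group."""
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


# A toy group of 4 rollouts (Grail uses 16 per problem).
print(group_advantages([1.0, 0.0, 0.0, 1.0]))  # [0.5, -0.5, -0.5, 0.5]
```

Rollouts that beat their siblings on the same problem get a positive advantage, and the advantages within a group always sum to zero.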

Rollout Generation Pipeline (grail/environments/loop.py:47-222):

# For each SAT/GSM8K problem:
1. Derive deterministic seed: sha256(block_hash + drand + nonce)
2. Generate problem instance from seed
3. Create GRPO batch (16 rollouts per problem)
4. Generate completions with logprob tracking
5. Parse solutions and compute rewards
6. Calculate advantages (reward - group_mean)
7. Create GRAIL proof (PRF-based commitment)
8. Sign rollout with hotkey
9. Package for upload
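
Step 1 of the pipeline can be illustrated with the standard library. This is a sketch under stated assumptions: the byte encoding (raw concatenation with an 8-byte big-endian nonce) and the name `derive_seed` are invented here for illustration, and the real encoding lives in the Grail source.

```python
import hashlib
import random


def derive_seed(block_hash: bytes, drand_randomness: bytes, nonce: int) -> int:
    """Hash the public inputs into an integer seed (assumed encoding, see lead-in)."""
    payload = block_hash + drand_randomness + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


# Placeholder inputs; real values come from the chain and the drand beacon.
seed = derive_seed(b"\x01" * 32, b"\x02" * 32, nonce=7)

# Anyone holding the same inputs rebuilds the same RNG stream,
# which is what lets a validator regenerate the miner's problem.
miner_rng = random.Random(seed)
validator_rng = random.Random(seed)
print(miner_rng.random() == validator_rng.random())  # True
```

Because the seed depends only on public values, a miner cannot choose easier problems without the mismatch being detectable.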

Reward Components (grail/environments/reward_components.py):

Total Reward = 0.7*correctness + 0.15*thinking + 0.1*answer + 0.05*no_trailing

- correctness (0.7): SAT solution validity or GSM8K answer correctness
- thinking (0.15): Presence of <start_working_out> tags
- answer (0.1): Presence of <SOLUTION> tags
- no_trailing (0.05): Penalty for text after </SOLUTION>

Performance Optimization:

Batch Size Tuning (.env):

# Number of rollouts to generate in parallel (default: 1)
# Must divide evenly into 16 (valid: 1, 2, 4, 8, 16)
# Higher values = more throughput but more VRAM

GRAIL_GENERATION_BATCH_SIZE=1   # Baseline (lowest memory)
GRAIL_GENERATION_BATCH_SIZE=4   # ~3-4x throughput (recommended for A100)
GRAIL_GENERATION_BATCH_SIZE=16  # ~10x throughput (H100/H200 144GB)
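
The divisibility rule above is easy to check before launching. A minimal sketch, assuming the 16-rollout group size stated in this document; `parse_batch_size` and `batches_per_problem` are hypothetical helpers, not Grail functions.

```python
ROLLOUTS_PER_PROBLEM = 16  # fixed group size stated in this document


def parse_batch_size(raw, rollouts=ROLLOUTS_PER_PROBLEM):
    """Validate a GRAIL_GENERATION_BATCH_SIZE value given as a string or int."""
    value = int(raw)
    if value < 1 or rollouts % value != 0:
        raise ValueError(
            f"batch size {value} must divide {rollouts} evenly (valid: 1, 2, 4, 8, 16)"
        )
    return value


def batches_per_problem(batch_size, rollouts=ROLLOUTS_PER_PROBLEM):
    """Number of sequential generation passes needed for one problem."""
    return rollouts // batch_size


print(batches_per_problem(parse_batch_size("4")))  # 4
```

Larger batch sizes cut the number of sequential passes per problem, which is where the throughput gain comes from, at the cost of holding more sequences in VRAM at once.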

Generation Parameters (hardcoded in constants):

Max new tokens: 1024
Rollouts per problem: 16
Temperature: 1.0 (for diversity)
Top-p: 0.95

Monitor Generation Performance:

# Watch real-time metrics
grail -vv mine

# Key metrics to watch:
# - Generation time per batch
# - Upload time per window
# - Rollout success rate
# - GPU memory usage (nvidia-smi)

Key Files:

Rollout generator: grail/mining/rollout_generator.py
Environment loop: grail/environments/loop.py
SAT environment: grail/environments/sat_env.py
GSM8K environment: grail/environments/gsm8k_env.py

5. COMPETITIVE MONITORING & SCORING

Understanding the Incentive Mechanism:

Validators score miners based on unique successful rollouts over recent windows using a superlinear curve:

# Scoring formula (grail/scoring/scorer.py)
for each miner:
    valid_rollouts = count_verified_rollouts(miner, window)
    unique_solutions = count_unique_correct_solutions(miner, window)

    # Superlinear reward curve
    raw_score = (unique_solutions ** 1.5) * valid_rollouts

    # Normalize across all miners
    weight = raw_score / sum(all_raw_scores)
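
A runnable version of the formula above makes the superlinear effect concrete. The sketch uses only the expression quoted in this section; the miner names and counts are made up, and `raw_score` and `normalize` are hypothetical helper names.

```python
def raw_score(unique_solutions, valid_rollouts, exponent=1.5):
    """Raw score as quoted above: (unique_solutions ** 1.5) * valid_rollouts."""
    return (unique_solutions ** exponent) * valid_rollouts


def normalize(raw_scores):
    """Turn raw scores into weights that sum to 1 (all zeros if nobody scored)."""
    total = sum(raw_scores.values())
    if total == 0:
        return {miner: 0.0 for miner in raw_scores}
    return {miner: score / total for miner, score in raw_scores.items()}


# Illustrative numbers only: (unique_solutions, valid_rollouts) per miner.
miners = {"miner_a": (4, 100), "miner_b": (16, 100)}
weights = normalize({m: raw_score(u, v) for m, (u, v) in miners.items()})
print(weights)
```

With equal volume, miner_b has 4x the unique solutions of miner_a but an 8x raw score (6400 versus 800), which is why diversity pays more than repeating the same correct answer.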

What Matters for High Scores:

Rollout Validity (GRAIL verification)
- Correct token-level proofs
- Valid signatures
- Proper commitment/opening

Solution Correctness (SAT/GSM8K)
- SAT: Assignments must satisfy all clauses
- GSM8K: Final answer must match ground truth

Solution Diversity
- Unique solutions earn more than duplicates
- Explore different solution paths

Volume
- More valid rollouts = higher weight
- Maximize throughput within window

Monitoring Your Competitiveness:

WandB Dashboard (https://wandb.ai/tplr/grail):

# Enable in .env
GRAIL_MONITORING_BACKEND=wandb
WANDB_API_KEY=your_key
WANDB_PROJECT=grail
WANDB_ENTITY=tplr  # Public project

# Metrics tracked:
# - rollout_count: Total rollouts generated
# - upload_success_rate: Upload reliability
# - generation_time_avg: Throughput metric
# - reward_mean: Average reward per rollout

Grafana Dashboard (https://grail-grafana.tplr.ai/):

Real-time logs from all miners
Network-wide statistics
Validator performance

On-Chain Weights (btcli):

# Check your current weight
btcli subnet metagraph --netuid 81 --subtensor.network finney | grep $(cat ~/.bittensor/wallets/default/hotkeys/miner/ss58_address.txt)

# Compare to top miners
btcli subnet metagraph --netuid 81 --subtensor.network finney | sort -k4 -rn | head -20

Performance Analysis:

# Analyze your rollouts locally.
# load_window_rollouts is not defined in this document; substitute whatever
# loader returns a window's rollouts as a list of dicts with 'valid',
# 'success', and 'solution' keys.
window_data = load_window_rollouts(window_start)

# Compute metrics
total = len(window_data)
valid_count = sum(1 for r in window_data if r['valid'])
success_count = sum(1 for r in window_data if r['success'])
unique_solutions = len(set(r['solution'] for r in window_data if r['success']))

print(f"Valid: {valid_count}/{total}")
print(f"Successful: {success_count}/{valid_count}")
print(f"Unique solutions: {unique_solutions}")

Improvement Strategies:

Increase Throughput
- Tune GRAIL_GENERATION_BATCH_SIZE
- Upgrade GPU (H100/H200 for 10x gains)
- Optimize upload timing

Improve Success Rate
- Monitor reward components
- Check model checkpoint version
- Verify problem difficulty range

Maximize Diversity
- Use higher temperature if allowed
- Generate across different problem seeds
- Explore varied reasoning paths

Key Files:

Scoring logic: grail/scoring/scorer.py
Window aggregation: grail/cli/validate.py:compute_window_scores()
Metrics tracking: grail/shared/logging.py

6. TROUBLESHOOTING COMMON ISSUES

CUDA / GPU Errors

Symptom: CUDA out of memory or GPU not detected

RuntimeError: CUDA out of memory. Tried to allocate X.XX GiB

Solutions:

Reduce batch size:
```
export GRAIL_GENERATION_BATCH_SIZE=1
```
Clear GPU cache periodically (miner does this automatically):
```
import torch
torch.cuda.empty_cache()
```

Check GPU availability:

nvidia-smi
python -c "import torch; print(torch.cuda.is_available())"

Verify CUDA compatibility:

nvidia-smi | grep "CUDA Version"
# Should be >= 12.0 for best performance

Note: Grail is OS and hardware-agnostic - GPU is recommended for throughput but not required.

R2 Upload Failures

Symptom: Upload errors or "No uploads" warnings

ERROR: Failed to upload window rollouts to R2
ERROR: Credentials invalid or bucket not found

Solutions:

Verify credentials:

python scripts/check_miner_health.py
# Should show ✅ for both read and write access

Check bucket configuration:

# Bucket name MUST equal account ID
echo "Account: $R2_ACCOUNT_ID"
echo "Bucket: $R2_BUCKET_ID"
# These should match!

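Test bucket access with your write credentials (this lists the bucket; it assumes an AWS CLI profile named grail-write that holds them):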

aws s3 ls s3://${R2_BUCKET_ID}/ \
  --endpoint-url https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com \
  --profile grail-write

Verify region is ENAM:
- Go to Cloudflare dashboard → R2 → Click bucket
- Region should show "Eastern North America (ENAM)"

Low Scores / No Weights

Symptom: Not receiving weights from validators

INFO: Window complete, 0 successful rollouts
WARNING: No weights received for 3+ windows

Diagnostic Steps:

Check rollout validity:

# Enable verbose logging
grail -vv mine

# Look for:
# ✅ GRAIL proof valid
# ✅ Signature verified
# ✅ Solution correct

Verify uploads succeeded:

# List your window files on R2
aws s3 ls s3://${R2_BUCKET_ID}/windows/ \
  --endpoint-url https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com

# Should see: {hotkey}-window-{block}.json

Check read credentials on-chain:

# Validators need your read credentials
btcli subnet metagraph --netuid 81 | grep $(cat ~/.bittensor/wallets/default/hotkeys/miner/ss58_address.txt)

# Should show your endpoint and committed credentials

Monitor validator logs (Grafana):
- Visit https://grail-grafana.tplr.ai/
- Search for your hotkey
- Check for verification errors

Compare to checkpoint version:

# Ensure you're using latest checkpoint
ls -lh ~/.cache/grail/checkpoints/
# Should show recent window number

Common Causes:

Read credentials not committed (first run required)
Bucket name ≠ account ID
Wrong region (must be ENAM)
Model checkpoint too old
GRAIL proof failures
Low throughput (not generating enough rollouts)

Drand Beacon Failures

Symptom: Cannot fetch randomness beacon

WARNING: Drand fetch failed, falling back to block hash
ERROR: All drand endpoints unreachable

Solutions:

Miner automatically falls back to block-hash only (safe)

Test drand connectivity:

python -c "
from grail.infrastructure.drand import get_drand_beacon
beacon = get_drand_beacon()
print(f'Beacon: {beacon}')
"

Use explicit fallback mode:
```
grail mine --no-drand
```

Check firewall rules (drand uses HTTPS):

curl -I https://api.drand.sh/public/latest

Note: Block-hash fallback is safe and deterministic - validators use same seed derivation.

Wallet / Registration Issues

Symptom: Wallet not found or not registered

ERROR: Wallet 'default/miner' not found
ERROR: Hotkey not registered on subnet 81

Solutions:

Verify wallet exists:

ls ~/.bittensor/wallets/
# Should show your coldkey name

ls ~/.bittensor/wallets/default/hotkeys/
# Should show your hotkey name

Check registration:

btcli wallet overview --wallet.name default --wallet.hotkey miner
# Should show registration on subnet 81

# If the hotkey is not registered yet, register it:
btcli subnet register \
  --wallet.name default \
  --wallet.hotkey miner \
  --netuid 81 \
  --subtensor.network finney

Verify .env matches wallet names:

grep WALLET .env
# BT_WALLET_COLD=default
# BT_WALLET_HOT=miner

Protocol Deep Dive

GRAIL Cryptographic Proof (grail/protocol/):

1. Challenge Derivation:
   seed = sha256(drand_randomness || block_hash || window_context)

2. PRF-Based Commitment:
   For each token t:
     - Generate random vector r_t = PRF(seed, position)
     - Compute sketch commitment: s_t = dot(token_vec, r_t) mod PRIME_Q

3. Verifier Challenge:
   - Validator samples K=16 random positions
   - Requests token IDs and proofs at those positions

4. Verification:
   - Recompute r_t from seed and position
   - Check: s_t == dot(token_vec, r_t) mod PRIME_Q
   - Verify signatures bind to hotkey
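
Steps 2 and 4 can be sketched in a few lines. This is an illustration only, not the GRAIL implementation: the PRF here is SHA-256 in counter mode, the token vector is a short list of integers, and the names `prf_vector`, `sketch`, and `verify` are hypothetical. Only PRIME_Q and the dot-product-mod-q structure come from this document.

```python
import hashlib

PRIME_Q = 2_147_483_647  # modulus listed under Constants in this document


def prf_vector(seed: bytes, position: int, dim: int) -> list[int]:
    """Assumed PRF: hash (seed, position, index) and reduce mod PRIME_Q."""
    out = []
    for i in range(dim):
        block = hashlib.sha256(
            seed + position.to_bytes(4, "big") + i.to_bytes(4, "big")
        ).digest()
        out.append(int.from_bytes(block[:8], "big") % PRIME_Q)
    return out


def sketch(token_vec: list[int], seed: bytes, position: int) -> int:
    """Commitment s_t = dot(token_vec, r_t) mod PRIME_Q."""
    r_t = prf_vector(seed, position, len(token_vec))
    return sum(x * r for x, r in zip(token_vec, r_t)) % PRIME_Q


def verify(token_vec: list[int], seed: bytes, position: int, claimed: int) -> bool:
    """Verifier recomputes r_t from the seed and compares sketches."""
    return sketch(token_vec, seed, position) == claimed


commitment = sketch([3, -1, 4], b"window-seed", position=5)
print(verify([3, -1, 4], b"window-seed", 5, commitment))  # True
```

The verifier needs only the seed and the position to rebuild r_t, so it can spot-check a handful of positions (K=16 in this document) rather than recomputing the whole rollout.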

SAT Problem Determinism (grail/environments/sat_env.py):

# Deterministic generation from seed
import random

def generate_sat_problem(seed: int, difficulty: int):
    rng = random.Random(seed)  # Same seed always yields the same problem
    n_vars = 3 + difficulty  # 3-10 variables for difficulty 0-7
    n_clauses = 5 + difficulty * 2  # 5-19 clauses for difficulty 0-7

    clauses = []
    for _ in range(n_clauses):
        # Pick 3 distinct variables, then negate each with probability ~0.5
        clause = rng.sample(range(1, n_vars+1), k=3)
        clause = [v if rng.random() > 0.5 else -v for v in clause]
        clauses.append(clause)

    return clauses
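
Checking a SAT solution against clauses in this signed-integer form is straightforward. The sketch below assumes the same encoding as the generator (literal v means variable v is true, -v means it is false); `satisfies` is a hypothetical helper, not the function used in sat_env.py.

```python
def satisfies(clauses, assignment):
    """Return True if every clause has at least one literal made true.

    assignment maps each 1-based variable index to a bool.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


clauses = [[1, -2, 3], [-1, 2, 3]]
print(satisfies(clauses, {1: True, 2: True, 3: False}))   # True
print(satisfies(clauses, {1: True, 2: False, 3: False}))  # False
```

The second assignment fails because the clause [-1, 2, 3] needs variable 1 false, variable 2 true, or variable 3 true, and none of those hold.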

Reward Calculation (grail/environments/reward_components.py:64-116):

# Multi-component reward vector
def compute_reward(completion: str, problem: Problem):
    parsed = parse_completion(completion)

    # Component rewards
    r_correctness = check_solution(parsed.solution, problem)  # 0.7 weight
    r_thinking = 0.5 if has_thinking_tags(parsed) else 0.0   # 0.15 weight
    r_answer = 0.3 if has_solution_tags(parsed) else 0.0     # 0.1 weight
    r_concise = max(0, 0.2 - 0.001*trailing_chars(parsed))   # 0.05 weight

    total = (0.7*r_correctness + 0.15*r_thinking +
             0.1*r_answer + 0.05*r_concise)
    return total  # At most 0.815 with the component caps above, assuming r_correctness is in [0, 1]

Key Configuration Reference

Critical Environment Variables (.env):

# Network
BT_NETWORK=finney              # mainnet (or 'test' for testnet)
NETUID=81                      # Grail subnet

# Wallet
BT_WALLET_COLD=default         # Your coldkey name
BT_WALLET_HOT=miner            # Your hotkey name

# R2 Storage (CRITICAL: bucket name = account ID, region = ENAM)
R2_ACCOUNT_ID=abc123           # Cloudflare account ID
R2_BUCKET_ID=abc123            # MUST match account ID
R2_WRITE_ACCESS_KEY_ID=...     # Private write credentials
R2_WRITE_SECRET_ACCESS_KEY=...
R2_READ_ACCESS_KEY_ID=...      # Public read credentials (on-chain)
R2_READ_SECRET_ACCESS_KEY=...

# Performance
GRAIL_GENERATION_BATCH_SIZE=4  # Parallel rollouts (1/2/4/8/16)

# Monitoring (Optional)
GRAIL_MONITORING_BACKEND=wandb
WANDB_API_KEY=...
WANDB_PROJECT=grail
WANDB_ENTITY=tplr              # Public project

Constants (grail/shared/constants.py):

WINDOW_LENGTH = 50              # Blocks per scoring window
BLOCK_TIME_SECONDS = 12         # Target block time
ROLLOUTS_PER_PROBLEM = 16       # Fixed rollouts per problem
CHALLENGE_K = 16                # Positions verified per rollout
PRIME_Q = 2_147_483_647        # Modulus for sketch commitments
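
These constants fix the time budget for each window. The arithmetic below is a rough sketch: the 30 seconds per problem is an invented example figure, and `rollouts_per_window` is a hypothetical helper that ignores checkpoint download and upload time.

```python
WINDOW_LENGTH = 50        # blocks per scoring window
BLOCK_TIME_SECONDS = 12   # target block time
ROLLOUTS_PER_PROBLEM = 16

# 50 blocks * 12 s = 600 s, so one window lasts about 10 minutes.
window_seconds = WINDOW_LENGTH * BLOCK_TIME_SECONDS


def rollouts_per_window(seconds_per_problem):
    """Upper bound on rollouts if each problem (16 rollouts) takes this long."""
    problems = int(window_seconds // seconds_per_problem)
    return problems * ROLLOUTS_PER_PROBLEM


print(window_seconds)             # 600
print(rollouts_per_window(30.0))  # 320
```

Measuring your own seconds per problem at each batch size and plugging it in gives a quick estimate of how much volume a tuning change buys per window.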

Resources

scripts/

setup_miner_env.sh - Interactive .env generation wizard
check_miner_health.py - Comprehensive health check script

references/

grail_protocol.md - Deep dive into GRAIL cryptographic protocol
incentive_mechanism.md - Detailed scoring and weight computation
environments.md - SAT and GSM8K environment specifications
performance_tuning.md - Advanced optimization strategies

External Resources

Covenant AI: https://www.covenant.ai (Grail's parent company)
Discord Community: https://discord.gg/GyzhzRWJBQ (support and discussions)
GitHub Repository: https://github.com/one-covenant/grail
Miner Docs: https://github.com/one-covenant/grail/blob/main/docs/miner.md
Validator Docs: https://github.com/one-covenant/grail/blob/main/docs/validator.md
W&B Dashboard: https://wandb.ai/tplr/grail (public metrics)
Grafana Logs: https://grail-grafana.tplr.ai/ (real-time monitoring)

README

Grail Miner Claude Skill

Expert guidance for setting up, managing, and optimizing Grail miners on Bittensor Subnet 81.

What is Grail?

Grail is a Bittensor subnet implementing verifiable post-training for language models. It uses the GRAIL protocol (Guaranteed Rollout Authenticity via Inference Ledger) to cryptographically bind GRPO rollouts to specific models and inputs, enabling decentralized training at internet scale with on-chain incentives.

Key Features:

🔐 Cryptographic Verification: PRF-based commitments prove rollout authenticity
🤖 Model Evolution: Automatic checkpoint updates through distributed training
🎯 Multiple Environments: SAT and GSM8K problem solving
📊 Superlinear Rewards: Incentivizes solution diversity over volume
⚡ OS-Agnostic: Any platform with floating point precision within tolerance

What This Skill Provides

This Claude Code skill gives you expert capabilities for:

1. Complete Miner Setup

Interactive environment configuration wizard
Comprehensive health check script
Step-by-step setup guidance with best practices
Production deployment with systemd

2. R2 Storage Mastery

Dual-credential architecture (private write + public read)
Bucket configuration and validation
Troubleshooting upload failures
Cost optimization strategies

3. Model Checkpoint Management

Understanding checkpoint evolution
Automatic loading and caching
Manual operations and cleanup
Version tracking

4. Performance Optimization

Batch size tuning for maximum throughput
GPU memory optimization
Window timing strategies
Cost-effective configurations

5. Competitive Monitoring

Understanding the incentive mechanism
Tracking your scores and weights
WandB and Grafana dashboard usage
Improvement strategies

6. Expert Troubleshooting

CUDA/GPU errors
R2 upload failures
Low scores and no weights
Drand beacon issues
Wallet and registration problems

Installation

Prerequisites

Claude Code installed and configured
Python 3.11+ with uv package manager (for running scripts)

Installing the Skill

Option 1: From ZIP file

# In Claude Code
/skills install grail-miner-claude-skill.zip

Option 2: From directory

# Copy to your skills directory
cp -r grail-miner-claude-skill ~/.claude/skills/

# Or in Claude Code
/skills add /path/to/grail-miner-claude-skill

Verify Installation

# In Claude Code
/skills list

# You should see: grail-miner

Usage Examples

Setting Up a New Miner

You: Help me set up a Grail miner for subnet 81

Claude: I'll guide you through setting up your Grail miner. Let me start by using the
setup wizard to generate your .env configuration...

[Walks through R2 bucket creation, credential generation, wallet configuration]

Now let's run the health check to verify everything is configured correctly...

[Runs health check and diagnoses any issues]

Your setup looks good! Here's how to start mining...

Troubleshooting Upload Failures

You: My miner isn't uploading rollouts to R2. Getting "credentials invalid" errors.

Claude: Let me help diagnose your R2 upload issue. This is typically caused by one of these:

1. Bucket name not matching account ID
2. Wrong credentials or permissions
3. Incorrect region (must be ENAM)

Let's check your configuration:

[Runs through diagnostic steps]

I found the issue - your bucket is in WNAM but should be ENAM. Here's how to fix it...

Optimizing Performance

You: How can I increase my miner's throughput? I have an H100 80GB GPU.

Claude: Great! With an H100 80GB, you can significantly increase throughput by tuning your
batch size. Let me help you optimize:

Current setting: GRAIL_GENERATION_BATCH_SIZE=1 (baseline)
Optimal for H100 80GB: GRAIL_GENERATION_BATCH_SIZE=8

This will give you approximately 6-7x throughput improvement...

[Provides step-by-step tuning guide with monitoring]

Understanding Low Scores

You: Why am I not receiving weights? My miner has been running for 3 windows.

Claude: Let me help diagnose why you're not receiving weights. Let's check several things:

1. Are your rollouts being uploaded successfully?
2. Are your read credentials committed on-chain?
3. Are your rollouts passing GRAIL verification?
4. Is your model checkpoint up to date?

[Walks through diagnostic checklist using logs and on-chain data]

I found the issue - your read credentials haven't been committed on-chain yet...

Skill Structure

grail-miner-claude-skill/
├── SKILL.md                 # Main skill definition and workflows
├── README.md                # This file
├── scripts/
│   ├── setup_miner_env.sh   # Interactive .env generation wizard
│   └── check_miner_health.py # Comprehensive health check
└── references/
    ├── grail_protocol.md            # Deep dive into GRAIL cryptography
    ├── incentive_mechanism.md       # Scoring and rewards explained
    ├── environments.md              # SAT and GSM8K specifications
    └── performance_tuning.md        # Advanced optimization strategies

Skill Capabilities

This skill is automatically triggered when you ask about:

Setup: "set up grail miner", "configure R2 storage", "create .env"
Troubleshooting: "upload failures", "CUDA errors", "low scores", "no weights"
Optimization: "increase throughput", "batch size tuning", "GPU optimization"
Monitoring: "check my weight", "view scores", "WandB dashboard"
Protocol: "GRAIL verification", "how scoring works", "incentive mechanism"
Checkpoints: "model evolution", "checkpoint management", "cache cleanup"

Scripts Reference

setup_miner_env.sh

Interactive wizard for generating production-ready .env files.

Usage:

cd /path/to/grail
/path/to/scripts/setup_miner_env.sh

Features:

Network and wallet configuration
R2 bucket and dual-credential setup
Performance tuning (batch size)
WandB monitoring configuration
Validates inputs and provides warnings

check_miner_health.py

Comprehensive health check for miner setup.

Usage:

cd /path/to/grail
python /path/to/scripts/check_miner_health.py

Checks:

✅ Environment variables
✅ Wallet existence and registration
✅ R2 write and read access
✅ GPU availability
✅ Drand beacon connectivity
✅ Python dependencies

Output: Color-coded pass/fail summary with diagnostic details

External Resources

Covenant AI: https://www.covenant.ai (Grail's parent company)
Discord Community: https://discord.gg/GyzhzRWJBQ (support and discussions)
Grail GitHub: https://github.com/one-covenant/grail
Miner Documentation: https://github.com/one-covenant/grail/blob/main/docs/miner.md
W&B Dashboard: https://wandb.ai/tplr/grail (public metrics)
Grafana Logs: https://grail-grafana.tplr.ai/ (real-time monitoring)

Contributing

Found an issue or have a suggestion? Please report it on the Grail GitHub issues page or reach out on Discord.

License

This skill is provided as-is for use with Claude Code. The Grail project itself is open source - see the GitHub repository for license details.

Version: 1.0.0
Compatible With: Grail subnet 81 (mainnet), 429 (testnet)
Last Updated: October 2025

Installation

Option 1: Use slash command in Claude Code

/install-skill https://github.com/synapz-org/grail-miner-claude-skill

Option 2: Clone to skills directory

# Global (all projects)

git clone https://github.com/synapz-org/grail-miner-claude-skill ~/.claude/skills/grail-miner-claude-skill

# Project-specific

git clone https://github.com/synapz-org/grail-miner-claude-skill .claude/skills/grail-miner-claude-skill

Add MCP server to .cursor/mcp.json:

{
  "mcpServers": {
    "skillz": {
      "command": "npx",
      "args": ["-y", "skillz-mcp", "https://github.com/synapz-org/grail-miner-claude-skill"]
    }
  }
}

Restart Cursor after adding the configuration.

Option 1: Use Gemini CLI command

gemini extensions install https://github.com/synapz-org/grail-miner-claude-skill

Option 2: Clone to extensions directory

git clone https://github.com/synapz-org/grail-miner-claude-skill ~/.gemini/extensions/grail-miner-claude-skill
