dialectical-loop

chipfox/dialetic-agents

Run a bounded adversarial cooperation coding loop (Architect -> Player <-> Coach). Use when implementing features from REQUIREMENTS.md, generating SPECIFICATION.md plans, and iterating with strict review until approved.


SKILL.md


name: dialectical-loop
description: Run a bounded adversarial cooperation coding loop (Architect -> Player <-> Coach). Use when implementing features from REQUIREMENTS.md, generating SPECIFICATION.md plans, and iterating with strict review until approved.
version: 1.0.3
tags: [coding, multi-agent, code-review, automation, dialectical, python, llm, workflow, github-copilot]
author: chipfox
repository: https://github.com/chipfox/dialetic-agents
license: MIT

Dialectical Loop

Run a bounded, adversarial coding workflow rather than single-turn “vibe coding”.

Core idea

  • Architect turns high-level intent (REQUIREMENTS.md) into an actionable contract (SPECIFICATION.md).
  • Player implements the specification and runs verification commands.
  • Coach adversarially evaluates compliance and blocks approval until the implementation is correct.

This keeps attention bounded, forces explicit plans, and adds a strict review gate.
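
A minimal sketch of the control flow, with hypothetical run_architect / run_player / run_coach stand-ins for the real agent calls (the actual orchestrator is scripts/dialectical_loop.py and differs in detail):

from pathlib import Path

def run_architect(requirements: str) -> str:
    """Stand-in: turn high-level requirements into a specification."""
    return "# SPECIFICATION\n\n" + requirements

def run_player(spec: str, feedback: str) -> None:
    """Stand-in: edit code per the spec (and prior Coach feedback), run verification."""

def run_coach(spec: str) -> tuple[bool, str]:
    """Stand-in: adversarial review; returns (approved, feedback)."""
    return True, ""

def dialectical_loop(max_turns: int) -> bool:
    spec = run_architect(Path("REQUIREMENTS.md").read_text())  # intent -> contract
    feedback = ""
    for _turn in range(max_turns):            # bounded: never loop forever
        run_player(spec, feedback)            # implement + verify
        approved, feedback = run_coach(spec)  # strict review gate
        if approved:
            return True
    return False                              # budget exhausted without approval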

Inputs / outputs

  • Input: REQUIREMENTS.md (recommended)
  • Generated/optional input: SPECIFICATION.md
  • Output: code edits + command outputs per turn

Agent prompts (no manual install)

This skill ships its role prompts inside the skill folder:

  • agents/architect.md
  • agents/player.md
  • agents/coach.md

The orchestrator loads these files automatically at runtime. You do not need to “install agents” separately or configure them in OpenSkills.
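
In practice this just means reading files that sit next to the script; a sketch of the idea (the real loading code in scripts/dialectical_loop.py may differ):

from pathlib import Path

SKILL_ROOT = Path(__file__).resolve().parent.parent   # scripts/ -> skill folder
PROMPTS = {
    role: (SKILL_ROOT / "agents" / f"{role}.md").read_text(encoding="utf-8")
    for role in ("architect", "player", "coach")
}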

Key Features

  • Bounded iteration: --max-turns prevents runaway token usage
  • Fast-fail optimization: Skip Coach if verification fails (save tokens)
  • Auto-context switching: Full snapshot turn 1 → git-changed thereafter
  • Token-optimized: --lean-mode activates all savings (fast-fail, coach-focus-recent, auto-fix)
  • Production observability: JSON logs with token estimates, loop health metrics, inter-agent communication tracking

Prerequisites

  • Python 3.10+
  • GitHub CLI authenticated (gh auth login or GITHUB_TOKEN set)
  • GitHub Copilot CLI available as copilot command

Essential Options

  • --max-turns N — Bound the loop (required)

  • --lean-mode — Activate all token optimizations (recommended)

  • --verbose — Detailed debug output

  • --quiet — Minimal output (final summary only)

  • --skip-architect — Use existing SPECIFICATION.md

  • --architect-model, --player-model, --coach-model — Override defaults

Recommended Models (GitHub Copilot CLI)

Tier 1 (Balanced): claude-sonnet-4.5 for all roles (~38 units/5 turns)
Tier 2 (Budget): gemini-3-pro-preview Architect, claude-haiku-4.5 Player, claude-sonnet-4.5 Coach (~26 units)
Tier 3 (Premium): claude-opus-4.5 Architect/Coach, claude-sonnet-4.5 Player (~79 units)

Observability

Automatic JSON logs (dialectical-loop-TIMESTAMP.json) include:

  • Per-turn events (agent, model, tokens, duration)
  • Loop health metrics (zero-edit streaks, fast-fail spirals)
  • Inter-agent communication tracking (feedback coverage, error persistence)
  • Real-time warnings for stuck patterns

Usage Examples


# Basic usage
python ~/.claude/skills/dialectical-loop/scripts/dialectical_loop.py --max-turns 10

# Token-optimized
python ~/.claude/skills/dialectical-loop/scripts/dialectical_loop.py --max-turns 10 --lean-mode

# With existing spec
python ~/.claude/skills/dialectical-loop/scripts/dialectical_loop.py --skip-architect --max-turns 5

Complete Documentation

See README.md for:

  • Installation instructions
  • Complete options reference
  • Model selection guide (3 tiers with cost analysis)
  • Token-saving strategies
  • Dynamic spec pruning
  • Verification configuration
  • Troubleshooting
  • Cross-platform shell configuration

README

dialectical-loop (OpenSkills)

An adversarial cooperation coding loop (Architect → Player ↔ Coach) inspired by the Dialectical Autocoding workflow. This repo is structured as an installable OpenSkills skill.

What you get

  • Architect generates/refreshes a detailed SPECIFICATION.md (when missing)
  • Player implements changes and runs commands/tests
  • Coach reviews for strict compliance and rejects until complete
  • Bounded turns via --max-turns

Prerequisites

  • Python 3.10+ (3.12 works)

Default provider: GitHub Copilot CLI

This repo’s orchestrator is implemented against the GitHub Copilot CLI:

  • GitHub CLI authenticated: gh auth login (or GITHUB_TOKEN set)
  • GitHub Copilot CLI available as copilot (your environment must be able to run it)

Agent prompts (built-in)

You do not install “agents” manually. The role prompts are included in this repo and are loaded automatically by the script:

  • agents/architect.md
  • agents/player.md
  • agents/coach.md

Install (OpenSkills)

From your own terminal after you publish this repo:

  • openskills install <your-github-url>
  • Confirm: openskills list
  • Read into your agent: openskills read dialectical-loop

Note: openskills primarily prints skill instructions for an agent to follow; it does not provide a universal openskills run command.

Run the loop

In the project directory you want to modify:

  1. Create REQUIREMENTS.md (or provide an existing SPECIFICATION.md).
  2. Run the installed script:
  • macOS/Linux example:
    • python ~/.claude/skills/dialectical-loop/scripts/dialectical_loop.py --max-turns 10
  • Windows example:
    • python %USERPROFILE%/.claude/skills/dialectical-loop/scripts/dialectical_loop.py --max-turns 10

Useful flags

  • --requirements-file REQUIREMENTS.md
  • --spec-file SPECIFICATION.md
  • --skip-architect
  • --quiet (minimal output)
  • --verbose (debug output)
  • --command-shell {auto,powershell,cmd,wsl} (Windows: disambiguates PowerShell vs. bash commands; auto prefers PowerShell and uses WSL for Unix-like commands)

Token-saving flags

  • --lean-mode: Recommended. Activates all token-saving features (--fast-fail, --coach-focus-recent, --auto-fix, --context-mode auto).
  • --context-mode {auto,snapshot,git-changed}
    • auto (default): snapshot on turn 1, then only git-changed files
  • --context-max-bytes N, --context-max-file-bytes N, --context-max-files N
  • --coach-focus-recent: Restrict Coach context to only files edited in the current turn (saves tokens).
  • --fast-fail: Skip Coach review if verification commands fail (saves tokens/time); see the sketch after this list.
  • --auto-fix: Automatically run npm run lint -- --fix (or similar) if available after Player edits.
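
To make the savings concrete, here is an illustrative sketch of the per-turn gating that --fast-fail and --context-mode auto imply (hypothetical helper names; the actual logic lives in scripts/dialectical_loop.py):

import subprocess

def changed_files() -> list[str]:
    """git-changed context: only files touched since the last commit."""
    out = subprocess.run(["git", "diff", "--name-only", "HEAD"],
                         capture_output=True, text=True, check=True)
    return out.stdout.splitlines()

def build_context(turn: int, snapshot: list[str], mode: str = "auto") -> list[str]:
    """--context-mode auto: full snapshot on turn 1, git-changed afterwards."""
    if mode == "snapshot" or (mode == "auto" and turn == 1):
        return snapshot
    return changed_files()

def should_call_coach(verification_ok: bool, fast_fail: bool) -> bool:
    """--fast-fail: skip the Coach call (and its tokens) when verification failed."""
    return verification_ok or not fast_fail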

Dynamic Spec Pruning

The Player agent is instructed to automatically remove completed sections from SPECIFICATION.md (or mark them [DONE]) as it progresses. This keeps the context window small and prevents re-reading completed instructions.
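
For example, a spec midway through a run might look like this (illustrative):

# SPECIFICATION
## 1. Data model [DONE]
## 2. Parser [DONE]
## 3. CLI flags        <- only the remaining sections are re-read each turn
## 4. Tests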

Verification flags

  • --verify-cmd "<command>" (repeatable; see the example after this list)
  • --no-auto-verify
  • Automatic LSP: The script auto-detects npm run build, npm run typecheck, or tsc to provide type errors.
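
For example, to replace auto-detection with explicit checks (the commands are placeholders for whatever your project uses):

python ~/.claude/skills/dialectical-loop/scripts/dialectical_loop.py --max-turns 10 \
  --no-auto-verify \
  --verify-cmd "npm run typecheck" \
  --verify-cmd "npm test"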

Model selection (Architect, Player, Coach)

This skill works with any Copilot CLI model. Choose a tier based on your budget and quality needs:

Tier 1: Recommended (Balanced)

--architect-model claude-sonnet-4.5 \
--player-model gemini-3-pro-preview \
--coach-model claude-sonnet-4.5

Cost: ~38 units per 5-turn loop. Stable, proven, good reasoning.

Tier 2: Budget (Cost-optimized)

--architect-model gemini-3-pro-preview \
--player-model claude-haiku-4.5 \
--coach-model claude-sonnet-4.5

Cost: ~26 units (32% cheaper). Good for rapid iteration; test Haiku on your codebase first.

Tier 3: Premium (Quality-optimized)

--architect-model claude-opus-4.5 \
--player-model claude-sonnet-4.5 \
--coach-model claude-opus-4.5

Cost: ~79 units (2.1x Tier 1). Use for mission-critical systems; Opus 4.5 is in preview, so test it first.

Why these choices?

  • Architect needs strong reasoning and large context (spec generation is cognitively demanding)
  • Player focuses on code generation; Haiku is fast & cheap for most code tasks
  • Coach must be a credible critic; keep it strong (reasoning parity with the Architect)

Observability: Know What Your Loop Is Doing

This skill includes built-in observability to prevent token waste and give you confidence the loop is functioning correctly.

How observability works

Every time you run the script, it automatically writes a JSON observability log to your project directory (filename like dialectical-loop-20251213-143115.json). This log captures:

  • Per-turn breakdown: agent (Architect/Player/Coach), model used, action, tokens (estimated), outcome, duration.
  • Summary stats: total turns, total tokens, approval/rejection counts, any errors.
  • Alerts: if a loop gets stuck rejecting, or tokens are unexpectedly high.

Output modes

# Quiet: minimal output (only summary + log file path)
python ~/.claude/skills/dialectical-loop/scripts/dialectical_loop.py --max-turns 10 --quiet

# Normal (default): one-line-per-turn feedback + summary
python ~/.claude/skills/dialectical-loop/scripts/dialectical_loop.py --max-turns 10

# Verbose: detailed debugging output (includes snippets)
python ~/.claude/skills/dialectical-loop/scripts/dialectical_loop.py --max-turns 10 --verbose

Interpreting logs

Open the generated JSON file to diagnose loop health:

{
  "summary": {
    "total_turns_executed": 3,
    "total_tokens_estimated": 12500,
    "coach_calls": {
      "approved": 1,
      "rejected": 1
    }
  }
}

What to watch for:

  • High token count: May indicate spec is too verbose; try refining REQUIREMENTS.md.
  • Many Coach rejections: Player is misunderstanding the spec; check SPECIFICATION.md clarity.
  • Errors in logs: Review stderr output + the error field in the JSON.
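
To pull these numbers out programmatically, a few lines of Python suffice (field names as in the example above):

import json
from pathlib import Path

# Pick the newest observability log in the current directory.
log = max(Path(".").glob("dialectical-loop-*.json"), key=lambda p: p.stat().st_mtime)
summary = json.loads(log.read_text())["summary"]
print(f"turns={summary['total_turns_executed']}, "
      f"tokens~{summary['total_tokens_estimated']}, "
      f"coach_calls={summary['coach_calls']}")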

Using other LLM providers (OpenAI, Anthropic, Gemini, Azure, Bedrock, local)

This skill does not require any non-GitHub API keys by default, because it does not call Claude/OpenAI/Gemini APIs directly.

If you want to run the same dialectical loop against another provider, you’ll need to modify the backend used to obtain LLM responses in scripts/dialectical_loop.py (the get_llm_response(...) function). Once you do that, credentials become provider-specific:

  • OpenAI / OpenRouter / compatible: typically OPENAI_API_KEY (and possibly a custom base URL).
  • Anthropic (Claude API): typically ANTHROPIC_API_KEY.
  • Google Gemini: commonly GOOGLE_API_KEY (or Application Default Credentials depending on your setup).
  • Azure OpenAI: typically AZURE_OPENAI_API_KEY plus an endpoint and deployment name.
  • AWS Bedrock: AWS credentials/IAM (environment variables or configured profiles).
  • Local models (Ollama/LM Studio): usually no API key, but a local server must be running.

The --player-model / --coach-model / --architect-model flags are currently passed to the Copilot CLI as-is; if you switch providers, you can reinterpret those flags in your backend implementation.
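
As a rough illustration, an OpenAI-compatible backend for get_llm_response could look like the following (a sketch only: the real function's signature and plumbing in scripts/dialectical_loop.py may differ, the OPENAI_BASE_URL override is a common convention rather than something the script reads today, and error handling is omitted):

import json
import os
import urllib.request

def get_llm_response(prompt: str, model: str) -> str:
    """Sketch: call an OpenAI-compatible chat endpoint instead of the Copilot CLI."""
    base = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com")
    req = urllib.request.Request(
        base + "/v1/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]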

Repo layout

  • SKILL.md — OpenSkills entrypoint
  • scripts/dialectical_loop.py — orchestrator
  • agents/*.md — prompts for Architect/Player/Coach (and other roles, if you expand later)
  • REQUIREMENTS.example.md — example task file (not used by the loop unless you copy it)

Troubleshooting: File write / permission issues (Windows)

If the Player agent or orchestrator reports file write failures, it is often due to OS-level protections on Windows. Common causes:

  • Controlled Folder Access (Windows Security > Ransomware protection) blocking Python or Node from writing to synced folders.
  • Repo located under OneDrive / Desktop / Documents which may have special protection or sync locks.
  • Files or directories marked read-only or owned by another user.

Quick checks and mitigations:

  1. Run the built-in write diagnostics:
python scripts\dialectical_loop.py --check-writes

This prints a short report including the current user, the parent directory's mode, and a "quick write probe" result.

  2. Check whether your repo is under OneDrive or a synced folder:
echo $env:OneDrive
echo $env:OneDriveConsumer
(Get-Item -Path .).FullName
  3. Inspect ACLs if needed:
icacls .\
  4. If Controlled Folder Access is blocking writes, either allow-list python.exe and node.exe in Ransomware protection or move the repo to a non-protected path (e.g., C:\dev\YourRepo).
If the diagnostics output indicates a PermissionError during the quick write probe, capture the output and open an issue including that text so we can help further.
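
If you cannot run the script at all, the essence of the quick write probe is only a few lines of Python (a standalone approximation, not the script's exact diagnostics):

import os
import tempfile

try:
    # Create and immediately remove a file in the current directory.
    fd, path = tempfile.mkstemp(dir=".")
    os.close(fd)
    os.remove(path)
    print("quick write probe: OK")
except PermissionError as exc:
    print(f"quick write probe: BLOCKED ({exc})")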