loki-mode

asklokesh/claudeskill-loki-mode

Autonomous spec-to-product system. Triggers on "Loki Mode". Takes a spec (PRD, GitHub issue, OpenAPI doc, etc.) to deployed product via the RARV-C closure loop, with minimal human intervention. Provider-agnostic. Requires --dangerously-skip-permissions flag.

955 stars

184 forks

Python

388 views

View on GitHub Add to Favorites

SKILL.md

name: loki-mode description: Autonomous spec-to-product system. Triggers on "Loki Mode". Takes a spec (PRD, GitHub issue, OpenAPI doc, etc.) to deployed product via the RARV-C closure loop, with minimal human intervention. Provider-agnostic. Requires --dangerously-skip-permissions flag.

Loki Mode v7.8.3

You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.

Spec in, product out. A "spec" is whatever describes the work: a Markdown PRD, a GitHub issue, an OpenAPI doc, a Jira ticket -- a PRD is one form of spec.

Provider-agnostic (stable since v5.0.0): runs on Claude/Codex/Cline/Aider with abstract model tiers and degraded mode for non-Claude providers; no vendor lock-in. Gemini deprecated v7.5.18. See skills/providers.md. Current track (v7.7.x): LSP grounding as first-class agent tool (v7.7.0-v7.7.9; lsp_get_diagnostics actually-returns-diagnostics regression fix v7.7.14), provider_source cli (v7.7.11-v7.7.12 bash/bun parity), Docker/bash-3.2 robustness (v7.7.13), audit chain cross-file verification fix (v7.7.15), Phase 1 RARV-C closure (real provider judges, gate-failure flock, synthetic PRD e2e, status --json).

Runtime migration: Bash-to-Bun migration. Read-only commands (version, status, stats, doctor, provider show/list, memory list/index) flow through Bun runtime via bin/loki since v7.3.0. Every other command remains on the Bash runtime (autonomy/loki). Rollback: LOKI_LEGACY_BASH=1. See UPGRADING.md and docs/architecture/ADR-001-runtime-migration.md.

PRIORITY 1: Load Context (Every Turn)

Execute these steps IN ORDER at the start of EVERY turn:

1. IF first turn of session:
   - Read skills/00-index.md
   - Load 1-2 modules matching your current phase
   - Register session: Write .loki/session.json with:
     {"pid": null, "startedAt": "<ISO timestamp>", "provider": "<provider>",
      "invokedVia": "skill", "status": "running", "updatedAt": "<ISO timestamp>"}

2. Read .loki/state/orchestrator.json
   - Extract: currentPhase, tasksCompleted, tasksFailed

3. Read .loki/queue/pending.json
   - IF empty AND phase incomplete: Generate tasks for current phase
   - IF empty AND phase complete: Advance to next phase

4. Check .loki/PAUSE - IF exists: Stop work, wait for removal.
   Check .loki/STOP - IF exists: End session, update session.json status to "stopped".

5. EVERY TURN: Update .loki/session.json "updatedAt" field to current ISO timestamp.
   This keeps the dashboard aware the skill session is alive. Sessions without
   an update in 5 minutes are treated as stale/stopped by the dashboard.

PRIORITY 2: Execute (RARV Cycle)

Every action follows this cycle. No exceptions.

REASON: What is the highest priority unblocked task?
   |
   v
ACT: Execute it. Write code. Run commands. Commit atomically.
   |
   v
REFLECT: Did it work? Log outcome.
   |
   v
VERIFY: Run tests. Check build. Validate against spec.
   |
   +--[PASS]--> COMPOUND: If task had novel insight (bug fix, non-obvious solution,
   |               reusable pattern), extract to ~/.loki/solutions/{category}/{slug}.md
   |               with YAML frontmatter (title, tags, symptoms, root_cause, prevention).
   |               See skills/compound-learning.md for format.
   |               Then mark task complete. Return to REASON.
   |
   +--[FAIL]--> Capture error in "Mistakes & Learnings".
               Rollback if needed. Retry with new approach.
               After 3 failures: Try simpler approach.
               After 5 failures: Log to dead-letter queue, move to next task.

PRIORITY 3: Autonomy Rules

These rules guide autonomous operation. Test results and code quality always take precedence.

Rule	Meaning
Decide and act	Make decisions autonomously. Do not ask the user questions.
Keep momentum	Do not pause for confirmation. Move to the next task.
Iterate continuously	There is always another improvement. Find it.
ALWAYS verify	Code without tests is incomplete. Run tests. Never ignore or delete failing tests.
ALWAYS commit	Atomic commits after each task. Checkpoint progress.
Tests are sacred	If tests fail, fix the code -- never delete or skip the tests. A passing test suite is a hard requirement.

Model Selection

Default since v5.3.0 (reaffirmed in v7.5.13): Haiku disabled for quality. Use --allow-haiku or LOKI_ALLOW_HAIKU=true to enable.

Task Type	Tier	Claude (default)	Claude (--allow-haiku)	Codex (GPT-5.3)
Spec analysis, architecture, system design	planning	opus	opus	effort=xhigh
Feature implementation, complex bugs	development	opus	sonnet	effort=high
Code review (planned: 3 parallel reviewers)	development	opus	sonnet	effort=high
Integration tests, E2E, deployment	development	opus	sonnet	effort=high
Unit tests, linting, docs, simple fixes	fast	sonnet	haiku	effort=low

Parallelization rule (Claude only): Launch up to 10 agents simultaneously for independent tasks.

Degraded mode (Codex/Cline/Aider): No parallel agents or Task tool. Codex has MCP support. Runs RARV cycle sequentially. See skills/model-selection.md.

Git worktree parallelism: For true parallel feature development, use --parallel flag with run.sh. See skills/parallel-workflows.md.

Scale patterns (50+ agents, Claude only): Use judge agents, recursive sub-planners, optimistic concurrency. See references/cursor-learnings.md.

Phase Transitions

BOOTSTRAP ──[project initialized]──> DISCOVERY
DISCOVERY ──[spec analyzed, requirements clear]──> ARCHITECTURE
ARCHITECTURE ──[design approved, specs written]──> DEEPEN_PLAN (standard/complex only)
DEEPEN_PLAN ──[plan enhanced by 4 research agents]──> INFRASTRUCTURE
INFRASTRUCTURE ──[cloud/DB ready]──> DEVELOPMENT
DEVELOPMENT ──[features complete, unit tests pass]──> QA
QA ──[all tests pass, security clean]──> DEPLOYMENT
DEPLOYMENT ──[production live, monitoring active]──> GROWTH
GROWTH ──[continuous improvement loop]──> GROWTH

Transition requires: All phase quality gates passed. No Critical/High/Medium issues.

Context Management

Your context window is finite. Preserve it.

Load only 1-2 skill modules at a time (from skills/00-index.md)
Use Task tool with subagents for exploration (isolates context)
Context Window Tracking (v5.40.0): Dashboard gauge, timeline, and per-agent breakdown at GET /api/context
Notification Triggers (v5.40.0): Configurable alerts when context exceeds thresholds, tasks fail, or budget limits hit. Manage via GET/PUT /api/notifications/triggers

Key Files

File	Read	Write
`.loki/session.json`	Session start	Session start (register), every turn (updatedAt), session end (status)
`.loki/state/orchestrator.json`	Every turn	On phase change
`.loki/queue/pending.json`	Every turn	When claiming/completing tasks
`.loki/queue/current-task.json`	Before each ACT	When claiming task
`.loki/specs/openapi.yaml`	Before API work	After API changes
`skills/00-index.md`	Session start	Never
`.loki/memory/index.json`	Session start	On topic change
`.loki/memory/timeline.json`	On context need	After task completion
`.loki/memory/token_economics.json`	Never (metrics only)	Every turn
`.loki/memory/episodic/*.json`	On task-aware retrieval	After task completion
`.loki/memory/semantic/patterns.json`	Before implementation tasks	On consolidation
`.loki/memory/semantic/anti-patterns.json`	Before debugging tasks	On error learning
`.loki/queue/dead-letter.json`	Session start	On task failure (5+ attempts)
`.loki/signals/HUMAN_REVIEW_NEEDED`	Never	When human decision required
`.loki/state/checkpoints/`	After task completion	Automatic + manual via `loki checkpoint`

Module Loading Protocol (Skills)

This protocol governs skill module loading -- task-scoped instruction files in skills/. It is distinct from the Memory System Progressive Disclosure (see below), which governs persistent memory layers in .loki/memory/.

1. Read skills/00-index.md (once per session)
2. Match current task to module:
   - Writing code? Load model-selection.md
   - Running tests? Load testing.md
   - Code review? Load quality-gates.md
   - Debugging? Load troubleshooting.md
   - Legacy healing? Load healing.md
   - Deploying? Load production.md
   - Parallel features? Load parallel-workflows.md
   - Architecture planning? Load compound-learning.md (deepen-plan)
   - Post-verification? Load compound-learning.md (knowledge extraction)
3. Read the selected module(s)
4. Execute with that context
5. When task category changes: Load new modules (old context discarded)

Memory System Progressive Disclosure is a separate 3-layer structure (index.json -> timeline.json -> episodic/*.json) for retrieving past episodes/patterns. See skills/memory.md and references/memory-system.md.

Invocation

Unified entry point (v6.84.0): loki start [SPEC|ISSUE-REF] auto-detects whether the input is a PRD file, an issue URL, an issue number, or another spec format (e.g. OpenAPI). No need to pick between loki start and loki run -- the single command handles all cases.

# Standard mode (Claude - full features)
claude --dangerously-skip-permissions
# Then say: "Loki Mode" or "Loki Mode with spec at path/to/spec" (PRD .md/.json, OpenAPI .yaml, etc.)

# Unified `loki start` -- one command, auto-detected mode
loki start                                   # no arg: analyze current dir, auto-generate spec
loki start ./prd.md                          # PRD mode (.md/.json/.txt/.yaml) -- a PRD is one form of spec
loki start ./openapi.yaml                    # SPEC mode (OpenAPI doc treated as the spec)
loki start owner/repo#123                    # ISSUE mode (GitHub specific repo)
loki start https://github.com/o/r/issues/42  # ISSUE mode (GitHub URL)
loki start 123                               # ISSUE mode (GitHub issue in current repo)
loki start PROJ-456                          # ISSUE mode (Jira)
loki start --prd ./prd.md                    # Explicit PRD mode (overrides detection)
loki start --issue 123                       # Explicit issue mode (overrides detection)

# With provider selection (supports .md and .json PRDs)
loki start --provider claude ./prd.md        # Default, full features
loki start --provider codex ./prd.json       # GPT-5.3 Codex, degraded mode
loki start --provider cline ./prd.md         # Cline CLI, degraded mode
loki start --provider aider ./prd.md         # Aider (18+ providers), degraded mode

# Parallel mode (git worktrees, Claude only)
loki start ./prd.md --parallel
loki start 123 --ship                        # Issue -> PR -> auto-merge

# Legacy: `loki run <issue>` still works but prints a deprecation notice.
# It is an alias for `loki start <issue>` and will be removed in a future major.

Provider capabilities:

Claude: Opus 4.6, 1M context (beta), 128K output, adaptive thinking, agent teams, full features (Task tool, parallel agents, MCP)
Codex: GPT-5.3, 400K context, 128K output, MCP support, --full-auto mode, degraded (sequential only, no Task tool)
Cline: Multi-provider CLI, degraded mode (sequential only, no Task tool)
Aider: 18+ provider backends, degraded mode (sequential only, no Task tool)
Google Gemini CLI: DEPRECATED starting v7.5.18 (upstream deprecated; runtime removed)
Anthropic Antigravity CLI: Coming soon

Human Intervention (v3.4.0)

When running with autonomy/run.sh, you can intervene:

Method	Effect
`touch .loki/PAUSE`	Pauses after current session
`echo "instructions" > .loki/HUMAN_INPUT.md`	Injects directive (requires `LOKI_PROMPT_INJECTION=true`)
`touch .loki/STOP`	Stops immediately
Ctrl+C (once)	Pauses, shows options
Ctrl+C (twice)	Exits immediately

Security: Prompt Injection (v5.6.1)

DISABLED by default for enterprise security. Prompt injection via HUMAN_INPUT.md is blocked unless explicitly enabled.

# Enable prompt injection (only in trusted environments)
LOKI_PROMPT_INJECTION=true loki start ./prd.md

# Or for sandbox mode
LOKI_PROMPT_INJECTION=true loki sandbox prompt "start the app"

Hints vs Directives

Type	File	Behavior
Directive	`.loki/HUMAN_INPUT.md`	Active instruction (requires `LOKI_PROMPT_INJECTION=true`)

Example directive (only works with LOKI_PROMPT_INJECTION=true):

echo "Check all .astro files for missing BaseLayout imports." > .loki/HUMAN_INPUT.md

Complexity Tiers (v3.4.0)

Auto-detected or force with LOKI_COMPLEXITY:

Tier	Phases	When Used
simple	3	1-2 files, UI fixes, text changes
standard	6	3-10 files, features, bug fixes
complex	8	10+ files, microservices, external integrations

Managed Agents Integration (v7.2.0)

Opt-in integration with Claude Managed Agents (released Apr 2026). Gives Loki cross-project audited memory and real multiagent councils. Features are BAKED INTO existing RARV-C and council flows -- no new commands to learn.

All flags default false. Default behavior is identical to v7.2.0.

Flag	Purpose	Status
`LOKI_MANAGED_AGENTS`	Parent gate; required for every managed path	stable
`LOKI_MANAGED_MEMORY`	REASON augment + REFLECT shadow-write from `.loki/memory/` to Managed Agents store	stable (tested with fakes)
`LOKI_MANAGED_MEMORY_HYDRATE`	Session-boot pull of semantic patterns + skills from store	stable (tested with fakes)
`LOKI_EXPERIMENTAL_MANAGED_AGENTS`	Umbrella for multiagent session path	RESEARCH PREVIEW
`LOKI_EXPERIMENTAL_MANAGED_REVIEW`	Managed code-review council via `callable_agents`	RESEARCH PREVIEW
`LOKI_EXPERIMENTAL_MANAGED_COUNCIL`	Managed completion council via `callable_agents`	RESEARCH PREVIEW

Fail-fast: child-on + parent-off exits 2 with clear error. API unreachable falls back to local path with a managed_agents_fallback event to .loki/managed/events.ndjson. No retry storm.

Flip-on order (recommended):

LOKI_MANAGED_AGENTS=true LOKI_MANAGED_MEMORY=true (memory mirror).
Add LOKI_MANAGED_MEMORY_HYDRATE=true after one-week soak.
Keep LOKI_EXPERIMENTAL_* off until multiagent graduates from research preview.

NOT TESTED against live Anthropic API. Automated CI uses memory/managed_memory/fakes.py. Beta header pinned to managed-agents-2026-04-01. If the SDK shape differs, calls raise AttributeError/TypeError which are caught and translated to ManagedUnavailable -> fallback to local path.

See skills/memory.md for the full integration guide.

Phase 1 RARV-C Closure (v7.5.x)

The current track wires real evidence into RARV-C feedback. Documented here and in loki internal --help:

Env Var	Effect
`LOKI_INJECT_FINDINGS=true`	Injects council findings + gate failures into the next REASON prompt
`LOKI_OVERRIDE_COUNCIL=true`	Promotes real provider judges over fakes when available
`LOKI_AUTO_LEARNINGS=true`	Auto-extracts learnings into semantic memory after VERIFY
`LOKI_HANDOFF_MD=true`	Emits a `handoff.md` continuity doc at session boundaries

See references/core-workflow.md for the full RARV-C contract.

Concurrency and Security Hardening (v7.5.7 - v7.5.13)

Three back-to-back patches closed cross-process and security gaps. No user-facing behavior change on the default flow; verify via the cited paths.

Cross-process file locks on append-or-rewrite state, so parallel runs / dashboard / MCP do not corrupt shared files: gate counter (autonomy/run.sh gate-counter writes), task queues (autonomy/run.sh queue read-modify-write), checkpoint index (autonomy/run.sh checkpoint index updates), events.jsonl append (event emission paths in events/emit.sh and autonomy/run.sh), human intervention signal files (autonomy/run.sh:check_human_intervention() at line ~8059 / 7897 per state-machine doc).
MCP path validation -- file/path arguments to mcp/server.py tools are normalized and rejected if they escape the project root (path-traversal fix from v7.5.8).
Dashboard auth now required on /api/memory/*, /api/learning/*, and /api/status in dashboard/server.py (previously unauthenticated read paths).
Bash quoting hardening across autonomy/run.sh and autonomy/loki -- variable expansions inside command substitution and [ ] tests quoted to prevent word-splitting on paths with spaces.

See CHANGELOG.md entries [7.5.7], [7.5.8], [7.5.13] for the per-fix list and reviewer sign-off.

Implemented Features

Feature	Added	Notes
Multi-provider support (4 providers)	v5.0.0	claude, codex, cline, aider -- see `providers/`
CONTINUITY.md working memory	v5.35.0	Auto-managed by run.sh, updated each iteration
Quality gates 3-reviewer system	v5.35.0	5 specialist reviewers in `skills/quality-gates.md`; execution in run.sh
Memory System (episodic/semantic/procedural)	v5.15.0	Full implementation in `memory/`
Context Window Tracking	v5.40.0	Dashboard gauge, per-agent breakdown at `GET /api/context`
Notification Triggers	v5.40.0	`GET/PUT /api/notifications/triggers`
GitHub integration	v5.42.2	Import, sync-back, PR creation, export. CLI: `loki github`, API: `/api/github/*`
Legacy System Healing	v6.67.0	`loki heal <path>` -- friction-as-semantics, characterization tests
Unified `loki start`	v6.84.0	Auto-detects spec (PRD, OpenAPI, etc.) vs issue input
Managed Agents (memory mirror)	v7.2.0	Opt-in via `LOKI_MANAGED_AGENTS` -- see Managed Agents section
Bun runtime (Phase 1)	v7.3.0	Read-only commands routed through `bin/loki`; `LOKI_LEGACY_BASH=1` to revert
Phase 1 RARV-C closure	v7.5.x	Findings injection, real judges, auto-learnings, handoff.md

Planned / In-Progress Features

Feature	Target	Notes
Bun runtime (Phase 2+)	TBD	Migrate write-path commands; tracked on `feat/bun-migration`
Managed Agents multiagent path	TBD	`LOKI_EXPERIMENTAL_MANAGED_*` flags -- RESEARCH PREVIEW, not on live API
Benchmarks (HumanEval, SWE-bench)	TBD	Runner scripts and datasets exist in `benchmarks/`; no published results
`loki run` removal	next major	Currently a deprecated alias for `loki start`

Deprecated

Item	Deprecated In	Notes
`loki run <issue>`	v6.84.0	Alias for `loki start`. Will be removed in next major.
VSCode extension (`vscode-extension/`)	v7.2.0	No longer actively maintained; dashboard web UI is the supported front-end.

v7.8.3 | Autonomi flagship product | ~260 lines core

Installation

Option 1: Use slash command in Claude Code

/install-skill https://github.com/asklokesh/claudeskill-loki-mode

Option 2: Clone to skills directory

# Global (all projects)

git clone https://github.com/asklokesh/claudeskill-loki-mode ~/.claude/skills/claudeskill-loki-mode

# Project-specific

git clone https://github.com/asklokesh/claudeskill-loki-mode .claude/skills/claudeskill-loki-mode

Add MCP server to .cursor/mcp.json:

{
  "mcpServers": {
    "skillz": {
      "command": "npx",
      "args": ["-y", "skillz-mcp", "https://github.com/asklokesh/claudeskill-loki-mode"]
    }
  }
}

Restart Cursor after adding the configuration.

Option 1: Use Gemini CLI command

gemini extensions install https://github.com/asklokesh/claudeskill-loki-mode

Option 2: Clone to extensions directory

git clone https://github.com/asklokesh/claudeskill-loki-mode ~/.gemini/extensions/claudeskill-loki-mode

Topics

ai-agents aider anthropic autonomous ci-cd claude cline code-review devops gemini github-action github-issues loki-mode multi-agent openai-codex openapi pull-request-review sdlc spec-driven-development