Pipeline V3 Tools Reference

This document covers the command-line tools for running and managing Pipeline V3 workflows.

Overview

| Tool | Purpose | Location |
|---|---|---|
| run-pipeline-v3.py | Main pipeline runner | bin/run-pipeline-v3.py |
| run-audit-v3.py | Pattern-based code audit + fix | bin/run-audit-v3.py |
| run-logic-audit-v3.py | Human-level feature verification | bin/run-logic-audit-v3.py |
| pipeline-supervisor.sh | Auto-restart crashed pipelines | bin/pipeline-supervisor.sh |

run-pipeline-v3.py

Main Pipeline Runner — Hand it a spec or task, wake up to working code.

Usage

# Run with a task description
./bin/run-pipeline-v3.py --task "Add user authentication" --repo ./my-app

# Run with a spec/design doc
./bin/run-pipeline-v3.py --spec docs/plans/design.md --repo ./my-app

# Dry run (planning only)
./bin/run-pipeline-v3.py --task "Add feature X" --repo ./my-app --dry-run

# Verbose output with custom timeout
./bin/run-pipeline-v3.py --task "Fix bug Y" --repo ./my-app --verbose --timeout 900

# V3.5: Multi-repo mode
./bin/run-pipeline-v3.py --task "Build user management" \
    --repos backend:./api frontend:./web

# V3.5: Multi-repo dry run
./bin/run-pipeline-v3.py --task "Add auth" --dry-run \
    --repos backend:./api frontend:./web

Options

| Option | Default | Description |
|---|---|---|
| --task, -t | (required) | Task description (what to build) |
| --spec, -s | | Path to spec/design document |
| --repo, -r | . | Target repository path |
| --output, -o | repo/.pipeline/v3-output | Output directory |
| --max-iterations | 15 | Max execution iterations |
| --timeout | 600 | Timeout per agent call (seconds) |
| --verbose, -v | false | Detailed progress output |
| --dry-run | false | Plan only, don't execute |
| --repos | | V3.5: Multi-repo mode (role:path pairs, e.g. backend:./api frontend:./web) |

V3.5 Multi-Repo: When --repos is used instead of --repo, the pipeline activates cross-repo contract tracking, task routing per repo, and cascading task generation. See Pipeline V3.5 for details.

How It Works

  1. Loads spec content (if provided) and combines with task
  2. Initializes PM (Project Manager) with LLM caller
  3. Planning Phase: Creates PRD, task graph, contracts
  4. Design Review: Parallel review by Architect, Security, QA
  5. Execution Phase: Assigns tasks to dev agents
  6. Validation Phase: Runs tests and visual verification

Output Structure

.pipeline/
├── run.json              # Run state (for monitoring)
└── v3-output/
    ├── prd.json          # Product Requirements Document
    ├── task-graph.json   # Dependency graph
    ├── contracts/        # API specs, schemas
    ├── logs/             # Execution logs
    └── result.json       # Final result
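
The run.json state file is what monitoring tools (including pipeline-supervisor.sh below) read. A minimal sketch of querying it with jq; the file is simulated here so the snippet is self-contained, and any field beyond `status` is an assumption:

```shell
# Simulate a pipeline writing its state file (illustration only;
# the real run.json is written by run-pipeline-v3.py).
mkdir -p .pipeline
echo '{"status": "running", "task": "Add user authentication"}' > .pipeline/run.json

# Query the current status the same way a monitor would:
status=$(jq -r '.status' .pipeline/run.json)
echo "pipeline status: $status"
```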

Example: Feature Development

# Create a design doc
cat > docs/plans/auth.md << 'EOF'
# User Authentication

## Requirements
- Email/password login
- JWT tokens with refresh
- Password reset via email

## Endpoints
- POST /api/auth/login
- POST /api/auth/register
- POST /api/auth/refresh
- POST /api/auth/forgot-password
EOF

# Run pipeline
./bin/run-pipeline-v3.py \
  --spec docs/plans/auth.md \
  --repo /Users/matt/projects/my-app \
  --verbose

run-audit-v3.py

Pattern Audit — Codebase-wide scan for issues + automated fixes.

Usage

# Basic audit
./bin/run-audit-v3.py --repo /path/to/repo

# Focus on high severity only
./bin/run-audit-v3.py --repo /path/to/repo --min-severity high

# Focus on specific categories
./bin/run-audit-v3.py --repo /path/to/repo --categories credentials,incomplete

# Scan only (dry run)
./bin/run-audit-v3.py --repo /path/to/repo --dry-run

Options

| Option | Default | Description |
|---|---|---|
| --repo | (required) | Path to repository to audit |
| --min-severity | medium | Minimum severity: critical, high, medium, low |
| --categories | all | Comma-separated: credentials, incomplete, hardcoded, etc. |
| --max-fixes | 20 | Maximum fixes to attempt |
| --output, -o | .audit | Output directory |
| --verbose, -v | false | Verbose output |
| --dry-run | false | Scan only, don't fix |

Audit Phases

Phase 1: SCAN
  - Pattern matching for known issues
  - Regex-based credential detection
  - TODO/FIXME/incomplete code detection

Phase 2: CREATE TASKS
  - Group findings by file
  - Generate fix tasks

Phase 3: EXECUTE
  - Spawn Reeve agent for each fix
  - Apply surgical changes

Phase 4: VERIFY
  - Re-scan to confirm fixes
  - Report fix rate
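
Phase 1's scan is conceptually a pattern-matching pass over the tree. A rough, self-contained sketch of that idea using grep; the real tool's patterns are internal, so these regexes and the demo file are illustrative only:

```shell
# Create a demo file containing two findings (a credential and a TODO):
mkdir -p demo-src
cat > demo-src/config.py << 'EOF'
API_KEY = "sk-live-abc123"
# TODO: load this from the environment instead
EOF

# Credential-style pattern match (category: credentials):
grep -rnE '(API_KEY|PASSWORD|SECRET)[[:space:]]*=[[:space:]]*"' demo-src/

# Incomplete-code markers (category: incomplete):
grep -rnE 'TODO|FIXME' demo-src/
```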

Issue Categories

| Category | What It Finds |
|---|---|
| credentials | Hardcoded API keys, passwords, tokens |
| incomplete | TODO, FIXME, unfinished implementations |
| hardcoded | Hardcoded URLs, magic numbers |
| security | SQL injection, XSS vulnerabilities |
| error-handling | Missing try/catch, unhandled errors |
| logging | console.log in production code |

Output

.audit/
├── audit.jsonl           # Event stream
├── scan-results.json     # All findings
├── fix-tasks.json        # Generated fix tasks
├── fix-results.json      # Fix execution results
└── audit-report.json     # Final summary
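
Because audit.jsonl is a JSON-Lines event stream, it can be tailed or filtered line by line with jq. A sketch with made-up events, since the actual event schema is not documented here:

```shell
# Simulated event stream (field names are assumptions, not the real schema):
cat > audit.jsonl << 'EOF'
{"event": "scan_start", "files": 147}
{"event": "finding", "file": "src/api/auth.py", "severity": "high"}
{"event": "finding", "file": "src/utils/config.py", "severity": "medium"}
EOF

# Count "finding" events by slurping and filtering the stream:
jq -s '[.[] | select(.event == "finding")] | length' audit.jsonl
```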

Example

# Full audit with fixes
./bin/run-audit-v3.py \
  --repo /Users/matt/projects/backend \
  --min-severity medium \
  --verbose

# Output:
# 📊 Scan Results:
#    Files scanned: 147
#    Total findings: 23
#    Filtered (≥medium): 18
#
# 📝 Created 12 fix tasks
# 🔧 Executing fixes...
#    ✅ Fixed src/api/auth.py
#    ✅ Fixed src/utils/config.py
#
# 📊 Verification Results:
#    Original issues: 18
#    Fixed: 15
#    Remaining: 3
#    Fix rate: 83.3%

run-logic-audit-v3.py

Logic Audit — Human-level feature verification that reads docs, understands intent, tests reality, finds gaps, and fixes them.

Usage

# Basic logic audit
./bin/run-logic-audit-v3.py \
  --frontend /path/to/frontend \
  --backend /path/to/backend

# With running app URLs
./bin/run-logic-audit-v3.py \
  --frontend /path/to/frontend \
  --backend /path/to/backend \
  --url http://localhost:3000 \
  --api-url http://localhost:8000

# Analyze only (no fixes)
./bin/run-logic-audit-v3.py \
  --frontend /path/to/frontend \
  --backend /path/to/backend \
  --dry-run

Options

| Option | Default | Description |
|---|---|---|
| --frontend | (required) | Path to frontend repo |
| --backend | (required) | Path to backend repo |
| --url | http://localhost:3000 | Frontend app URL |
| --api-url | http://localhost:8000 | Backend API URL |
| --output, -o | .logic-audit | Output directory |
| --max-features | 30 | Max features to audit |
| --verbose, -v | false | Verbose output |
| --dry-run | false | Analyze only, don't fix |

Audit Phases

Phase 1: DISCOVERY
  - Find all docs, specs
  - Discover API endpoints
  - Find frontend pages/components
  - Build feature map

Phase 2: INTENT EXTRACTION
  - AI reads each doc
  - Understands expected behavior
  - Documents requirements

Phase 3: REALITY TESTING
  - Code analysis (what it actually does)
  - Browser testing (headless)
  - API testing (endpoint probing)

Phase 4: GAP ANALYSIS
  - Compare intent vs reality
  - Identify misalignments
  - Generate actionable gaps

Phase 5: FIX GENERATION
  - Create fixes for each gap
  - Apply via dev agents
  - Verify fixes
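
Phase 3's endpoint probing ultimately reduces to HTTP status checks. A hedged sketch of that kind of probe (the real auditor is AI-driven; this helper is purely illustrative):

```shell
# Return the HTTP status code for a URL, or 000 if unreachable:
probe() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# Example: probe an endpoint the spec says should exist
# probe http://localhost:8000/api/auth/login
```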

What It Finds

The logic audit catches issues that pattern matching can't:

| Gap Type | Example |
|---|---|
| Missing validation | "Spec says validate email, code doesn't" |
| Missing error handling | "UI should show error on 401, shows nothing" |
| Incomplete features | "Endpoint exists but returns placeholder" |
| Wrong behavior | "Button says 'Submit' but sends 'Draft'" |
| Missing UI feedback | "No loading state during API call" |
| Accessibility gaps | "Missing ARIA labels on form" |

Example

./bin/run-logic-audit-v3.py \
  --frontend /Users/matt/projects/app-frontend \
  --backend /Users/matt/projects/app-backend \
  --url http://localhost:3000 \
  --verbose

# Output:
# 📋 PHASE 1: DISCOVERY
#    📄 Found 12 documentation files
#    🔌 Found 24 API endpoints
#    📱 Found 8 frontend pages
#    📊 Discovery Results: 32 features to audit
#
# 📖 PHASE 2: INTENT EXTRACTION
#    [1/32] Analyzing intent: user-authentication
#    Intent: Should validate email format, password length 8+...
#
# 🔍 PHASE 3: REALITY TESTING
#    [1/32] Testing reality: user-authentication
#    Code analysis: Validates email, no password length check
#    Browser test: Login form present, shows errors
#
# ⚡ PHASE 4: GAP ANALYSIS
#    🔴 user-authentication: 2 gaps found
#       • Missing password length validation
#       • No "forgot password" link on login page
#
# 🏁 LOGIC AUDIT COMPLETE
#    Features audited: 32
#    Total gaps found: 14
#    Gaps fixed: 11
#    Fix rate: 78.6%

pipeline-supervisor.sh

Pipeline Supervisor — Auto-restart crashed pipelines. Designed for cron integration.

Usage

# Check and restart stale pipelines
./bin/pipeline-supervisor.sh

# Check only (don't restart)
./bin/pipeline-supervisor.sh --check-only

Exit Codes

| Code | Meaning |
|---|---|
| 0 | All good (nothing to do or restart succeeded) |
| 1 | Restart failed |
| 2 | Check-only mode, restart needed |

How It Works

  1. Monitors the repos listed in the script (configurable)
  2. Checks .pipeline/run.json for status: "running"
  3. Verifies the process exists via pgrep
  4. If stale (status=running but no process):
    • Extracts task from run.json
    • Restarts pipeline with same parameters
    • Updates restart counter
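
The staleness check above can be sketched as a small shell function. The run.json field names and the pgrep pattern are assumptions based on this doc, not the script's actual internals:

```shell
# Returns success (0) when a repo's pipeline claims to be running
# but no matching process exists.
is_stale() {
  repo="$1"
  state="$repo/.pipeline/run.json"
  [ -f "$state" ] || return 1                          # no state file: nothing to do
  [ "$(jq -r '.status' "$state")" = "running" ] || return 1
  ! pgrep -f "run-pipeline-v3.py.*$repo" > /dev/null   # claims running, but no process
}
```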

Configuration

Edit the script to add repos to monitor:

REPOS=(
  "/Users/mattrhodes/projects/backend"
  "/Users/mattrhodes/projects/frontend"
)

STALE_THRESHOLD=300  # Seconds before considering stale

Cron Integration

Add to Reeve's HEARTBEAT.md or system cron:

# Every 5 minutes, check for stale pipelines
*/5 * * * * /path/to/reeve/bin/pipeline-supervisor.sh >> /tmp/supervisor.log 2>&1

Or use Reeve heartbeat:

## Pipeline Health Check
- Frequency: 5 minutes
- Task: Run `./bin/pipeline-supervisor.sh` and alert if restart needed

Integration Patterns

Chaining Audit → Pipeline

# 1. First, audit the codebase
./bin/run-audit-v3.py --repo ./backend --dry-run

# 2. Review findings, create plan
# 3. Run pipeline to implement fixes
./bin/run-pipeline-v3.py \
  --task "Fix all medium+ severity audit findings" \
  --spec .audit/scan-results.json \
  --repo ./backend

Logic Audit After Feature Work

# After implementing a feature, verify it matches spec
./bin/run-logic-audit-v3.py \
  --frontend ./frontend \
  --backend ./backend \
  --dry-run

# Review gaps, fix critical ones
./bin/run-pipeline-v3.py \
  --task "Close gaps found in logic audit" \
  --spec .logic-audit/logic-audit-report.json \
  --repo ./backend

Supervisor + Pipeline Workflow

# Start long-running pipeline
nohup ./bin/run-pipeline-v3.py \
  --task "Major refactor" \
  --repo ./backend \
  --verbose \
  >> /tmp/pipeline.log 2>&1 &

# Supervisor will auto-restart if it crashes
# Check status:
cat ./backend/.pipeline/run.json | jq '.status'
