Folder Structure Protocol

01 Most "bad AI output" is actually bad context.

Trace the symptom backward.

AI coding tools are only as good as the context they receive. When Claude opens your project, it has to figure out where things are, what conventions to follow, and which files matter for the current task. If your folders are disorganized, Claude wastes tokens guessing, reads the wrong files, or ignores your instructions entirely.

Most developers blame the model. But the real cause is usually structural:

Wrong format? Missing naming conventions, so Claude improvises a pattern.

Off-topic output? No routing table, so Claude loaded the wrong context file.

Ignoring instructions? Your CLAUDE.md is 200 lines long, and the real rules are buried at line 180.

Inconsistent across sessions? No CONTEXT.md files, so every session starts from scratch.

The 60/30/10 heuristic. Prioritize fixes in this order: 60% traditional structure (naming, organization, file grouping) → 30% routing (CLAUDE.md, CONTEXT.md, conventions) → 10% the AI interaction itself. Fix the 90% you can actually control (Van Clief & McDermott, 2026).

The industry data is consistent.

44 to 65% of developers blame missing context, not bad prompts, for poor AI-generated code (Qodo, State of AI Code Quality, 2025).

29% reduction in agent runtime from adding routing files like AGENTS.md (Sherwood, martinfowler.com, 2025).

"Curating what enters the model's attention budget" is how Anthropic frames the core engineering challenge (Schluntz et al., Anthropic, 2025).

02 The architecture.

ICM (Interpretable Context Methodology) is the methodology. FSP (Folder Structure Protocol) is the tooling layer on top.

ICM · methodology, upstream (Van Clief & McDermott, 2026)

5-layer hierarchy
Stage contracts
Factory / Product split
5 design principles

FSP · tooling, this repo (builds tooling on ICM)

Scoring (X/16 + X/18)
Letter grading A to F
14 anti-patterns
5 structural metrics

The lifecycle loop

Five skills, one feedback loop. Each output is reviewable. None auto-advance.

/folder-audit: score X/16, find anti-patterns

/pipeline-scaffold: create stages + contracts

/run-stage: execute with scoped context

/stage-review: advance / revise / re-run

/validate-pipeline: check handoffs in chain

Fix and iterate: the end of the chain loops back to the start.

The five layers

Layers 0 and 1 are scored on every project. Layers 2, 3, and 4 are scored only if numbered stage folders are detected.

Layer 0 · The Map
CLAUDE.md, <50 lines, routing table
Skills: /folder-audit

Layer 1 · The Rooms
CONTEXT.md per workspace, <80 lines
Skills: /folder-audit · /pipeline-scaffold

Layer 2 · Stage Contracts
Inputs / Process / Outputs per stage
Skills: /pipeline-scaffold · /validate-pipeline

Layer 3 · The Factory
references/, _config/, voice, conventions
Skills: /run-stage (read, internalize)

Layer 4 · The Product
output/, per-run artifacts, drafts, data
Skills: /run-stage (write) · /stage-review

The model should embody factory material (write in this voice) but transform product material (convert this research into a script). The folder structure makes the distinction visible and enforceable.
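The line budgets for Layer 0 and Layer 1 are easy to check by hand. The following is a minimal sketch, not the logic /folder-audit itself uses: the function name check_budget is hypothetical, and the limits of 50 and 80 lines are taken from the layer descriptions above.

```shell
#!/usr/bin/env bash
# Sketch: compare a routing file's length against a line budget.
# check_budget is a hypothetical helper, not part of FSP.

check_budget() {
  local file="$1" limit="$2" lines
  if [ ! -f "$file" ]; then
    echo "MISSING $file"
    return 2
  fi
  # Redirect into wc so it prints only the count, not the filename.
  # wc -l counts newline characters, so a final line without a
  # trailing newline is not counted.
  lines=$(wc -l < "$file")
  lines=$((lines))  # normalize padding that some wc builds emit
  if [ "$lines" -lt "$limit" ]; then
    echo "OK $file ($lines lines, limit $limit)"
  else
    echo "OVER $file ($lines lines, limit $limit)"
    return 1
  fi
}
```

From a project root, this could be called as check_budget CLAUDE.md 50 for the map, and as check_budget path/to/CONTEXT.md 80 for each workspace. The budget is treated as strict, so a file of exactly 50 lines is reported as over.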

03 The five skills.

Each one does one job. Each one is a markdown file. None of them auto-advance.

/folder-audit

Score any project's structure

Snapshots the file tree, scores Layers 0 and 1 plus Tools (X/16), checks 8 anti-patterns, measures 5 structural metrics, and generates a graded report with the top 3 prioritized fixes. Optionally imprints structure rules into the audited project's CLAUDE.md.

/pipeline-scaffold

Create a staged workflow

Generates numbered stage folders with CONTEXT.md contracts (Inputs / Process / Outputs / Review Checkpoint), references/ and output/ directories per stage, a shared _config/ folder, and a root CONTEXT.md with stage routing.
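To make the resulting layout concrete, here is a rough sketch of the directory tree described above, built with plain shell. It is not how the skill works internally: scaffold_pipeline is a hypothetical helper, the two-digit NN-name folder pattern is an assumption (the text only says "numbered stage folders"), and the stage names are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: create the folder layout described for /pipeline-scaffold.
# scaffold_pipeline and the NN-name folder pattern are assumptions.

scaffold_pipeline() {
  local root="$1"; shift
  local n=1 stage dir
  # Shared configuration folder; mkdir -p also creates the root.
  mkdir -p "$root/_config"
  # Root CONTEXT.md with a stage routing list.
  printf '# Pipeline\n\n## Stage routing\n\n' > "$root/CONTEXT.md"
  for stage in "$@"; do
    dir=$(printf '%02d-%s' "$n" "$stage")
    mkdir -p "$root/$dir/references" "$root/$dir/output"
    # One contract per stage, with the four sections named in the text.
    printf '# %s\n\n## Inputs\n\n## Process\n\n## Outputs\n\n## Review Checkpoint\n' \
      "$stage" > "$root/$dir/CONTEXT.md"
    printf -- '- %s/CONTEXT.md\n' "$dir" >> "$root/CONTEXT.md"
    n=$((n + 1))
  done
}
```

Calling scaffold_pipeline demo research script would create demo/01-research and demo/02-script, each with references/, output/, and an empty contract to fill in.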

/run-stage

Execute one stage at a time

Reads the stage's contract, loads only declared inputs (Layer 3 as constraints, Layer 4 as material), follows the Process instructions, writes declared outputs, and pauses for human review. Never auto-advances. Never reads undeclared files.

/stage-review

Verify before advancing

Checks that all declared outputs exist, runs the Review Checkpoint criteria, assesses quality against the Process intent, confirms the next stage's inputs are satisfied. Read-only. Outputs a recommendation: advance, revise, or re-run.
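The first of those checks, that every declared output exists, can be approximated in a few lines of shell. This is only a sketch under stated assumptions: check_outputs is a hypothetical helper, the declared outputs are passed as arguments rather than parsed from a contract (the contract's exact format is not specified here), and it covers existence only, not the quality assessment.

```shell
#!/usr/bin/env bash
# Sketch: verify that a stage's declared outputs exist and are non-empty.
# check_outputs is hypothetical; it only reads, never writes.

check_outputs() {
  local stage_dir="$1"; shift
  local missing=0 path
  for path in "$@"; do
    # -s is true when the file exists and has a size greater than zero.
    if [ -s "$stage_dir/$path" ]; then
      echo "found   $path"
    else
      echo "missing $path"
      missing=$((missing + 1))
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "all declared outputs present"
  else
    echo "$missing declared output(s) missing"
    return 1
  fi
}
```

A non-zero exit status means at least one declared file is absent or empty. Deciding between advance, revise, and re-run still depends on the Review Checkpoint criteria, which this sketch does not attempt.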

/validate-pipeline

Check the contract chain

Walks the full pipeline to find broken handoffs (output/input mismatches), factory/product cross-contamination, missing contract sections, and structural anti-patterns. Offers to fix issues directly.

04 Install.

Pick the lightest option that fits. You can always add more skills later.

Install just /folder-audit into your current project. Lightest possible footprint.

# from your project root
mkdir -p .claude/skills/folder-audit
# -f fails on HTTP errors instead of saving the error page; -L follows redirects
curl -fsSL -o .claude/skills/folder-audit/SKILL.md \
  https://raw.githubusercontent.com/mcmespinaa/folder-structure-protocol/main/.claude/skills/folder-audit/SKILL.md

Install all five skills into a single project's .claude/skills/.

# from your project root
git clone https://github.com/mcmespinaa/folder-structure-protocol.git /tmp/fsp
# create the target directory first so cp does not fail in a fresh project
mkdir -p .claude/skills
cp -r /tmp/fsp/.claude/skills/* .claude/skills/
rm -rf /tmp/fsp

Install globally so every Claude Code session has the skills.

# from anywhere
git clone https://github.com/mcmespinaa/folder-structure-protocol.git /tmp/fsp
mkdir -p ~/.claude/skills
cp -r /tmp/fsp/.claude/skills/* ~/.claude/skills/
rm -rf /tmp/fsp

Clone the full repo if you want to read source, customize, or contribute.

git clone https://github.com/mcmespinaa/folder-structure-protocol.git
cd folder-structure-protocol
cat README.md

Then, in Claude Code:

/folder-audit # score your current project
/folder-audit /path/to/other-project # score a different one

Works with Claude Code (CLI), Claude.ai Cowork, the Claude API, and the Agent SDK. The methodology also works for any AI coding tool, or for manual project organization.

05 Philosophy.

Why this exists, and what it isn't.

Plain text. Single agent. Scoped context.

FSP inherits ICM's commitment to simplicity. All skills are markdown files, not orchestration code. One Claude session walks through numbered folders. Each stage loads only its declared inputs. Humans review at every stage boundary. The system is anti-fragile: when models update, instructions in plain English degrade gracefully rather than breaking.

It scores. It doesn't certify.

The X/16 and X/18 scoring rubric exists to make folder quality legible, not to gatekeep. A project graded "B" by FSP isn't broken. It works; it just needs occasional re-steering. The grades, anti-patterns, and 60/30/10 ratio (Van Clief & McDermott, 2026) are teaching shorthand for prioritizing fixes, not empirical measurements.

Built from friction, not from spec.

Every skill in this repo started as a manual checklist run dozens of times before being formalized. /folder-audit began as a mental walkthrough. /stage-review started as a hand-written checkpoint template. The methodology is recovered from practice, not designed upfront.

Disclaimer: This is not an official scoring tool or industry standard. It's an opinionated scaffolding tool, a structured way to think about folder architecture for AI-assisted projects, guided by principles from the Interpretable Context Methodology (Van Clief & McDermott, 2026). The scores, grades, and heuristics (including the 60/30/10 rule) are practical shortcuts for personal and team use, not empirical measurements. Use what's useful, adapt what isn't, and ignore what doesn't fit your project.