AI Agent Setup Guide for `calibrated_explanations`¶

Goal: Configure any AI agent platform so it understands CE deeply, learns from your feedback, and stays in sync with every API change.

The canonical CE rules that apply to all agents live in CONTRIBUTOR_INSTRUCTIONS.md. Platform-specific setup files build on top of that canonical:

Platform

File

GitHub Copilot

.github/copilot-instructions.md + .github/prompts/

Codex (OpenAI)

AGENTS.md

Claude Code

CLAUDE.md + .claude/settings.json

Google Gemini

GEMINI.md

Platform	File
GitHub Copilot	`.github/copilot-instructions.md` + `.github/prompts/`
Codex (OpenAI)	`AGENTS.md`
Claude Code	`CLAUDE.md` + `.claude/settings.json`
Google Gemini	`GEMINI.md`

1. Prerequisites¶

Requirement	Notes
Python environment	Use one venv per CE branch; install editable: `pip install -e .[dev]`
Agent platform	See platform-specific file for tool installation steps
VS Code (Copilot only)	v1.90+ with Copilot and Copilot Chat extensions

Quick dev environment setup (all platforms):

python -m venv .venv
source .venv/bin/activate       # or .venv\Scripts\activate on Windows
pip install --upgrade pip
pip install -e .[dev]

2. How CE agent context is structured¶

Every agent platform reads a two-layer set of files:

Layer 1 — Canonical (all agents)¶

CONTRIBUTOR_INSTRUCTIONS.md at the repo root contains:

CE-first policy (the mandatory pre-explain checklist)
Architecture and module boundary rules
Coding standards (type hints, Numpy docstrings, lazy imports)
Testing standards and fallback visibility policy
ADR/Standards reference map (all 26 active ADRs + 5 STDs with when-to-consult guidance)
Test-quality improvement method (8-agent team from docs/improvement/test-quality-method/)
Key files & directories, development workflow, TDD patterns

Layer 2 — Platform-specific¶

Each platform file adds only what is unique to that agent:

File	Platform	What it adds
`.github/copilot-instructions.md`	GitHub Copilot	Auto-injected instruction files, prompt slash commands, `@workspace` chat tips
`AGENTS.md`	Codex	Session priming prompt, task template, workspace sync routine, CE-first utility list
`CLAUDE.md`	Claude Code	Permissions model (`.claude/settings.json`), bash tool rules, tool use guidance
`GEMINI.md`	Google Gemini	Session priming prompt, context management, workspace sync

3. Session priming (all platforms)¶

At the start of any agent session, prime the agent with:

You are a CE-first agent for calibrated_explanations. Read CONTRIBUTOR_INSTRUCTIONS.md
and <platform file> first. Use WrapCalibratedExplainer and the public CE API directly.
Fail fast if CE-first invariants are not satisfied.

Replace <platform file> with AGENTS.md, CLAUDE.md, or GEMINI.md as appropriate. For GitHub Copilot, the instructions are injected automatically — no priming needed.

4. GitHub Copilot — VS Code workspace setup¶

Open the repository root in VS Code. Copilot features are provided by the Copilot/Copilot Chat extensions and GitHub’s instruction-file loading. This repository does not require custom VS Code Copilot keys in .vscode/settings.json.

Copilot completions for Python, Markdown, and YAML.
Next Edit Suggestions – Copilot proposes the next logical edit as you type.
Instruction files – all .github/instructions/*.instructions.md files are automatically loaded into Copilot’s context based on the file you are editing.
Workspace agent – Copilot can search your local files when answering questions.

Instruction files injected automatically by VS Code:

File	Scope	Purpose
`.github/copilot-instructions.md`	all files	Architecture, coding standards, TDD policy, fallback rules
`.github/instructions/source-code.instructions.md`	`src/*/.py`	Module layout, import rules, docstring style
`.github/instructions/tests.instructions.md`	`tests/*`, `test*`	Testing framework, naming, coverage gate
`.github/instructions/execution plan.instructions.md`	all files	Release plan, ADR conformance, changelog policy

Prompt slash commands available in Copilot Chat:

Command	Use when
`/generate-tests-strict`	Writing new tests for any CE module
`/implement-plugin`	Scaffolding a new calibrator, plot, or explanation plugin
`/fix-issue`	Diagnosing and fixing a bug or failing test
`/refresh-ce-context`	Updating instruction files after an API change or to incorporate feedback

5. Keeping agents up to date¶

After an API change¶

For GitHub Copilot, run in Chat:
```
/refresh-ce-context module=calibrated_explanations.core.calibrated_explainer
```
For other platforms, ask the agent to read CONTRIBUTOR_INSTRUCTIONS.md, compare it against the current src/calibrated_explanations/ source, and propose minimal diffs.
Review the proposed diffs; accept or adjust.
Commit the updated instruction files alongside the code change.

After an ADR is accepted or closed¶

Run /refresh-ce-context adr=ADR-NNN (Copilot) or ask the agent to update the ADR status row in CONTRIBUTOR_INSTRUCTIONS.md §9.

Workspace sync routine (Codex / Claude / Gemini)¶

source .venv/bin/activate
pip install -e .[dev]
python -m pip check
pytest -q

Then ask the agent to re-read CONTRIBUTOR_INSTRUCTIONS.md and diff src/calibrated_explanations/ for changed signatures.

6. Feedback loop (all platforms)¶

No agent platform retains memory across unrelated sessions. The only way to make feedback durable is to encode it in versioned repository files.

Quick feedback¶

For GitHub Copilot, run:

/refresh-ce-context feedback="Agent suggested importing matplotlib at module level – it must always be lazy"

This appends a dated entry to .github/copilot-feedback-log.md and adds a clarifying bullet to the relevant instruction file.

For other platforms, update CONTRIBUTOR_INSTRUCTIONS.md directly with a new bullet in the relevant section, then add a dated entry to .github/copilot-feedback-log.md manually.

Use this mandatory entry schema:

**Feedback:** what the agent got wrong
**Root cause:** why the miss happened
**Durable fix:** exact instruction/test/script updates
**Verification:** command(s) proving the fix
**Status:** open | ✅ incorporated

Structured feedback review¶

After each sprint or release:

Open .github/copilot-feedback-log.md and review accumulated entries.
Verify each correction is reflected in CONTRIBUTOR_INSTRUCTIONS.md or the platform instruction file.
Mark resolved entries ✅ and commit the changes so every team member benefits.

7. Common CE workflows (all platforms)¶

Explain a prediction end-to-end¶

Read CONTRIBUTOR_INSTRUCTIONS.md, then explain how WrapCalibratedExplainer.explain_factual
works from the public call down to the plugin registry dispatch.

Scaffold a new plugin¶

/implement-plugin plugin_type=calibrator plugin_name=isotonic_regression target_adr=ADR-013

(Copilot slash command) or give the equivalent instruction to any other agent.

Run a test-quality improvement cycle¶

Read docs/improvement/test-quality-method/README.md, then act as the test-creator
agent and produce a prioritized coverage-gap analysis.

Check release readiness¶

Read docs/improvement/RELEASE_PLAN_v1.md and list all items still open for the
current milestone.

Debug a failing test¶

/fix-issue failing_test=tests/unit/core/test_explainer.py::test_calibration_state

8. Tips for best results (all platforms)¶

Keep instruction files short and precise. A densely written instruction is more useful than a long one.
Commit instruction-file updates in the same PR as the code change so history stays in sync.
Always validate with make test (or pytest -q) after any code change.
Pin ADR references in code comments when making architectural decisions so future agent sessions understand why a pattern is used.
Use the test-quality-method agents (test-creator, pruner, etc.) for large test changes rather than writing coverage-padding tests by hand.

AI Agent Setup Guide for calibrated_explanations¶