Run

Run a single experiment iteration. Edit the target file, evaluate, keep or discard.

$ npx promptcreek add run

Auto-detects your installed agents and installs the skill to each one.

What This Skill Does

The /ar:run skill executes a single iteration of an experiment. It guides the user through reviewing the experiment's history, deciding on a change, editing the target file, committing, and evaluating the result, giving experimentation and optimization a structured, repeatable shape.

When to Use

  • Running a single iteration of an experiment.
  • Choosing an experiment from a list.
  • Reviewing experiment history.
  • Deciding what to try next based on previous results.

Key Features

  • Resolves the experiment to run.
  • Loads the experiment's context.
  • Guides the user in deciding what to try.
  • Reports the result of the experiment.

Installation

Run in your project directory:
$ npx promptcreek add run

Auto-detects your installed agents (Claude Code, Cursor, Codex, etc.) and installs the skill to each one.

View Full Skill Content

/ar:run — Single Experiment Iteration

Run exactly ONE experiment iteration: review history, decide a change, edit, commit, evaluate.

Usage

/ar:run engineering/api-speed   # Run one iteration

/ar:run                         # List experiments, let user pick

What It Does

Step 1: Resolve experiment

If no experiment specified, run python {skill_path}/scripts/setup_experiment.py --list and ask the user to pick.
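A minimal sketch of what the `--list` mode might do, assuming the `.autoresearch/{domain}/{name}` layout described below (the real setup_experiment.py may differ):

```python
# Hypothetical listing logic: walk .autoresearch/ and report each
# {domain}/{name} directory that contains a config.cfg.
from pathlib import Path


def list_experiments(root=".autoresearch"):
    found = []
    for cfg in Path(root).glob("*/*/config.cfg"):
        # cfg.parts[-3:] == (domain, name, "config.cfg")
        domain, name = cfg.parts[-3], cfg.parts[-2]
        found.append(f"{domain}/{name}")
    return sorted(found)


if __name__ == "__main__":
    for exp in list_experiments():
        print(exp)
```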

Step 2: Load context

# Read experiment config
cat .autoresearch/{domain}/{name}/config.cfg

# Read strategy and constraints
cat .autoresearch/{domain}/{name}/program.md

# Read experiment history
cat .autoresearch/{domain}/{name}/results.tsv

# Checkout the experiment branch
git checkout autoresearch/{domain}/{name}

Step 3: Decide what to try

Review results.tsv:

  • What changes were kept? What pattern do they share?
  • What was discarded? Avoid repeating those approaches.
  • What crashed? Understand why.
  • How many runs so far? (Escalate strategy accordingly)
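The review above can be partly automated. A sketch, assuming each results.tsv row carries its verdict (KEEP/DISCARD/CRASH) in the second column — the actual column layout may differ:

```python
import csv
from collections import Counter


def summarize_history(path):
    # Tally verdicts from a tab-separated history file.
    # Assumes column 2 holds KEEP / DISCARD / CRASH.
    verdicts = Counter()
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) > 1:
                verdicts[row[1]] += 1
    return verdicts
```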

Strategy escalation:

  • Runs 1-5: Low-hanging fruit (obvious improvements)
  • Runs 6-15: Systematic exploration (vary one parameter)
  • Runs 16-30: Structural changes (algorithm swaps)
  • Runs 30+: Radical experiments (completely different approaches)
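The escalation schedule above maps directly to a run count. A small sketch (function name is illustrative, not from the skill's scripts):

```python
def strategy_tier(runs_so_far):
    # Mirrors the escalation schedule: 1-5, 6-15, 16-30, 30+.
    if runs_so_far <= 5:
        return "low-hanging fruit"
    if runs_so_far <= 15:
        return "systematic exploration"
    if runs_so_far <= 30:
        return "structural changes"
    return "radical experiments"
```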

Step 4: Make ONE change

Edit only the target file specified in config.cfg. Change one thing. Keep it simple.

Step 5: Commit and evaluate

git add {target}

git commit -m "experiment: {short description of what changed}"

python {skill_path}/scripts/run_experiment.py \
  --experiment {domain}/{name} --single

Step 6: Report result

Read the script output. Tell the user:

  • KEEP: "Improvement! {metric}: {value} ({delta} from previous best)"
  • DISCARD: "No improvement. {metric}: {value} vs best {best}. Reverted."
  • CRASH: "Evaluation failed: {reason}. Reverted."
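The three report templates above could be rendered with a helper like this (a sketch — the helper and its parameters are hypothetical, not part of the skill's scripts):

```python
def format_report(verdict, metric, value, best=None, delta=None, reason=None):
    # Map the script's verdict to one of the three user-facing messages.
    if verdict == "KEEP":
        return f"Improvement! {metric}: {value} ({delta} from previous best)"
    if verdict == "DISCARD":
        return f"No improvement. {metric}: {value} vs best {best}. Reverted."
    return f"Evaluation failed: {reason}. Reverted."
```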

Step 7: Self-improvement check

After every 10th experiment (check results.tsv line count), update the Strategy section of program.md with patterns learned.
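The milestone check could look like this, assuming one results.tsv line per completed run (function name is illustrative):

```python
def due_for_strategy_update(results_path):
    # True on every 10th completed run, judged by non-empty line count.
    with open(results_path) as f:
        runs = sum(1 for line in f if line.strip())
    return runs > 0 and runs % 10 == 0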

Rules

  • ONE change per iteration. Don't change 5 things at once.
  • NEVER modify the evaluator (evaluate.py). It's ground truth.
  • Simplicity wins. Equal performance with simpler code is an improvement.
  • No new dependencies.

Supported Agents

Claude Code, Cursor, Codex, Gemini CLI, Aider, Windsurf, OpenClaw

Details

  • License: MIT
  • Source: seeded
  • Published: 3/17/2026