Setup

Name: Setup
Author: Alireza Rezvani

Set up a new autoresearch experiment interactively. Collects the domain, experiment name, target file, eval command, metric, and direction, plus an optional evaluator and storage scope.


What This Skill Does

This skill sets up a new autoresearch experiment by collecting the configuration it needs. It supports two modes: interactive mode, which prompts for each parameter in turn, and direct mode, which takes the parameters as command-line arguments. It is aimed at engineers and data scientists who want to automate and track experiments.

When to Use

Start a new engineering experiment.
Optimize a specific target file.
Compare different evaluation metrics.
List existing experiments.
Show available evaluators.
Quickly configure an experiment.

Key Features

Supports interactive and direct setup modes.

Validates the target file's existence.

Offers built-in evaluators for common metrics.

Allows specifying the scope of the experiment.

Provides a listing of existing experiments.

Shows available evaluators.

Installation

Run in your project directory:

$ npx promptcreek add setup

Auto-detects your installed agents (Claude Code, Cursor, Codex, etc.) and installs the skill to each one.


/ar:setup — Create New Experiment

Set up a new autoresearch experiment with all required configuration.

Usage

/ar:setup                    # Interactive mode
/ar:setup engineering api-speed src/api.py "pytest bench.py" p50_ms lower
/ar:setup --list             # Show existing experiments
/ar:setup --list-evaluators  # Show available evaluators
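The invocation forms above can be told apart with a small parser. This is a minimal sketch, not the skill's actual implementation: the function name `parse_setup_args`, the `FIELDS` tuple, and the returned `mode` values are hypothetical, and the positional order is inferred from the direct-mode example in Usage.

```python
import shlex
from typing import Dict

# Positional order inferred from the direct-mode usage example (an assumption).
FIELDS = ("domain", "name", "target", "eval_cmd", "metric", "direction")


def parse_setup_args(arg_string: str) -> Dict[str, str]:
    """Classify the text that follows /ar:setup (hypothetical helper)."""
    # shlex.split keeps a quoted eval command such as "pytest bench.py" as one token.
    tokens = shlex.split(arg_string)
    if not tokens:
        return {"mode": "interactive"}
    if tokens == ["--list"]:
        return {"mode": "list"}
    if tokens == ["--list-evaluators"]:
        return {"mode": "list-evaluators"}
    if len(tokens) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} positional arguments, got {len(tokens)}"
        )
    parsed = dict(zip(FIELDS, tokens))
    parsed["mode"] = "direct"
    return parsed


example = parse_setup_args(
    'engineering api-speed src/api.py "pytest bench.py" p50_ms lower'
)
print(example["mode"], example["eval_cmd"])
```

Splitting with `shlex` rather than `str.split` matters here because the eval command usually contains spaces.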

What It Does

If arguments provided

Pass them directly to the setup script:

python {skill_path}/scripts/setup_experiment.py \
  --domain {domain} --name {name} \
  --target {target} --eval "{eval_cmd}" \
  --metric {metric} --direction {direction} \
  [--evaluator {evaluator}] [--scope {scope}]
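As a rough illustration, the command above could be assembled as an argument list before being handed to `subprocess.run`. The flag names come from the command shown above. The helper name `build_setup_command`, the example skill path `/path/to/skill`, and the check that `direction` is `lower` or `higher` are assumptions made for this sketch.

```python
import shlex
from typing import List, Optional


def build_setup_command(
    skill_path: str,
    domain: str,
    name: str,
    target: str,
    eval_cmd: str,
    metric: str,
    direction: str,
    evaluator: Optional[str] = None,
    scope: Optional[str] = None,
) -> List[str]:
    """Build the argv list for setup_experiment.py (hypothetical helper)."""
    # Assumption: the script accepts only these two direction values.
    if direction not in ("lower", "higher"):
        raise ValueError("direction must be 'lower' or 'higher'")
    cmd = [
        "python", f"{skill_path}/scripts/setup_experiment.py",
        "--domain", domain, "--name", name,
        "--target", target, "--eval", eval_cmd,
        "--metric", metric, "--direction", direction,
    ]
    # Optional flags are appended only when a value was supplied.
    if evaluator:
        cmd += ["--evaluator", evaluator]
    if scope:
        cmd += ["--scope", scope]
    return cmd


cmd = build_setup_command(
    "/path/to/skill",  # placeholder for {skill_path}
    "engineering", "api-speed", "src/api.py",
    "pytest bench.py", "p50_ms", "lower",
)
# shlex.join renders the list as a copy-pasteable shell line.
print(shlex.join(cmd))
```

Keeping the command as a list means the eval command stays a single argument, so no manual quoting is needed when it is passed to `subprocess.run(cmd)`.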

If no arguments (interactive mode)

Collect each parameter one at a time:

Domain — Ask: "What domain? (engineering, marketing, content, prompts, custom)"
Name — Ask: "Experiment name? (e.g., api-speed, blog-titles)"
Target file — Ask: "Which file to optimize?" Verify it exists.
Eval command — Ask: "How to measure it? (e.g., pytest bench.py, python evaluate.py)"
Metric — Ask: "What metric does the eval output? (e.g., p50_ms, ctr_score)"
Direction — Ask: "Is lower or higher better?"
Evaluator (optional) — Show built-in evaluators. Ask: "Use a built-in evaluator, or your own?"
Scope — Ask: "Store in project (.autoresearch/) or user (~/.autoresearch/)?"

Then run setup_experiment.py with the collected parameters.
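The interactive flow above might look roughly like the sketch below. The prompts are taken from the list above. Everything else is an assumption for illustration: the helper names, re-asking until an answer is valid, restricting the domain to the five listed values, and the `project`/`user` scope answers. The `ask` parameter defaults to `input` and can be swapped for a scripted function when testing.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

DOMAINS = ("engineering", "marketing", "content", "prompts", "custom")


def ask_until_valid(
    ask: Callable[[str], str], prompt: str, is_valid: Callable[[str], bool]
) -> str:
    """Repeat a question until the answer passes the check."""
    while True:
        answer = ask(prompt).strip()
        if is_valid(answer):
            return answer


def collect_parameters(
    ask: Callable[[str], str] = input,
) -> Dict[str, Optional[str]]:
    """Collect each experiment parameter one at a time (hypothetical helper)."""
    non_empty = bool  # any non-empty string is accepted
    params: Dict[str, Optional[str]] = {}
    params["domain"] = ask_until_valid(
        ask, f"What domain? ({', '.join(DOMAINS)}) ", lambda a: a in DOMAINS
    )
    params["name"] = ask_until_valid(ask, "Experiment name? ", non_empty)
    # The target file must exist before setup continues.
    params["target"] = ask_until_valid(
        ask, "Which file to optimize? ", lambda a: bool(a) and Path(a).is_file()
    )
    params["eval_cmd"] = ask_until_valid(ask, "How to measure it? ", non_empty)
    params["metric"] = ask_until_valid(
        ask, "What metric does the eval output? ", non_empty
    )
    params["direction"] = ask_until_valid(
        ask, "Is lower or higher better? ", lambda a: a in ("lower", "higher")
    )
    # The evaluator is optional: an empty answer means "use my own eval command".
    evaluator = ask("Built-in evaluator name (blank for your own)? ").strip()
    params["evaluator"] = evaluator or None
    params["scope"] = ask_until_valid(
        ask, "Store in project or user? ", lambda a: a in ("project", "user")
    )
    return params

# Usage: params = collect_parameters()  # prompts on the terminal
```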

Listing

# Show existing experiments
python {skill_path}/scripts/setup_experiment.py --list

# Show available evaluators
python {skill_path}/scripts/setup_experiment.py --list-evaluators

Built-in Evaluators

| Name | Metric | Use Case |
|------|--------|----------|
| benchmark_speed | p50_ms (lower) | Function/API execution time |
| benchmark_size | size_bytes (lower) | File, bundle, Docker image size |
| test_pass_rate | pass_rate (higher) | Test suite pass percentage |
| build_speed | build_seconds (lower) | Build/compile/Docker build time |
| memory_usage | peak_mb (lower) | Peak memory during execution |
| llm_judge_content | ctr_score (higher) | Headlines, titles, descriptions |
| llm_judge_prompt | quality_score (higher) | System prompts, agent instructions |
| llm_judge_copy | engagement_score (higher) | Social posts, ad copy, emails |
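To make the "lower" and "higher" annotations in the table concrete, the sketch below encodes each evaluator's metric and direction as data and compares a candidate result against a baseline. The names and directions are copied from the table. The `BUILTIN_EVALUATORS` dictionary and the `is_improvement` helper are hypothetical and do not describe how the autoresearch scripts themselves compare results.

```python
from typing import Dict, Tuple

# evaluator name -> (metric, direction), transcribed from the table above
BUILTIN_EVALUATORS: Dict[str, Tuple[str, str]] = {
    "benchmark_speed": ("p50_ms", "lower"),
    "benchmark_size": ("size_bytes", "lower"),
    "test_pass_rate": ("pass_rate", "higher"),
    "build_speed": ("build_seconds", "lower"),
    "memory_usage": ("peak_mb", "lower"),
    "llm_judge_content": ("ctr_score", "higher"),
    "llm_judge_prompt": ("quality_score", "higher"),
    "llm_judge_copy": ("engagement_score", "higher"),
}


def is_improvement(evaluator: str, baseline: float, candidate: float) -> bool:
    """Return True if candidate beats baseline in the evaluator's direction."""
    _metric, direction = BUILTIN_EVALUATORS[evaluator]
    if direction == "lower":
        return candidate < baseline
    return candidate > baseline


# A p50 latency that drops from 120 ms to 95 ms counts as an improvement.
print(is_improvement("benchmark_speed", baseline=120.0, candidate=95.0))
```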

After Setup

Report to the user:

Experiment path and branch name
Whether the eval command worked and the baseline metric
Suggest: "Run /ar:run {domain}/{name} to start iterating, or /ar:loop {domain}/{name} for autonomous mode."
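One way to assemble that report is sketched below. The suggestion line is quoted from the list above. The function name `format_setup_report`, the convention that a `None` baseline means the eval command failed, and the example path and branch values are assumptions, since the actual experiment layout and branch naming are not specified here.

```python
from typing import Optional


def format_setup_report(
    domain: str,
    name: str,
    experiment_path: str,
    branch: str,
    metric: str,
    baseline: Optional[float],
) -> str:
    """Build the post-setup summary shown to the user (hypothetical helper)."""
    lines = [
        f"Experiment path: {experiment_path}",
        f"Branch: {branch}",
    ]
    # Assumption: a baseline of None signals that the eval command did not work.
    if baseline is None:
        lines.append("Eval command: failed, no baseline recorded")
    else:
        lines.append(f"Eval command: OK, baseline {metric} = {baseline}")
    lines.append(
        f"Run /ar:run {domain}/{name} to start iterating, "
        f"or /ar:loop {domain}/{name} for autonomous mode."
    )
    return "\n".join(lines)


print(
    format_setup_report(
        "engineering",
        "api-speed",
        ".autoresearch/engineering/api-speed",  # illustrative path, not confirmed
        "autoresearch/api-speed",  # illustrative branch name, not confirmed
        "p50_ms",
        42.0,
    )
)
```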


Supported Agents

Claude Code, Cursor, Codex, Gemini CLI, Aider, Windsurf, OpenClaw

Attribution

Alireza Rezvani

alirezarezvani/claude-skills


Details

License: MIT
Source: seeded
Published: 3/17/2026

