New Multi-AI research loop — researched, judged by an independent model, refined until it passes

Describe what you need.
Get an expert that knows it cold.

Chiron’s Forge runs your request through multiple AI research engines, synthesizes the findings, and has an independent model judge the result — refining until it passes. You get a finished skill file, research report, or Cursor ruleset. Not a draft. A deliverable.

Your first build is free. No card. No subscription — you pay once, per build.

chironsforge.com/build/churn-sequence · Building

Deep research · 3 models

Claude · done

GPT · done

Gemini · 82%

Judge · weighted scoring

Specificity · 9.4

Grounding · 9.1

Tone match · 8.6

Coverage · 9.2

Winner: Claude · refined ×2 · 9.1

Skill file · ready

# churn-sequence.skill.md
audience: "B2B SaaS, 60-day inactive"
voice: "direct, no hype"
steps: 5
proof_points: 7
guardrails: 4

# 1.8k tokens · versioned

Download skill

Outputs

One engine. Three things it can build.

Every build starts the same way — you describe what you need, and the research pipeline goes to work. What changes is the format you get at the end. Pick the one your team will actually use.

Skill files

A SKILL.md your agents load at runtime.

A single, procedural, action-oriented skill file — the kind you drop into Claude, an agent framework, or any tool that reads SKILL.md. It’s scored by an independent judge before you get it, so it behaves like domain knowledge, not a guess.

SKILL.md · judge-scored · plain Markdown you own and export

Research reports

A cited brief you can hand to a decision-maker.

A focused 4–6 page report: executive summary, findings by theme, recommendations, and a source list with real citations. For medical and scientific work, reports can be evidence-graded — drawing on PubMed, Semantic Scholar, and OpenAlex, with every claim tied to a DOI and the grade verified before delivery.

.zip (report.md + sources.json) · citations included · evidence grading available

Cursor rules

A ruleset that matches how your codebase actually works.

A .cursor/rules bundle that drops straight into your project. The free version is built from the stack you declare. The paid version reads your actual repository — your naming, your structure, your patterns — and writes rules no template could reproduce.

.cursor/rules bundle · 4–6 .mdc files · repo-aware on Signature

The difference

A second AI checks the work before you see it.

Most AI tools hand you the first thing the model produced. Chiron’s Forge doesn’t. Here’s what runs before anything reaches you.

An independent judge scores every build.

One model writes. A different model — never the same one, never Anthropic — scores the result against a rubric. The build keeps refining until it clears the bar, and the best-scoring version is the one you get. Not the last one. The best one.
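
The refine-and-keep-the-best behavior described above can be pictured as a short loop. This is only a sketch under assumptions, not the product's actual implementation: `build_with_judge`, `toy_judge`, `toy_refine`, and the round limit are hypothetical names and values, and the 75-point bar is borrowed from the Deep tier description on this page.

```python
from typing import Callable, Tuple


def build_with_judge(
    draft: str,
    judge: Callable[[str], float],
    refine: Callable[[str], str],
    bar: float = 75.0,          # assumed passing score
    max_refinements: int = 2,   # assumed refinement budget
) -> Tuple[str, float]:
    """Refine until the judge's score clears the bar; return the best version seen."""
    best, best_score = draft, judge(draft)
    current, score = draft, best_score
    for _ in range(max_refinements):
        if score >= bar:
            break  # cleared the bar, stop refining
        current = refine(current)
        score = judge(current)
        if score > best_score:
            best, best_score = current, score  # keep the best, not the latest
    return best, best_score


# Toy stand-ins (hypothetical): fixed scores and a refine step that bumps the version.
TOY_SCORES = {"v0": 60.0, "v1": 72.0, "v2": 68.0}


def toy_judge(text: str) -> float:
    return TOY_SCORES[text]


def toy_refine(text: str) -> str:
    return f"v{int(text[1:]) + 1}"


result = build_with_judge("v0", toy_judge, toy_refine)
print(result)
```

In this toy run the second refinement scores lower than the first, so the loop returns the first refinement rather than the last one.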

Multiple research engines, run in parallel.

Depending on your tier, Gemini, OpenAI, and DeepSeek research your topic at the same time. Their findings are synthesized and gap-checked — so the output isn’t one model’s blind spots repeated, it’s several independent passes reconciled.
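
As a rough illustration of research passes running side by side and then being reconciled, here is a minimal sketch. Everything in it is assumed for the example: the engine functions are toy stand-ins rather than real model calls, and `research_in_parallel` and `synthesize` are hypothetical names. The merge step shown is simple de-duplication, which is far cruder than real synthesis.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Engine = Callable[[str], List[str]]


def research_in_parallel(topic: str, engines: Dict[str, Engine]) -> Dict[str, List[str]]:
    """Run every engine on the same topic at the same time."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {name: pool.submit(fn, topic) for name, fn in engines.items()}
        return {name: fut.result() for name, fut in futures.items()}


def synthesize(findings: Dict[str, List[str]]) -> List[str]:
    """Merge findings, dropping duplicates while keeping first-seen order."""
    merged: List[str] = []
    seen = set()
    for items in findings.values():
        for item in items:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


# Toy engines (hypothetical): each returns a couple of canned findings.
def engine_a(topic: str) -> List[str]:
    return [f"{topic}: pricing", f"{topic}: timing"]


def engine_b(topic: str) -> List[str]:
    return [f"{topic}: timing", f"{topic}: tone"]


merged = synthesize(research_in_parallel("churn", {"a": engine_a, "b": engine_b}))
print(merged)
```

The overlap between the two toy engines appears once in the merged list, which is the point of reconciling several passes instead of trusting one.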

Gaps get found and filled, not ignored.

Between research passes, the pipeline checks what’s missing and goes back for it — up to four times on the deepest tier. A weak answer triggers more research, not a shrug.
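
The gap check can be sketched as a bounded loop: compare what is covered against what is required, and go back for the missing parts until nothing is missing or the pass budget runs out. This is an assumption-laden sketch: `fill_gaps` and `toy_research_more` are hypothetical, topics are modeled as plain strings, and the limit of four follow-up passes mirrors the "up to four times" figure above.

```python
from typing import Callable, Set, Tuple


def fill_gaps(
    required: Set[str],
    covered: Set[str],
    research_more: Callable[[Set[str]], Set[str]],
    max_gap_passes: int = 4,  # assumed budget, per the deepest tier
) -> Tuple[Set[str], int]:
    """Repeat research on missing topics until covered or the budget is spent."""
    covered = set(covered)
    passes = 0
    while passes < max_gap_passes:
        missing = required - covered
        if not missing:
            break  # nothing left to fill
        covered |= research_more(missing)
        passes += 1
    return covered, passes


# Toy follow-up research (hypothetical): resolves one missing topic per pass.
def toy_research_more(missing: Set[str]) -> Set[str]:
    return {sorted(missing)[0]}


required = {"pricing", "timing", "tone", "proof"}
covered, passes = fill_gaps(required, {"pricing"}, toy_research_more)
print(covered == required, passes)
```

With one topic already covered and one resolved per pass, the toy run needs three follow-up passes and stops before reaching the limit.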

Your data is grounded, then deleted.

Upload CSV, Excel, or PDF files to ground a build in your own material. Personal information is redacted before analysis, and the files are deleted when the build ends. Nothing is kept.

How it works

Five steps. One finished deliverable.

1. Describe: Chat with Claude to scope exactly what you need.
2. Upload: Optionally attach your own data to ground the build.
3. Research: Multiple AI engines research your topic in parallel.
4. Judge: An independent model scores the result and sends it back for refinement until it passes.
5. Download: Get your skill file, report, or ruleset, finished rather than a first draft.

Pricing

Pay once, per build. No subscription.

You’re not renting access. Each build is a one-time purchase, and your first one is free. Pick your output, pick your depth.

Quick

Free · first build

Your first build, no card. One research pass, synthesized, SKILL.md download.

Deep

$19 per build

Two research engines, gap-checked, judge-scored (75/100 bar), one refinement pass.

Pro

$39 per build

Deep research mode, three passes, two refinement passes, data upload.

All prices in USD · one-time, per build · your first build is free.

Your first build is free.

Describe what you need. See what comes back. Decide from there.

No card required. No subscription. One free build to see the quality for yourself.
