
From idea to live product in 5 minutes: a real Moonshift run

A minute-by-minute teardown of an actual Moonshift run: ten phases, fourteen agents, one deployed URL, one owned repo, and a launch kit parked for approval. No hand-waving - the agent names, the artifacts they write, and the handoffs that make it work.

Apr 19, 2026·9 min read·teardown · pipeline · agents

The marketing line is "type your idea, wake up launched." This post is the thing underneath the line. If you've ever wondered what actually happens between you pressing enter on a prompt and a deployed URL showing up in your dashboard, here's the whole pipeline, agent by agent, artifact by artifact.

For this teardown we're using a real run from our fixtures: a small weather-cards app whose prompt was literally "a simple app that shows today's weather for the user's current location in beautiful cards." End-to-end: 5m 18s. Cost: $3.18. Agents invoked: 14. Artifacts written: 27. Let's walk it.

The shape: ten phases, fourteen agents

Moonshift runs a directed acyclic graph of ten phases. Some phases are single agents; some run agents in parallel. The contract between phases is strict - every agent output is validated against a Zod schema by a dedicated contract-validator agent before downstream consumers are allowed to read it. The full phase order:

Phase	Agents	Parallel?	Typical time
1	planner	no	~38s
2	db	no	~21s
3	backend	no	~55s
4	frontend, tests	yes	~70s
5	contract-validator (retry: fixer, max 3)	no	~8s
6	deployer, marketer	yes	~90s
7	auditor, auditor-security (retry: audit-fixer, max 1)	yes	~30s
8	image-gen	no	~22s
9	publisher	no	~4s (drafts only)

Total wall clock is lower than the sum of the per-agent times because of parallelism - the frontend-plus-tests pair, the deployer-plus-marketer pair, and the two auditors overlap aggressively. The marketer agent, in particular, doesn't wait for the deploy to finish: it resolves the predicted Vercel URL from slug.json and writes its copy against that URL in parallel with the deploy itself. On this run, that saved 24 seconds.
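The phase table reads naturally as plain data. Here is a minimal sketch of that idea, assuming a hypothetical Phase shape and a caller-supplied runAgent callback (neither is Moonshift's actual API): phases run strictly in order, and the agents inside a parallel phase start together.

```typescript
// Hypothetical shape for the phase table; not Moonshift's real types.
interface Phase {
  id: number;
  agents: string[];        // agents that always run in this phase
  retryAgents?: string[];  // agents dispatched only when the phase fails
  parallel: boolean;
}

const PHASES: Phase[] = [
  { id: 1, agents: ["planner"], parallel: false },
  { id: 2, agents: ["db"], parallel: false },
  { id: 3, agents: ["backend"], parallel: false },
  { id: 4, agents: ["frontend", "tests"], parallel: true },
  { id: 5, agents: ["contract-validator"], retryAgents: ["fixer"], parallel: false },
  { id: 6, agents: ["deployer", "marketer"], parallel: true },
  { id: 7, agents: ["auditor", "auditor-security"], retryAgents: ["audit-fixer"], parallel: true },
  { id: 8, agents: ["image-gen"], parallel: false },
  { id: 9, agents: ["publisher"], parallel: false },
];

// Every distinct agent name in the table, retry agents included.
function allAgents(phases: Phase[]): string[] {
  const names: string[] = [];
  for (const phase of phases) {
    for (const name of [...phase.agents, ...(phase.retryAgents ?? [])]) {
      if (names.indexOf(name) === -1) names.push(name);
    }
  }
  return names;
}

// Run phases in order; inside a parallel phase, start all agents at once
// and wait for the slowest one before moving on.
async function runPipeline(
  phases: Phase[],
  runAgent: (name: string) => Promise<void>, // assumed callback
): Promise<void> {
  for (const phase of phases) {
    if (phase.parallel) {
      await Promise.all(phase.agents.map((name) => runAgent(name)));
    } else {
      for (const name of phase.agents) await runAgent(name);
    }
  }
}
```

Counting the names returned by allAgents(PHASES) gives the fourteen agents from the headline, with fixer and audit-fixer included even though they only run on a retry.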

Phase 1 - planner (0:00 → 0:38)

The planner reads your prompt and writes a spec.json that every downstream agent consumes. For the weather run, the spec contained:

product_name: 'Today Weather Cards'
target_user: 'people who want a fast, visually polished weather check before heading out'
success_metric: 'user can publish a weather snapshot in under 3 seconds after granting location'
pages: ['/'], with a single-card-stack layout driven by geolocation
APIs: Open-Meteo (no key required - the planner actively picks unauthenticated sources when possible to skip env-var setup)
data model: no DB needed; planner flagged `db_required: false`

That last point is load-bearing. The planner's job isn't just to dream up features; it's to tell every downstream agent what work they can skip. Phase 2 (db) saw db_required: false in the spec and wrote an empty migration plan in 0.4 seconds instead of the usual 21. Shaves cost.
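To make the skip concrete, here is a sketch of how a db step might short-circuit on the planner's flag. The Spec keys follow the fields quoted above, except apis, which is an assumed key name; MigrationPlan and planMigrations are hypothetical names for illustration.

```typescript
// Hypothetical shape for spec.json. `apis` is an assumed key name;
// the other keys are the ones quoted from the weather run.
interface Spec {
  product_name: string;
  target_user: string;
  success_metric: string;
  pages: string[];
  apis: string[];
  db_required: boolean;
}

// Hypothetical output of the db phase.
interface MigrationPlan {
  statements: string[]; // empty when there is nothing to migrate
}

// Sketch of the early exit: the expensive generator is never called
// when the planner has flagged that no database is needed.
function planMigrations(
  spec: Pick<Spec, "db_required">,
  generate: (spec: Pick<Spec, "db_required">) => MigrationPlan,
): MigrationPlan {
  if (!spec.db_required) {
    return { statements: [] }; // no-op path: empty plan, no model call
  }
  return generate(spec); // full path, only when persistence is required
}
```

The design point is that the skip decision lives in the spec, not in the db agent's own judgment, so every downstream consumer sees the same answer.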

Phase 2 - db (0:38 → 0:39)

For runs that need persistence, the db agent writes Drizzle schema files and the Turso migration plan. For this one, it no-opped and wrote an empty SCHEMA.json. The db agent is also the only one allowed to issue CREATE TABLE statements; this separation is enforced by the contract-validator and prevents a later agent from silently mutating the schema.

Phase 3 - backend (0:39 → 1:34)

Backend writes API route handlers under app/api/**. For this app it wrote a single GET /api/weather?lat=..&lon=.. that proxies Open-Meteo and normalizes the response into seven card fields. The backend agent is also responsible for writing its own minimal types and exporting them for the frontend to consume - this is why the contract-validator at phase 5 exists.
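The generated handler isn't reproduced here, but the input side of a route like GET /api/weather?lat=..&lon=.. could look roughly like this. The function names and error messages are illustrative assumptions, and the current=temperature_2m variable is just one example field; the post doesn't name the seven card fields.

```typescript
type Coords = { lat: number; lon: number };
type ParseResult =
  | { ok: true; coords: Coords }
  | { ok: false; error: string };

// Validate the lat/lon query parameters before proxying anything upstream.
function parseCoords(query: { lat?: string; lon?: string }): ParseResult {
  if (!query.lat?.trim() || !query.lon?.trim()) {
    return { ok: false, error: "lat and lon are required" };
  }
  const lat = Number(query.lat);
  const lon = Number(query.lon);
  if (!isFinite(lat) || lat < -90 || lat > 90) {
    return { ok: false, error: "lat must be a number between -90 and 90" };
  }
  if (!isFinite(lon) || lon < -180 || lon > 180) {
    return { ok: false, error: "lon must be a number between -180 and 180" };
  }
  return { ok: true, coords: { lat, lon } };
}

// Build the upstream Open-Meteo forecast URL. No API key is involved.
// The requested variable is an assumption made for this sketch.
function openMeteoUrl(coords: Coords): string {
  return (
    "https://api.open-meteo.com/v1/forecast" +
    `?latitude=${coords.lat}&longitude=${coords.lon}&current=temperature_2m`
  );
}
```

Rejecting bad coordinates at the edge keeps the proxy from forwarding junk upstream, and gives the Playwright specs a predictable error state to test against.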

Phase 4 - frontend + tests, in parallel (1:34 → 2:44)

The fun phase. Two agents run concurrently: frontend writes the UI, tests writes Playwright specs that prove the UI actually works against the backend. Running them in parallel is a bet that the frontend and tests can be written from the same spec contract without stepping on each other - which they can, because both consume spec.json and the backend's exported types, not each other.

On this run, frontend produced app/page.tsx, components/WeatherCard.tsx, and a short Tailwind layout. Tests produced three Playwright specs: a happy-path render, a permission-denied geolocation fallback, and a network-error state. All three passed on first attempt.

Phase 5 - contract-validator (2:44 → 2:52)

This is the phase that makes the rest of the pipeline boring in a good way. The contract-validator agent loads every upstream artifact and validates it against its Zod schema. Any mismatch between what the backend exported and what the frontend imported shows up here, not in production.

If validation fails, it dispatches the fixer agent with a targeted diff prompt, up to three retries. On this run, zero retries were needed. Across runs, about one in five needs a single retry, usually because the frontend tried to destructure a field the backend hadn't exported yet.
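The validate-then-fix loop can be sketched in a few lines. This is a dependency-free stand-in: the real pipeline validates with Zod schemas, while missingFields below is a hand-rolled check, and validateWithFixer plus its callbacks are hypothetical names.

```typescript
// Stand-in for the schema check: which fields does the frontend import
// that the backend does not export?
function missingFields(exported: string[], imported: string[]): string[] {
  return imported.filter((name) => exported.indexOf(name) === -1);
}

interface ValidationOutcome {
  passed: boolean;
  retriesUsed: number;
}

// Validate once; on failure, hand the problems to a fixer and validate
// again, up to `maxRetries` times.
function validateWithFixer(
  validate: () => string[],           // returns problems; empty means valid
  fix: (problems: string[]) => void,  // assumed hook standing in for the fixer agent
  maxRetries: number = 3,
): ValidationOutcome {
  let problems = validate();
  let retriesUsed = 0;
  while (problems.length > 0 && retriesUsed < maxRetries) {
    fix(problems);
    retriesUsed += 1;
    problems = validate();
  }
  return { passed: problems.length === 0, retriesUsed };
}
```

A clean run returns retriesUsed of 0, which matches this teardown. The common failure described above, a destructured field the backend hasn't exported, would surface as a single entry from missingFields and typically clear after one fixer pass.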

Phase 6 - deployer + marketer, in parallel (2:52 → 4:22)

Deployer pushes the built app to your Vercel account under a slug read from slug.json. For this run, the slug was today-weather-cards and the URL that came back was https://today-weather-cards.vercel.app.

Marketer runs in parallel. It doesn't wait for deploy because both agents already know the slug. Marketer writes:

COPY.json - X tweet, X thread variant, LinkedIn post, hashtags split per platform.
A product one-liner that will be reused by image-gen in the next phase as part of the hero prompt.

The parallelism matters. On this run, the deploy took 88s and the marketer took 24s. Sequentially that's 112s; in parallel it's 88s. Multiply by every run ever and it's a real amount of lunch.
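The arithmetic, and the reason the marketer can start early, fit in a short sketch. predictedUrl mirrors the URL this run produced; treating that pattern as a general rule is an assumption, and the helper names are hypothetical.

```typescript
// Both agents derive the URL from the slug, so neither waits on the other.
function predictedUrl(slug: string): string {
  return `https://${slug}.vercel.app`;
}

// Wall-clock seconds when agents run one after another.
function sequentialSeconds(durations: number[]): number {
  return durations.reduce((sum, d) => sum + d, 0);
}

// Wall-clock seconds when agents run together: the slowest one decides.
function parallelSeconds(durations: number[]): number {
  return Math.max(0, ...durations);
}

const phase6Durations = [88, 24]; // deployer, marketer on this run
const savedSeconds =
  sequentialSeconds(phase6Durations) - parallelSeconds(phase6Durations); // 112 - 88 = 24
```

The saving is always the sum minus the maximum, so it is capped by the shorter agent's runtime: here, the marketer's 24 seconds.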

Phase 7 - auditor + auditor-security (4:22 → 4:52)

Two agents in parallel, each writing its own report. Auditor grades quality on a 0–100 scale and enumerates issues by severity. Auditor-security grades risk separately and flags anything that would embarrass you in a security scan.

For this run: auditor scored 84, auditor-security scored 92. One "info"-level suggestion (add a loading skeleton) and one "low"-level suggestion (cache Open-Meteo responses for 5 minutes). No errors, no security flags. Both reports are attached to the run and surfaced on the publisher gate.
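A sketch of how the two reports might feed a threshold check at the publisher gate. The report shape, the severity names beyond "info" and "low", and the threshold values are all assumptions; the post doesn't state the actual threshold.

```typescript
// Hypothetical report shape. Only "info" and "low" appear in this run;
// the other severities are assumed for completeness.
type Severity = "info" | "low" | "medium" | "high";

interface AuditReport {
  agent: string;
  score: number; // 0-100
  issues: { severity: Severity; note: string }[];
}

// One warning line per report that scores below the threshold.
function gateWarnings(reports: AuditReport[], threshold: number): string[] {
  return reports
    .filter((report) => report.score < threshold)
    .map((report) => `${report.agent} scored ${report.score}, below threshold ${threshold}`);
}

// This run's results. Attributing both suggestions to the quality auditor
// is a reading of "no security flags", not something the run log confirms here.
const runReports: AuditReport[] = [
  {
    agent: "auditor",
    score: 84,
    issues: [
      { severity: "info", note: "add a loading skeleton" },
      { severity: "low", note: "cache Open-Meteo responses for 5 minutes" },
    ],
  },
  { agent: "auditor-security", score: 92, issues: [] },
];
```

With an assumed threshold of 80, gateWarnings(runReports, 80) returns an empty list; raise it to 90 and the quality auditor's 84 produces one warning while the security score still passes.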

Phase 8 - image-gen (4:52 → 5:14)

Image-gen composes a hero prompt from the product name, the one-liner marketer wrote, and a "visual category" hint the planner tagged. It renders an OG-sized hero (1200×630) and writes it to R2. The URL is attached to the run. Moonshift doesn't ship a single stock photo - every hero is rendered per-run against the actual product brief.
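As a rough illustration of the three inputs coming together, here is a sketch of a hero request builder. The prompt template, the HeroRequest shape, and the function name are hypothetical; only the three inputs and the 1200×630 size come from the run.

```typescript
// Hypothetical request shape handed to the image renderer.
interface HeroRequest {
  prompt: string;
  width: number;
  height: number;
}

// Combine the three inputs named above into one prompt at OG size.
// The joining format is an assumption made for this sketch.
function composeHeroRequest(
  productName: string,
  oneLiner: string,       // written by the marketer in phase 6
  visualCategory: string, // hint tagged by the planner in phase 1
): HeroRequest {
  const prompt = [productName, oneLiner, `visual category: ${visualCategory}`].join(". ");
  return { prompt, width: 1200, height: 630 };
}
```

Because the one-liner comes from the marketer, this is also why image-gen sits after phase 6 rather than running alongside it.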

Phase 9 - publisher (5:14 → 5:18)

Publisher is the shortest phase, because it does the least. It reads the marketer's copy, the image-gen URL, and the audit verdicts, and writes two drafts to your dashboard: the X post and the LinkedIn post, each with a preview showing the hero image attached exactly as it'll appear when posted.

It does not call the X API. It does not call the LinkedIn API. It parks. If you want to know why that matters, see the post on auditable autonomy.
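"Parks" can be read as: the publisher builds data and stops. A minimal sketch, assuming a hypothetical Draft shape and status value, with one draft per platform and no network call anywhere in the function:

```typescript
// Hypothetical draft shape; the status name is an assumption.
interface Draft {
  platform: "x" | "linkedin";
  text: string;
  heroUrl: string;
  status: "awaiting_approval";
}

// Build one parked draft per platform. Note what is absent: there is no
// client for the X or LinkedIn APIs in scope, so nothing can be posted here.
function parkDrafts(copy: { x: string; linkedin: string }, heroUrl: string): Draft[] {
  return [
    { platform: "x", text: copy.x, heroUrl, status: "awaiting_approval" },
    { platform: "linkedin", text: copy.linkedin, heroUrl, status: "awaiting_approval" },
  ];
}
```

Keeping the function pure is the point of the sketch: approval becomes a separate, explicit step rather than a flag on a call that could post.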

What you actually end up with

At the five-minute mark, your dashboard shows:

A live deploy URL on your Vercel account. You can open it on your phone. You can share it with one person. You shouldn't share it with a thousand people yet.
A GitHub repo in your account, with the orchestrator's commit history. Clone it; own it; delete it if you don't want it. No sandbox, no Moonshift-prefixed org.
Two drafted social posts, rendered the way they'll appear when published, with the hero image already attached.
A hero image, rendered from scratch, at the right dimensions for OG + Twitter card.
Two audit reports, each with a score and an enumerated issue list. The publisher gate will warn you if either is below threshold.

The whole run came in under the per-run spend ceiling this time - the orchestrator enforces one between phases so you can never overshoot. If you're curious where the cost went, the run log breaks it down per-agent. Planner is usually the heaviest (context-heavy); image-gen is the spikiest; publisher is essentially free.

What you don't end up with

This is a short section but an important one. You don't end up with: a domain you forgot to register, a Notion page with an unfinished launch post, a half-drawn hero image, an "I'll do the X thread later" todo, a commit hash you can't remember where you pushed, or a deploy on a vendor sandbox that'll expire in 72 hours. The whole point of the pipeline is to drain that list.

Go try it. If your first run doesn't feel like cheating, write us and tell us what you wanted that we didn't ship.
