The Engine, Documented

Seven stages. One compounding dataset.

Every artifact from every short logs to Grimoire. The pipeline teaches itself — then retrains the tools that made it.

Stages: 7
Per 30s short: $2.52
Wall clock: ~52 min

The Flow

Each stage reads a typed JSON manifest from the previous stage and writes one for the next. Grimoire logs every artifact, in parallel, always.
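The hand-off described above can be sketched as a small runner. This is a minimal illustration, not the pipeline's actual code: `run_stage`, `signal_to_concept`, and the untyped `Manifest` alias are hypothetical names, and the placeholder transform only carries the lineage link forward.

```python
import json
from pathlib import Path
from typing import Callable

Manifest = dict  # hypothetical alias; a real pipeline would likely use stricter types


def run_stage(in_path: Path, out_path: Path,
              transform: Callable[[Manifest], Manifest]) -> Manifest:
    """Read the previous stage's manifest, transform it, write the next one."""
    upstream = json.loads(in_path.read_text(encoding="utf-8"))
    downstream = transform(upstream)
    out_path.write_text(json.dumps(downstream, indent=2), encoding="utf-8")
    return downstream


def signal_to_concept(signal: Manifest) -> Manifest:
    # Placeholder transform: derive the concept id and keep the upstream link.
    return {
        "id": signal["id"].replace("sig_", "con_", 1),
        "signal_id": signal["id"],
    }
```

Calling `run_stage(Path("signal.json"), Path("concept.json"), signal_to_concept)` would write a concept manifest whose `signal_id` points back at the signal it came from.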

01 Signal · ~2 min · $0.00
02 Concept · ~5 min · $0.02
03 Script · ~10 min · $0.05
04 Storyboard · ~15 min · $0.10
05 Generate · ~12 min (parallel) · $2.35
06 Assemble · ~5 min · $0.00
07 Distribute · ~3 min · $0.00

Always on

Grimoire

The compounding dataset

Seven stages · Read in order or jump around

Stage 01

Signal

Capture the trending topic that becomes a legend.

Duration: ~2 min
Cost (30s short): $0.00
Skill: MCP direct
MCP tools: 2 tools

Every legend begins as a whisper in the culture. Stage one pulls live trending data from TikTok, YouTube, Reddit, and X, scores each topic against our archetype taxonomy, and selects the signal with the highest legend-potential — never the highest raw volume.

MCP Tool Calls

signal_get_trending()
signal_score_topic()

No input manifest

This stage is the entry point. It pulls live data from trending sources rather than reading a prior manifest.

Out · signal.json

Writes for next stage + Grimoire

{
  "id": "sig_2026_04_18_0001",
  "topic": "rise-of-the-quiet-swordsman",
  "hook_angle": "The strongest blade is the one never drawn.",
  "trending_score": 0.87,
  "sources": [
    { "platform": "tiktok", "volume": 41200, "velocity": 2.3 },
    { "platform": "reddit", "volume": 8900,  "velocity": 1.7 },
    { "platform": "x",      "volume": 15600, "velocity": 1.9 }
  ],
  "archetype_match": "stoic_warrior",
  "timestamp": "2026-04-18T09:22:14Z"
}

Quality Gate

Trending score >= 0.6 (below that, topic is noise).
At least two source platforms corroborate the signal.
Archetype matches an entry in `brand/archetypes.yaml`.
No copyrighted character names appear anywhere in the topic string.
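The four checks above could be expressed as one function over `signal.json`. This is a sketch under assumptions: `check_signal` is a hypothetical name, `known_archetypes` stands in for whatever is loaded from `brand/archetypes.yaml`, and `blocked_names` is an illustrative blocklist whose real source is not documented here.

```python
def check_signal(signal: dict, known_archetypes: set[str],
                 blocked_names: set[str]) -> list[str]:
    """Return a list of gate failures; an empty list means the signal passes."""
    failures = []
    if signal["trending_score"] < 0.6:
        failures.append("trending_score below 0.6")
    # Corroboration: count distinct platforms, not source rows.
    platforms = {source["platform"] for source in signal["sources"]}
    if len(platforms) < 2:
        failures.append("fewer than two corroborating platforms")
    if signal["archetype_match"] not in known_archetypes:
        failures.append("archetype not in taxonomy")
    # Assumption: a case-insensitive substring match is enough for the name scan.
    topic = signal["topic"].lower()
    if any(name.lower() in topic for name in blocked_names):
        failures.append("blocked name in topic")
    return failures
```

Returning every failure rather than stopping at the first makes the Grimoire record of a rejected signal more useful.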

What Can Break

All scraped topics score below threshold.
Fallback · Fall back to the evergreen archetype rotation in `grimoire/evergreen_archetypes.sql` — we always have a legend to tell.
Source API rate-limited.
Fallback · Cached trending window from the last successful pull (max 6h stale), with a warning logged to Grimoire.

Stage 02

Concept

Signal becomes logline, mascot, and duration target.

Duration: ~5 min
Cost (30s short): $0.02
Skill: concept-forge
MCP tools: —

Pure reasoning stage. The concept-forge skill takes the signal and forges a one-sentence logline, assigns the right mascot (AKASHI for cosmic narration, KAGE for stoic analysis, MIRA for curious wonder), picks the primary platform, and decides how long the short should breathe.

In · signal.json

Reads from previous stage

{
  "id": "sig_2026_04_18_0001",
  "topic": "rise-of-the-quiet-swordsman",
  "hook_angle": "The strongest blade is the one never drawn.",
  "archetype_match": "stoic_warrior"
}

Out · concept.json

Writes for next stage + Grimoire

{
  "id": "con_2026_04_18_0001",
  "signal_id": "sig_2026_04_18_0001",
  "logline": "A swordsman who refuses to draw becomes the one everyone fears.",
  "mascot": "kage",
  "platform_primary": "tiktok",
  "duration_target_s": 30,
  "hook": "What if the strongest fighter never swings?",
  "payoff": "Because the threat of the blade is louder than the blade.",
  "archetype_language": [
    "the stoic warrior",
    "restraint as power",
    "the unsheathed answer"
  ]
}

Quality Gate

Logline is a single sentence, under 18 words.
Mascot field matches a character sheet in `mascots/`.
No copyrighted character names — archetype language only.
Duration target within {15, 30, 45, 60} seconds.
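A rough sketch of the structural parts of this gate follows. `check_concept` is a hypothetical name, `known_mascots` stands in for the sheets found under `mascots/`, and the single-sentence test is a simple punctuation heuristic (it would misjudge abbreviations such as "Dr."). The copyright scan is out of scope here.

```python
import re

ALLOWED_DURATIONS_S = {15, 30, 45, 60}


def check_concept(concept: dict, known_mascots: set[str]) -> list[str]:
    """Return gate failures for a concept manifest; an empty list means pass."""
    failures = []
    logline = concept["logline"].strip()
    # Heuristic: exactly one run of terminal punctuation, and it ends the string.
    terminators = re.findall(r"[.!?]+", logline)
    if len(terminators) != 1 or not re.search(r"[.!?]+$", logline):
        failures.append("logline is not a single sentence")
    if len(logline.split()) >= 18:
        failures.append("logline has 18 or more words")
    if concept["mascot"] not in known_mascots:
        failures.append("mascot has no character sheet")
    if concept["duration_target_s"] not in ALLOWED_DURATIONS_S:
        failures.append("duration target not in {15, 30, 45, 60}")
    return failures
```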

What Can Break

Logline fails the copyright scan.
Fallback · Regenerate with an explicit constraint injected into the prompt listing the blocked names. Max 3 retries, then skip the signal.
Mascot assignment is ambiguous.
Fallback · Default to AKASHI (the cosmic narrator works for any archetype) and flag for human review in Grimoire.

Stage 03

Script

Concept becomes beats, VO lines, and on-screen text.

Duration: ~10 min
Cost (30s short): $0.05
Skill: script-forge
MCP tools: —

The script-forge skill expands the concept into a beat sheet. Each beat is timed to the second, carries the voiceover line, the on-screen text overlay, the visual direction for storyboard, and an audio cue for the music bed.

In · concept.json

Reads from previous stage

{
  "id": "con_2026_04_18_0001",
  "logline": "A swordsman who refuses to draw becomes the one everyone fears.",
  "mascot": "kage",
  "duration_target_s": 30
}

Out · script.json

Writes for next stage + Grimoire

{
  "id": "scr_2026_04_18_0001",
  "concept_id": "con_2026_04_18_0001",
  "mascot": "kage",
  "duration_s": 30,
  "style_notes": "Measured cadence. Pauses over emphasis. Let silence strike.",
  "beats": [
    {
      "beat_id": "b1",
      "timing_s": [0, 4],
      "vo_text": "They called him the quiet one.",
      "on_screen_text": "THE QUIET ONE",
      "visual_direction": "Wide. Figure alone on a rain-wet rooftop, sword sheathed.",
      "audio_cue": "sub-bass swell, distant thunder"
    },
    {
      "beat_id": "b2",
      "timing_s": [4, 10],
      "vo_text": "Three masters came. Three masters left.",
      "on_screen_text": "3 MASTERS",
      "visual_direction": "Cut between three silhouettes approaching, then retreating.",
      "audio_cue": "taiko hit per master, room tone between"
    }
  ],
  "voiceover_lines": [
    "They called him the quiet one.",
    "Three masters came. Three masters left."
  ],
  "on_screen_text": ["THE QUIET ONE", "3 MASTERS"]
}

Quality Gate

Total timing of beats equals duration_s within ±0.5s.
Every beat has vo_text, on_screen_text, visual_direction, audio_cue.
Voice cadence matches the mascot's voice card in `mascots/<name>/voice.md`.
Copyright scan passes on all vo_text and on_screen_text.

What Can Break

Beat timings drift outside the duration target.
Fallback · Automatic rebalance pass — trim the longest beat by the drift amount, reflow timings.
On-screen text exceeds 4 words per beat.
Fallback · Compression pass using the `brand/voice.md` compression rules. Fail build if no compression possible.
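The timing gate and the rebalance fallback can be illustrated together. This is one possible reading of "trim the longest beat by the drift amount, reflow timings"; `total_drift` and `rebalance` are hypothetical names, and the sketch assumes beats are contiguous and start at 0.

```python
def total_drift(beats: list[dict], duration_s: float) -> float:
    """Seconds by which the beats overshoot (+) or undershoot (-) the target."""
    return sum(end - start for start, end in (b["timing_s"] for b in beats)) - duration_s


def rebalance(beats: list[dict], duration_s: float) -> list[dict]:
    """Absorb the whole drift in the longest beat, then reflow start/end times."""
    lengths = [b["timing_s"][1] - b["timing_s"][0] for b in beats]
    drift = sum(lengths) - duration_s
    longest = max(range(len(lengths)), key=lengths.__getitem__)
    lengths[longest] -= drift  # a negative drift extends the beat instead
    if lengths[longest] <= 0:
        raise ValueError("drift is larger than the longest beat")
    rebalanced, cursor = [], 0
    for beat, length in zip(beats, lengths):
        # Copy each beat so the input manifest is left untouched.
        rebalanced.append({**beat, "timing_s": [cursor, cursor + length]})
        cursor += length
    return rebalanced
```

After a rebalance, `abs(total_drift(beats, duration_s)) <= 0.5` is the check the gate describes.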

Stage 04

Storyboard

Script becomes shots, Flux prompts, and Kling motion.

Duration: ~15 min
Cost (30s short): $0.10
Skill: storyboard-forge
MCP tools: —

Every beat is decomposed into one or more shots. The storyboard-forge skill writes a full Flux.1 image prompt (with style tokens from the mascot sheet), a Kling motion description, a camera angle, and the shot's slice of the beat's duration.

In · script.json

Reads from previous stage

{
  "id": "scr_2026_04_18_0001",
  "mascot": "kage",
  "beats": [
    {
      "beat_id": "b1",
      "timing_s": [0, 4],
      "visual_direction": "Wide. Figure alone on a rain-wet rooftop, sword sheathed."
    }
  ]
}

Out · storyboard.json

Writes for next stage + Grimoire

{
  "id": "sb_2026_04_18_0001",
  "script_id": "scr_2026_04_18_0001",
  "shots": [
    {
      "shot_id": "s1",
      "beat_id": "b1",
      "image_prompt": "wide cinematic shot, lone hooded swordsman on rain-slick Kyoto rooftop at night, sheathed katana at hip, chrome-edged armor catching neon glow, volumetric rain, anime-studio line art, cel-shaded, moody cobalt and obsidian palette, golden rim light, 2.35:1",
      "style_tokens": ["kage_core", "rain_neon_noir", "cel_shade_v3"],
      "motion_direction": "slow dolly forward, rain falls in parallax layers, subtle cape flutter",
      "camera_angle": "low three-quarter, 35mm equivalent",
      "duration_s": 4,
      "audio_cue": "sub-bass swell, distant thunder"
    }
  ]
}

Quality Gate

Every shot references a valid beat_id from the script.
Image prompts include mascot style tokens from the character sheet.
Sum of shot durations equals beat duration for each beat.
No prompt references copyrighted characters, studios, or franchises.
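The structural checks in this gate (valid `beat_id`, style tokens present, shot durations summing to the beat) might look like the sketch below. `check_storyboard` is a hypothetical name, and the copyright scan is left out because its rules live in files not shown here.

```python
from collections import defaultdict


def check_storyboard(script: dict, storyboard: dict) -> list[str]:
    """Return gate failures; an empty list means the storyboard passes."""
    failures = []
    beat_len = {b["beat_id"]: b["timing_s"][1] - b["timing_s"][0]
                for b in script["beats"]}
    shot_total = defaultdict(float)
    for shot in storyboard["shots"]:
        if shot["beat_id"] not in beat_len:
            failures.append(f"{shot['shot_id']}: unknown beat {shot['beat_id']}")
            continue
        if not shot.get("style_tokens"):
            failures.append(f"{shot['shot_id']}: no style tokens")
        shot_total[shot["beat_id"]] += shot["duration_s"]
    # Every beat must be fully covered by its shots, with no gap or overrun.
    for beat_id, length in beat_len.items():
        if abs(shot_total[beat_id] - length) > 1e-6:
            failures.append(f"{beat_id}: shots cover {shot_total[beat_id]}s of {length}s")
    return failures
```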

What Can Break

Prompt auto-flags a blocked term.
Fallback · The forbidden-terms filter (`brand/never-say.txt`) rewrites to archetype language and re-validates.
Shot count explodes past budget (>12 shots for 30s).
Fallback · Merge adjacent shots with same camera_angle and mascot. Flag for review if merge still exceeds budget.

Stage 05

Generate

Storyboard becomes voice, images, clips, and music.

Duration: ~12 min (parallel)
Cost (30s short): $2.35
Skill: generate-forge
MCP tools: 4 tools

The heavy stage. Four parallel jobs: ElevenLabs synthesizes the voiceover from the mascot's voice clone, Flux.1 batch-generates every shot's keyframe through the style LoRA, Kling 2.x animates each keyframe into a video clip, and Suno v4 scores a custom music bed keyed to the beat timing.

MCP Tool Calls

gen_voice()
gen_batch_images()
gen_video_clip()
gen_music()

In · storyboard.json

Reads from previous stage

{
  "id": "sb_2026_04_18_0001",
  "shots": [
    {
      "shot_id": "s1",
      "image_prompt": "wide cinematic shot, lone hooded swordsman ...",
      "motion_direction": "slow dolly forward, rain falls in parallax",
      "duration_s": 4
    }
  ]
}

Out · assets.json

Writes for next stage + Grimoire

{
  "id": "ast_2026_04_18_0001",
  "storyboard_id": "sb_2026_04_18_0001",
  "frames": [
    {
      "asset_id": "img_s1_v1",
      "shot_id": "s1",
      "type": "keyframe",
      "file_path": "out/frames/s1_v1.webp",
      "provider": "flux-1.1-pro",
      "prompt": "wide cinematic shot, lone hooded swordsman ...",
      "seed": 1847203,
      "model_version": "flux-1.1-pro@2026-03-11",
      "lora": "kage_core_v3"
    }
  ],
  "video_clips": [
    {
      "asset_id": "vid_s1",
      "shot_id": "s1",
      "type": "clip",
      "file_path": "out/clips/s1.mp4",
      "provider": "kling-2.1",
      "source_frame": "img_s1_v1",
      "motion_prompt": "slow dolly forward, rain falls in parallax",
      "duration_s": 4,
      "seed": 447281,
      "model_version": "kling-2.1@2026-02-20"
    }
  ],
  "voice_clips": [
    {
      "asset_id": "vo_b1",
      "beat_id": "b1",
      "file_path": "out/vo/b1.mp3",
      "provider": "elevenlabs",
      "voice_id": "kage_v2_clone",
      "duration_s": 3.6
    }
  ],
  "music_beds": [
    {
      "asset_id": "mus_main",
      "file_path": "out/music/main.mp3",
      "provider": "suno-v4",
      "prompt": "cinematic taiko + sub-bass, 86 bpm, sparse, mythic",
      "duration_s": 30
    }
  ]
}

Quality Gate

Every shot has at least one keyframe asset.
Every beat has a voice clip within ±0.3s of beat duration.
Video clip duration matches shot duration exactly (re-render if drift).
Every asset has seed + provider + model_version logged — no exceptions.
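The last rule, read literally, is easy to check mechanically. In this sketch `missing_provenance` is a hypothetical name and the group keys are taken from the `assets.json` sample; any asset lacking one of the three fields is reported, whichever provider produced it.

```python
REQUIRED_PROVENANCE = ("seed", "provider", "model_version")
ASSET_GROUPS = ("frames", "video_clips", "voice_clips", "music_beds")


def missing_provenance(assets: dict) -> dict[str, list[str]]:
    """Map asset_id -> missing provenance fields; an empty dict means pass."""
    gaps = {}
    for group in ASSET_GROUPS:
        for asset in assets.get(group, []):
            missing = [field for field in REQUIRED_PROVENANCE
                       if asset.get(field) is None]
            if missing:
                gaps[asset["asset_id"]] = missing
    return gaps
```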

What Can Break

Kling 2.x queue is backed up or fails.
Fallback · Fall back to Wan 2.2 using the same source frame and motion prompt. Provider field updated in Grimoire for lineage.
Flux generation drifts off-mascot.
Fallback · Retry with LoRA weight bumped +15%, seed rotated. After 3 tries, flag for human regeneration.
ElevenLabs voice clone sounds unnatural on a specific line.
Fallback · Regenerate with stability=0.6, similarity=0.85 fallback preset. Persist successful presets to the mascot voice card.
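The Flux retry rule can be sketched as a bounded loop. Several details are assumptions: that the +15% bump compounds on each retry, that "seed rotated" means moving to a new seed (here simply `seed + 1`), and that the callables `generate` and `is_on_mascot` exist; all names are hypothetical.

```python
from typing import Callable, Optional


def generate_on_mascot(generate: Callable[[float, int], object],
                       is_on_mascot: Callable[[object], bool],
                       lora_weight: float, seed: int,
                       max_tries: int = 3) -> Optional[object]:
    """Retry after an off-mascot result; None means flag for human regeneration."""
    for _ in range(max_tries):
        lora_weight *= 1.15  # +15% per retry (assumed to compound)
        seed += 1            # assumed meaning of "seed rotated"
        image = generate(lora_weight, seed)
        if is_on_mascot(image):
            return image
    return None
```

The caller would log each attempt's weight and seed to Grimoire so the successful settings remain reproducible.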

Stage 06

Assemble

Assets become a rendered MP4 via Remotion.

Duration: ~5 min
Cost (30s short): $0.00
Skill: remotion-composer
MCP tools: —

The remotion-composer skill maps every asset into the right Remotion composition slot, wires the beat timeline, mixes VO over music at -14 LUFS, renders headless, and writes the final MP4 to disk. No human touches this stage in the normal path.

In · script.json + assets.json

Reads from Stage 03 (script) and Stage 05 (assets)

{
  "script": { "id": "scr_...", "beats": [...] },
  "assets": { "id": "ast_...", "frames": [...], "video_clips": [...] }
}

Out · assembly.json

Writes for next stage + Grimoire

{
  "id": "asm_2026_04_18_0001",
  "assets_id": "ast_2026_04_18_0001",
  "remotion_composition": "PowerScalingShort",
  "composition_props": {
    "mascot": "kage",
    "beats": "[...]",
    "video_clips": "[...]"
  },
  "output_path": "out/renders/kage_quiet_swordsman_2026_04_18.mp4",
  "duration_s": 30,
  "render_settings": {
    "resolution": "1080x1920",
    "fps": 30,
    "crf": 18,
    "audio_bitrate": "320k",
    "audio_target_lufs": -14
  },
  "render_log": {
    "frames_rendered": 900,
    "wall_clock_s": 287,
    "errors": []
  }
}

Quality Gate

Render completes with zero errors.
Output duration within ±0.5s of script duration.
Integrated loudness within ±1 LU of the -14 LUFS target.
Output resolution matches platform target (1080x1920 for vertical).
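A sketch of this gate over `assembly.json` follows. `check_assembly` is a hypothetical name, and `measured_lufs` is assumed to come from a separate loudness measurement of the rendered file; the manifest itself only records the target.

```python
def check_assembly(assembly: dict, script_duration_s: float,
                   measured_lufs: float,
                   target_resolution: str = "1080x1920") -> list[str]:
    """Return gate failures; an empty list means the render passes."""
    failures = []
    settings = assembly["render_settings"]
    if assembly["render_log"]["errors"]:
        failures.append("render reported errors")
    if abs(assembly["duration_s"] - script_duration_s) > 0.5:
        failures.append("duration drifts more than 0.5s from script")
    # Loudness tolerance of 1 LU around the manifest's target.
    if abs(measured_lufs - settings["audio_target_lufs"]) > 1.0:
        failures.append("integrated loudness outside tolerance")
    if settings["resolution"] != target_resolution:
        failures.append("resolution does not match platform target")
    return failures
```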

What Can Break

Remotion render crashes mid-composition.
Fallback · Resume from the last successful frame via the Remotion cache. If cache is corrupt, full re-render with lower concurrency.
Asset file missing when composer tries to mount it.
Fallback · Re-run only the missing asset through Stage 5 (single-shot mode), then resume assembly.

Stage 07

Distribute

MP4 becomes posts on every platform that matters.

Duration: ~3 min
Cost (30s short): $0.00
Skill: publish-orchestrator
MCP tools: 4 tools

The publish-orchestrator skill crops and re-encodes for each platform's spec, writes the caption set (platform-specific hook, hashtag strategy, CTA), and schedules or publishes via each platform's API. Every post ID is logged back to Grimoire for performance retrieval.

MCP Tool Calls

publish_tiktok()
publish_youtube_shorts()
publish_instagram_reels()
publish_x()

In · assembly.json

Reads from previous stage

{
  "id": "asm_2026_04_18_0001",
  "output_path": "out/renders/kage_quiet_swordsman_2026_04_18.mp4",
  "duration_s": 30
}

Out · post.json

Writes for Grimoire (final stage, no downstream manifest)

{
  "id": "pst_2026_04_18_0001",
  "assembly_id": "asm_2026_04_18_0001",
  "platforms": ["tiktok", "youtube_shorts", "instagram_reels", "x"],
  "post_ids": {
    "tiktok": "7349928472018334",
    "youtube_shorts": "dQw4w9W_legend",
    "instagram_reels": "C8xYq2kLmPq",
    "x": "1781409823741820"
  },
  "captions": {
    "tiktok": "The strongest blade is the one never drawn. #anime #ai #kage",
    "youtube_shorts": "When silence outranks steel. A legend in 30 seconds.",
    "x": "He never drew. He didn't have to."
  },
  "scheduled_at": "2026-04-18T18:00:00Z",
  "status": "scheduled"
}

Quality Gate

Every target platform receives a post_id or an explicit skip reason.
Captions pass the brand voice check (no hype, no cringe, no copy).
Hashtag set matches the channel strategy in `content/channel-strategy.md`.
Scheduled time falls inside the platform's audience-peak window.
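The first rule above ("a post_id or an explicit skip reason") could be checked as below. Note that `skip_reasons` is a hypothetical field: the `post.json` sample does not show where skip reasons are stored.

```python
def unaccounted_platforms(post: dict) -> list[str]:
    """Platforms with neither a post_id nor a skip reason; empty means pass."""
    post_ids = post.get("post_ids", {})
    skip_reasons = post.get("skip_reasons", {})  # hypothetical field name
    return [platform for platform in post["platforms"]
            if not post_ids.get(platform) and not skip_reasons.get(platform)]
```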

What Can Break

Platform API returns auth error.
Fallback · Queue the post to the retry table; operator is paged via the Grimoire alert webhook.
Caption fails voice check.
Fallback · Regenerate using the stricter `brand/voice-compress.md` prompt. Max 2 retries, then human review.

Stage 08 · Always On

The Grimoire

Every prompt, every seed, every metric, every quality-gate result — logged. This is the asset that compounds while everyone else ships and forgets.

Every prompt

Full text, model version, provider, seed, and the LoRA weights active at generation time.

Every gen

Input → output binding, wall-clock time, cost in USD, and the quality-gate result that followed.

Every lineage

Signal → Post, end-to-end, with every intermediate manifest hash so remixing is just a tree walk.

Every metric

Per-platform views, watch-time curves, retention spikes, and comment sentiment — all joined to assets.

Schema

12 tables. One pgvector index.

Postgres is the spine. Seven tables mirror the pipeline manifests 1:1 (signals, concepts, scripts, storyboards, assets, assemblies, posts); the rest cover lineage, metrics, quality_events, embeddings, experiments, and lora_sessions.

The embeddings table uses pgvector for semantic retrieval — so “find shots that looked like this one and scored above 90th percentile” is a single query.

Postgres 16 · pgvector · Supabase · Cloudflare R2 · Local JSON fallback

See grimoire/schema.sql in the repo for the complete DDL.

Out · grimoire_log.jsonl

One line per asset — append-only, full lineage preserved

{
  "log_id": "grm_2026_04_18_0001_img_s1_v1",
  "timestamp": "2026-04-18T09:31:47Z",
  "stage": "generate",
  "event": "asset.created",
  "lineage": {
    "signal_id":     "sig_2026_04_18_0001",
    "concept_id":    "con_2026_04_18_0001",
    "script_id":     "scr_2026_04_18_0001",
    "storyboard_id": "sb_2026_04_18_0001",
    "asset_id":      "img_s1_v1"
  },
  "provider": "flux-1.1-pro",
  "model_version": "flux-1.1-pro@2026-03-11",
  "lora": "kage_core_v3",
  "seed": 1847203,
  "prompt_hash": "sha256:9c4b…a102",
  "prompt_embedding_id": "emb_0001_8f2a",
  "wall_clock_s": 8.4,
  "cost_usd": 0.038,
  "quality_gate": {
    "mascot_consistency": 0.94,
    "copyright_scan":     "pass",
    "status":             "pass"
  },
  "output_path": "r2://animelegends/frames/s1_v1.webp"
}
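An append-only JSONL log of this shape needs very little code. The sketch below uses only the standard library; `append_log` and `entries_for_signal` are hypothetical helper names, and the lookup filters on the `lineage.signal_id` field shown in the sample entry.

```python
import json
from pathlib import Path


def append_log(path: Path, entry: dict) -> None:
    """Append one entry as a single line; existing lines are never rewritten."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")


def entries_for_signal(path: Path, signal_id: str) -> list[dict]:
    """Return every logged event whose lineage traces back to one signal."""
    with path.open(encoding="utf-8") as f:
        return [entry for entry in map(json.loads, f)
                if entry["lineage"].get("signal_id") == signal_id]
```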

Why it matters

The dataset that retrains the tools.

LoRA retraining
Top-performing shots (by retention × sentiment) become the training set for the next version of each mascot's style LoRA.
Example · KAGE core LoRA v3 was trained on the 184 highest-retention KAGE shots from v2. v3 lifted mean watch-time by 14%.
Performance Q&A
Ask natural questions against the dataset. The pipeline itself answers because every artifact is joined to its outcome.
Example · 'What hooks outperform on TikTok for stoic archetypes?' runs as one SQL query over posts × embeddings × metrics.
Hook pattern learning
Hook-embedding clustering reveals which opening structures produce the retention spike. New scripts sample from winning clusters.
Example · The 'three {plural-noun}. three {plural-noun}.' pattern appeared in the top 5% of hooks — now a sampled template.
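The "retention × sentiment" ranking used for LoRA retraining can be illustrated in a few lines. This is a toy sketch: the `retention` and `sentiment` field names and the `top_shots` helper are hypothetical, and the real selection presumably runs as a query over the metrics tables.

```python
def top_shots(shots: list[dict], n: int) -> list[dict]:
    """Rank shots by retention x sentiment and keep the n best."""
    ranked = sorted(shots,
                    key=lambda shot: shot["retention"] * shot["sentiment"],
                    reverse=True)
    return ranked[:n]
```

Multiplying the two scores means a shot has to do well on both to rank highly; a high-retention shot with poor sentiment drops down the list.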

No Supabase key? Grimoire writes newline-delimited JSON to grimoire/local/*.jsonl instead. Same schema, same queries (via DuckDB) — the dataset never breaks because the infra isn't ready yet.
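To show that the local files stay queryable, here is a standard-library aggregation over `grimoire/local/*.jsonl`. It is a plain-Python stand-in for the DuckDB route mentioned above, not a replacement for it; `cost_by_stage` is a hypothetical name and the fields are those in the sample log entry.

```python
import json
from collections import defaultdict
from pathlib import Path


def cost_by_stage(local_dir: Path) -> dict[str, float]:
    """Sum cost_usd per pipeline stage across every local JSONL file."""
    totals = defaultdict(float)
    for file in sorted(local_dir.glob("*.jsonl")):
        with file.open(encoding="utf-8") as f:
            for line in f:
                if line.strip():  # tolerate a trailing blank line
                    entry = json.loads(line)
                    totals[entry["stage"]] += entry.get("cost_usd", 0.0)
    return dict(totals)
```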

Transparency

$2.52 per short. ~52 minutes per run.

No smoke, no black boxes. Here's where every cent and every second goes for a 30-second vertical short.

API Cost

$2.52

per 30s short

01 Signal · $0.00
02 Concept · $0.02
03 Script · $0.05
04 Storyboard · $0.10
05 Generate · $2.35
06 Assemble · $0.00
07 Distribute · $0.00

Stage 5 (Generate) is 93% of the bill. Everything else is reasoning tokens, router calls, or self-hosted compute. That's the right shape: we pay for pixels, not thinking.
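The arithmetic behind the total and the 93% figure, using the per-stage costs listed above. Amounts are held in cents to avoid floating-point rounding; the variable names are illustrative only.

```python
# Per-stage API cost for one 30s short, in cents (from the table above).
STAGE_COST_CENTS = {
    "01 Signal": 0,
    "02 Concept": 2,
    "03 Script": 5,
    "04 Storyboard": 10,
    "05 Generate": 235,
    "06 Assemble": 0,
    "07 Distribute": 0,
}

total_cents = sum(STAGE_COST_CENTS.values())
generate_share = STAGE_COST_CENTS["05 Generate"] / total_cents

print(f"total ${total_cents / 100:.2f}, Generate share {generate_share:.0%}")
# prints: total $2.52, Generate share 93%
```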

Wall Clock

~52 min

end-to-end

01 Signal · ~2 min
02 Concept · ~5 min
03 Script · ~10 min
04 Storyboard · ~15 min
05 Generate · ~12 min (parallel)
06 Assemble · ~5 min
07 Distribute · ~3 min

Stage 5 runs four providers in parallel; run sequentially, it would double the clock. Stage 4 is the longest stage at ~15 min, and that time is model reasoning rather than billed generation. It's worth every second: a bad storyboard poisons the full render.

$2.52

Direct API cost

Sum of all stages on a nominal run; $2.35 of it is Stage 5 provider spend.

$0.15

Per 1,000 views

Assuming ~17k avg views/short ($2.52 / 17 ≈ $0.15). The flywheel is profitable at 3k views.

0 hrs

Of human editing

Operators review outputs; they don't touch timelines. That's the point.

See the pipeline ship something live

Watch it run, or run it yourself.

Every stage you just read is open and reproducible. Pick a side: consume the output, or clone the engine.

The Forge ships SKILL.md files, slash commands, Remotion templates, and MCP servers — everything you need to reproduce the pipeline in your own repo. MIT-compatible license.
