Seven stages. One compounding dataset.
Every artifact from every short is logged to Grimoire. The pipeline teaches itself, then retrains the tools that made it.
- Stages: 7
- Per 30s short: $2.52
- Wall clock: ~52 min
Each stage reads a typed JSON manifest from the previous stage and writes one for the next. Grimoire logs every artifact, in parallel, always.
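The hand-off can be pictured as a chain in which each manifest carries the id of its parent. The following is a minimal Python sketch, assuming the parent-reference field names that appear in the sample manifests on this page (`signal_id`, `concept_id`, and so on); the `check_chain` helper is hypothetical and not part of the pipeline.

```python
# Parent-reference field each stage's manifest carries, as seen in the
# sample manifests on this page. The mapping itself is an illustration.
PARENT_FIELD = {
    "concept": "signal_id",
    "script": "concept_id",
    "storyboard": "script_id",
    "generate": "storyboard_id",
    "assemble": "assets_id",
    "distribute": "assembly_id",
}

def check_chain(manifests):
    """Return the stages whose manifest does not point at the previous one.

    `manifests` is an ordered list of (stage_name, manifest_dict) pairs,
    starting with the signal stage.
    """
    broken = []
    for (_, prev), (stage, cur) in zip(manifests, manifests[1:]):
        if cur.get(PARENT_FIELD[stage]) != prev["id"]:
            broken.append(stage)
    return broken

chain = [
    ("signal", {"id": "sig_2026_04_18_0001"}),
    ("concept", {"id": "con_2026_04_18_0001", "signal_id": "sig_2026_04_18_0001"}),
    ("script", {"id": "scr_2026_04_18_0001", "concept_id": "con_2026_04_18_0001"}),
]
print(check_chain(chain))  # []
```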
Signal
Capture the trending topic that becomes a legend.
- Duration: ~2 min
- Cost (30s short): $0.00
- Skill: MCP direct
- MCP tools: 2 tools
Every legend begins as a whisper in the culture. Stage one pulls live trending data from TikTok, YouTube, Reddit, and X, scores each topic against our archetype taxonomy, and selects the signal with the highest legend-potential — never the highest raw volume.
MCP Tool Calls
- `signal_get_trending()`
- `signal_score_topic()`
This stage is the entry point. It pulls live data from trending sources rather than reading a prior manifest.
Writes for next stage + Grimoire
{
"id": "sig_2026_04_18_0001",
"topic": "rise-of-the-quiet-swordsman",
"hook_angle": "The strongest blade is the one never drawn.",
"trending_score": 0.87,
"sources": [
{ "platform": "tiktok", "volume": 41200, "velocity": 2.3 },
{ "platform": "reddit", "volume": 8900, "velocity": 1.7 },
{ "platform": "x", "volume": 15600, "velocity": 1.9 }
],
"archetype_match": "stoic_warrior",
"timestamp": "2026-04-18T09:22:14Z"
}
Quality Gate
- Trending score >= 0.6 (below that, topic is noise).
- At least two source platforms corroborate the signal.
- Archetype matches an entry in `brand/archetypes.yaml`.
- No copyrighted character names appear anywhere in the topic string.
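The four checks above could be expressed as one function that returns the list of failed checks. This is a sketch under assumptions: the archetype set would be loaded from `brand/archetypes.yaml`, the blocked-name list comes from wherever the copyright scan keeps it, and the function name and failure labels are invented for illustration.

```python
MIN_TRENDING_SCORE = 0.6   # below this, the topic is treated as noise
MIN_SOURCES = 2            # platforms that must corroborate the signal

def signal_gate(signal, archetypes, blocked_names):
    """Return the names of failed checks; an empty list means the gate passes."""
    failures = []
    if signal["trending_score"] < MIN_TRENDING_SCORE:
        failures.append("trending_score")
    platforms = {src["platform"] for src in signal["sources"]}
    if len(platforms) < MIN_SOURCES:
        failures.append("sources")
    if signal["archetype_match"] not in archetypes:
        failures.append("archetype")
    # Topic strings are slugs, so compare against a space-separated form.
    topic = signal["topic"].replace("-", " ").lower()
    if any(name.lower() in topic for name in blocked_names):
        failures.append("copyright")
    return failures

signal = {
    "topic": "rise-of-the-quiet-swordsman",
    "trending_score": 0.87,
    "sources": [{"platform": "tiktok"}, {"platform": "reddit"}, {"platform": "x"}],
    "archetype_match": "stoic_warrior",
}
# "Some Franchise Hero" is a made-up placeholder, not a real blocked term.
print(signal_gate(signal, {"stoic_warrior"}, ["Some Franchise Hero"]))  # []
```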
What Can Break
All scraped topics score below threshold.
Fallback · Use the evergreen archetype rotation in `grimoire/evergreen_archetypes.sql`, so there is always a legend to tell.
Source API rate-limited.
Fallback · Cached trending window from the last successful pull (max 6h stale), with a warning logged to Grimoire.
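The rate-limit fallback amounts to a staleness check on the cached pull. A minimal sketch, assuming the cache is a dict with `pulled_at` and `topics` keys (both names are hypothetical) and that the returned warning string is what gets logged to Grimoire:

```python
from datetime import datetime, timedelta, timezone

MAX_STALENESS = timedelta(hours=6)  # "max 6h stale" from the fallback rule

def pick_trending_window(live_pull, cache, now=None):
    """Return (topics, warning). `live_pull` is None when the API is rate-limited."""
    now = now or datetime.now(timezone.utc)
    if live_pull is not None:
        return live_pull, None
    if cache and now - cache["pulled_at"] <= MAX_STALENESS:
        age_min = int((now - cache["pulled_at"]).total_seconds() // 60)
        return cache["topics"], f"using cached trending window, {age_min} min old"
    return None, "no usable trending window"

now = datetime(2026, 4, 18, 12, 0, tzinfo=timezone.utc)
cache = {"pulled_at": datetime(2026, 4, 18, 9, 0, tzinfo=timezone.utc),
         "topics": ["rise-of-the-quiet-swordsman"]}
print(pick_trending_window(None, cache, now))
```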
Concept
Signal becomes logline, mascot, and duration target.
- Duration: ~5 min
- Cost (30s short): $0.02
- Skill: concept-forge
- MCP tools: none
Pure reasoning stage. The concept-forge skill takes the signal and forges a one-sentence logline, assigns the right mascot (AKASHI for cosmic narration, KAGE for stoic analysis, MIRA for curious wonder), picks the primary platform, and decides how long the short should breathe.
Reads from previous stage
{
"id": "sig_2026_04_18_0001",
"topic": "rise-of-the-quiet-swordsman",
"hook_angle": "The strongest blade is the one never drawn.",
"archetype_match": "stoic_warrior"
}
Writes for next stage + Grimoire
{
"id": "con_2026_04_18_0001",
"signal_id": "sig_2026_04_18_0001",
"logline": "A swordsman who refuses to draw becomes the one everyone fears.",
"mascot": "kage",
"platform_primary": "tiktok",
"duration_target_s": 30,
"hook": "What if the strongest fighter never swings?",
"payoff": "Because the threat of the blade is louder than the blade.",
"archetype_language": [
"the stoic warrior",
"restraint as power",
"the unsheathed answer"
]
}
Quality Gate
- Logline is a single sentence, under 18 words.
- Mascot field matches a character sheet in `mascots/`.
- No copyrighted character names — archetype language only.
- Duration target within {15, 30, 45, 60} seconds.
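A sketch of the structural checks in this gate, with the copyright scan left out. It assumes "under 18 words" means strictly fewer than 18, and it treats "single sentence" as exactly one terminal punctuation mark at the end of the logline; both readings, and the function name, are assumptions.

```python
ALLOWED_DURATIONS_S = {15, 30, 45, 60}
MAX_LOGLINE_WORDS = 18  # read as: word count must be strictly below 18

def concept_gate(concept, known_mascots):
    """Return the names of failed checks; an empty list means the gate passes."""
    failures = []
    logline = concept["logline"].strip()
    # One sentence: exactly one terminal mark, and it closes the logline.
    terminals = sum(logline.count(mark) for mark in ".!?")
    if not logline or terminals != 1 or logline[-1] not in ".!?":
        failures.append("logline_sentence")
    if len(logline.split()) >= MAX_LOGLINE_WORDS:
        failures.append("logline_length")
    if concept["mascot"] not in known_mascots:
        failures.append("mascot")
    if concept["duration_target_s"] not in ALLOWED_DURATIONS_S:
        failures.append("duration_target")
    return failures

concept = {
    "logline": "A swordsman who refuses to draw becomes the one everyone fears.",
    "mascot": "kage",
    "duration_target_s": 30,
}
print(concept_gate(concept, {"akashi", "kage", "mira"}))  # []
```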
What Can Break
Logline fails the copyright scan.
Fallback · Regenerate with an explicit constraint injected into the prompt listing the blocked names. Max 3 retries, then skip the signal.
Mascot assignment is ambiguous.
Fallback · Default to AKASHI (the cosmic narrator works for any archetype) and flag for human review in Grimoire.
Script
Concept becomes beats, VO lines, and on-screen text.
- Duration: ~10 min
- Cost (30s short): $0.05
- Skill: script-forge
- MCP tools: none
The script-forge skill expands the concept into a beat sheet. Each beat is timed to the second, carries the voiceover line, the on-screen text overlay, the visual direction for storyboard, and an audio cue for the music bed.
Reads from previous stage
{
"id": "con_2026_04_18_0001",
"logline": "A swordsman who refuses to draw becomes the one everyone fears.",
"mascot": "kage",
"duration_target_s": 30
}
Writes for next stage + Grimoire
{
"id": "scr_2026_04_18_0001",
"concept_id": "con_2026_04_18_0001",
"mascot": "kage",
"duration_s": 30,
"style_notes": "Measured cadence. Pauses over emphasis. Let silence strike.",
"beats": [
{
"beat_id": "b1",
"timing_s": [0, 4],
"vo_text": "They called him the quiet one.",
"on_screen_text": "THE QUIET ONE",
"visual_direction": "Wide. Figure alone on a rain-wet rooftop, sword sheathed.",
"audio_cue": "sub-bass swell, distant thunder"
},
{
"beat_id": "b2",
"timing_s": [4, 10],
"vo_text": "Three masters came. Three masters left.",
"on_screen_text": "3 MASTERS",
"visual_direction": "Cut between three silhouettes approaching, then retreating.",
"audio_cue": "taiko hit per master, room tone between"
}
],
"voiceover_lines": [
"They called him the quiet one.",
"Three masters came. Three masters left."
],
"on_screen_text": ["THE QUIET ONE", "3 MASTERS"]
}
Quality Gate
- Total timing of beats equals duration_s within ±0.5s.
- Every beat has vo_text, on_screen_text, visual_direction, audio_cue.
- Voice cadence matches the mascot's voice card in `mascots/<name>/voice.md`.
- Copyright scan passes on all vo_text and on_screen_text.
What Can Break
Beat timings drift outside the duration target.
Fallback · Automatic rebalance pass — trim the longest beat by the drift amount, reflow timings.
On-screen text exceeds 4 words per beat.
Fallback · Compression pass using the `brand/voice.md` compression rules. Fail the build if no compression is possible.
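The rebalance pass for timing drift can be sketched as follows: measure the drift, take it out of the longest beat, then recompute every start and end time in order. This is one reading of "trim the longest beat by the drift amount, reflow timings", not the skill's actual code; the function name is hypothetical.

```python
def rebalance(beats, duration_s):
    """Absorb timing drift in the longest beat and reflow all beat timings."""
    durations = [b["timing_s"][1] - b["timing_s"][0] for b in beats]
    drift = sum(durations) - duration_s          # positive when the script runs long
    longest = max(range(len(durations)), key=durations.__getitem__)
    if durations[longest] - drift <= 0:
        raise ValueError("drift is larger than the longest beat")
    durations[longest] -= drift
    reflowed, cursor = [], 0
    for beat, length in zip(beats, durations):
        reflowed.append({**beat, "timing_s": [cursor, cursor + length]})
        cursor += length
    return reflowed

beats = [
    {"beat_id": "b1", "timing_s": [0, 4]},
    {"beat_id": "b2", "timing_s": [4, 12]},
    {"beat_id": "b3", "timing_s": [12, 32]},   # script overruns 30s by 2s
]
print([b["timing_s"] for b in rebalance(beats, 30)])
# [[0, 4], [4, 12], [12, 30]]
```

Because the drift is signed, the same pass also stretches the longest beat when a script comes in short.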
Storyboard
Script becomes shots, Flux prompts, and Kling motion.
- Duration: ~15 min
- Cost (30s short): $0.10
- Skill: storyboard-forge
- MCP tools: none
Every beat is decomposed into one or more shots. The storyboard-forge skill writes a full Flux.1 image prompt (with style tokens from the mascot sheet), a Kling motion description, a camera angle, and the shot's slice of the beat's duration.
Reads from previous stage
{
"id": "scr_2026_04_18_0001",
"mascot": "kage",
"beats": [
{
"beat_id": "b1",
"timing_s": [0, 4],
"visual_direction": "Wide. Figure alone on a rain-wet rooftop, sword sheathed."
}
]
}
Writes for next stage + Grimoire
{
"id": "sb_2026_04_18_0001",
"script_id": "scr_2026_04_18_0001",
"shots": [
{
"shot_id": "s1",
"beat_id": "b1",
"image_prompt": "wide cinematic shot, lone hooded swordsman on rain-slick Kyoto rooftop at night, sheathed katana at hip, chrome-edged armor catching neon glow, volumetric rain, anime-studio line art, cel-shaded, moody cobalt and obsidian palette, golden rim light, 2.35:1",
"style_tokens": ["kage_core", "rain_neon_noir", "cel_shade_v3"],
"motion_direction": "slow dolly forward, rain falls in parallax layers, subtle cape flutter",
"camera_angle": "low three-quarter, 35mm equivalent",
"duration_s": 4,
"audio_cue": "sub-bass swell, distant thunder"
}
]
}
Quality Gate
- Every shot references a valid beat_id from the script.
- Image prompts include mascot style tokens from the character sheet.
- Sum of shot durations equals beat duration for each beat.
- No prompt references copyrighted characters, studios, or franchises.
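The two structural checks in this gate (valid `beat_id` references and per-beat duration sums) could look like the sketch below. The style-token and copyright checks are omitted, and the function name and failure messages are illustrative.

```python
from collections import defaultdict

def storyboard_gate(script, storyboard, tol=1e-6):
    """Check beat references and that shot durations add up per beat."""
    beat_len = {b["beat_id"]: b["timing_s"][1] - b["timing_s"][0]
                for b in script["beats"]}
    failures = []
    shot_total = defaultdict(float)
    for shot in storyboard["shots"]:
        if shot["beat_id"] not in beat_len:
            failures.append(f"{shot['shot_id']}: unknown beat {shot['beat_id']}")
            continue
        shot_total[shot["beat_id"]] += shot["duration_s"]
    for beat_id, length in beat_len.items():
        if abs(shot_total[beat_id] - length) > tol:
            failures.append(f"{beat_id}: shots sum to {shot_total[beat_id]}, beat is {length}")
    return failures

script = {"beats": [{"beat_id": "b1", "timing_s": [0, 4]}]}
storyboard = {"shots": [{"shot_id": "s1", "beat_id": "b1", "duration_s": 4}]}
print(storyboard_gate(script, storyboard))  # []
```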
What Can Break
Prompt auto-flags a blocked term.
Fallback · The forbidden-terms filter (`brand/never-say.txt`) rewrites to archetype language and re-validates.
Shot count explodes past budget (>12 shots for 30s).
Fallback · Merge adjacent shots that share the same camera_angle and mascot. Flag for review if the merged storyboard still exceeds budget.
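A sketch of the merge fallback. Two details here are assumptions rather than documented behavior: shots are only merged within the same beat (so the per-beat duration gate still holds), and the mascot is read from an optional per-shot `mascot` field, which the sample storyboard manifest does not include.

```python
MAX_SHOTS_30S = 12  # shot budget for a 30s short, from the failure case above

def _merge_key(shot):
    # Assumption: never merge across beats; mascot may be absent per shot.
    return (shot["beat_id"], shot["camera_angle"], shot.get("mascot"))

def merge_adjacent(shots):
    """Collapse runs of neighbouring shots that share the merge key."""
    merged = []
    for shot in shots:
        if merged and _merge_key(merged[-1]) == _merge_key(shot):
            merged[-1]["duration_s"] += shot["duration_s"]  # keep the first shot's prompt
        else:
            merged.append(dict(shot))
    return merged

shots = [
    {"shot_id": "s1", "beat_id": "b1", "camera_angle": "low three-quarter", "duration_s": 2},
    {"shot_id": "s2", "beat_id": "b1", "camera_angle": "low three-quarter", "duration_s": 2},
    {"shot_id": "s3", "beat_id": "b2", "camera_angle": "wide", "duration_s": 6},
]
result = merge_adjacent(shots)
needs_review = len(result) > MAX_SHOTS_30S
print([(s["shot_id"], s["duration_s"]) for s in result], needs_review)
# [('s1', 4), ('s3', 6)] False
```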
Generate
Storyboard becomes voice, images, clips, and music.
- Duration: ~12 min (parallel)
- Cost (30s short): $2.35
- Skill: generate-forge
- MCP tools: 4 tools
The heavy stage. Four parallel jobs: ElevenLabs synthesizes the voiceover from the mascot's voice clone, Flux.1 batch-generates every shot's keyframe through the style LoRA, Kling 2.x animates each keyframe into a video clip, and Suno v4 scores a custom music bed keyed to the beat timing.
MCP Tool Calls
- `gen_voice()`
- `gen_batch_images()`
- `gen_video_clip()`
- `gen_music()`
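One way to picture the fan-out is with `asyncio`. The four functions below are stand-in stubs that borrow the MCP tool names; their real signatures are not documented here, so every parameter is an assumption. Since each clip names a keyframe as its `source_frame`, this sketch chains image and clip generation inside one task and runs that task alongside voice and music.

```python
import asyncio

# Stubs standing in for the MCP tools; arguments and return shapes are invented.
async def gen_voice(beat_ids):
    await asyncio.sleep(0)
    return [f"vo_{b}" for b in beat_ids]

async def gen_batch_images(shot_ids):
    await asyncio.sleep(0)
    return {s: f"img_{s}_v1" for s in shot_ids}

async def gen_video_clip(shot_id, frame_id):
    await asyncio.sleep(0)
    return {"asset_id": f"vid_{shot_id}", "source_frame": frame_id}

async def gen_music(duration_s):
    await asyncio.sleep(0)
    return {"asset_id": "mus_main", "duration_s": duration_s}

async def frames_then_clips(shot_ids):
    frames = await gen_batch_images(shot_ids)          # clips need their keyframes first
    clips = await asyncio.gather(
        *(gen_video_clip(s, f) for s, f in frames.items())
    )
    return frames, clips

async def generate(beat_ids, shot_ids, duration_s):
    voice, (frames, clips), music = await asyncio.gather(
        gen_voice(beat_ids), frames_then_clips(shot_ids), gen_music(duration_s)
    )
    return {"voice_clips": voice, "frames": frames,
            "video_clips": clips, "music_beds": [music]}

assets = asyncio.run(generate(["b1"], ["s1"], 30))
print(assets["video_clips"][0]["source_frame"])  # img_s1_v1
```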
Reads from previous stage
{
"id": "sb_2026_04_18_0001",
"shots": [
{
"shot_id": "s1",
"image_prompt": "wide cinematic shot, lone hooded swordsman ...",
"motion_direction": "slow dolly forward, rain falls in parallax",
"duration_s": 4
}
]
}
Writes for next stage + Grimoire
{
"id": "ast_2026_04_18_0001",
"storyboard_id": "sb_2026_04_18_0001",
"frames": [
{
"asset_id": "img_s1_v1",
"shot_id": "s1",
"type": "keyframe",
"file_path": "out/frames/s1_v1.webp",
"provider": "flux-1.1-pro",
"prompt": "wide cinematic shot, lone hooded swordsman ...",
"seed": 1847203,
"model_version": "flux-1.1-pro@2026-03-11",
"lora": "kage_core_v3"
}
],
"video_clips": [
{
"asset_id": "vid_s1",
"shot_id": "s1",
"type": "clip",
"file_path": "out/clips/s1.mp4",
"provider": "kling-2.1",
"source_frame": "img_s1_v1",
"motion_prompt": "slow dolly forward, rain falls in parallax",
"duration_s": 4,
"seed": 447281,
"model_version": "kling-2.1@2026-02-20"
}
],
"voice_clips": [
{
"asset_id": "vo_b1",
"beat_id": "b1",
"file_path": "out/vo/b1.mp3",
"provider": "elevenlabs",
"voice_id": "kage_v2_clone",
"duration_s": 3.6
}
],
"music_beds": [
{
"asset_id": "mus_main",
"file_path": "out/music/main.mp3",
"provider": "suno-v4",
"prompt": "cinematic taiko + sub-bass, 86 bpm, sparse, mythic",
"duration_s": 30
}
]
}
Quality Gate
- Every shot has at least one keyframe asset.
- Every beat has a voice clip within ±0.3s of beat duration.
- Video clip duration matches shot duration exactly (re-render if drift).
- Every asset has seed + provider + model_version logged — no exceptions.
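The provenance rule is easy to enforce mechanically. A minimal sketch, assuming the four asset groups use the key names from the sample manifest; the helper name is hypothetical.

```python
REQUIRED = ("seed", "provider", "model_version")
GROUPS = ("frames", "video_clips", "voice_clips", "music_beds")

def missing_provenance(assets):
    """Map asset_id -> list of required provenance fields it lacks."""
    gaps = {}
    for group in GROUPS:
        for asset in assets.get(group, []):
            missing = [field for field in REQUIRED if field not in asset]
            if missing:
                gaps[asset["asset_id"]] = missing
    return gaps

assets = {
    "frames": [{"asset_id": "img_s1_v1", "provider": "flux-1.1-pro",
                "seed": 1847203, "model_version": "flux-1.1-pro@2026-03-11"}],
    "voice_clips": [{"asset_id": "vo_b1", "provider": "elevenlabs",
                     "voice_id": "kage_v2_clone"}],
}
print(missing_provenance(assets))  # {'vo_b1': ['seed', 'model_version']}
```

Applied literally, this check would flag the voice and music entries as they appear in the sample manifest, since those list a provider but no seed or model version. Whether audio assets record those fields elsewhere is not stated.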
What Can Break
Kling 2.x queue is backed up or fails.
Fallback · Switch to Wan 2.2 using the same source frame and motion prompt. The provider field is updated in Grimoire for lineage.
Flux generation drifts off-mascot.
Fallback · Retry with LoRA weight bumped +15%, seed rotated. After 3 tries, flag for human regeneration.
ElevenLabs voice clone sounds unnatural on a specific line.
Fallback · Regenerate with the fallback preset (stability=0.6, similarity=0.85). Persist successful presets to the mascot voice card.
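The video-provider fallback follows a common pattern: try providers in order, pass the same inputs to each, and record which one actually produced the clip. In this sketch the two render functions are stubs, and `ProviderError` and the provider labels are illustrative names rather than anything the providers expose.

```python
class ProviderError(Exception):
    """Raised by a provider stub when it cannot render."""

def render_clip(shot, providers):
    """Try each (name, render_fn) in order; tag the clip with the provider used."""
    errors = []
    for name, render in providers:
        try:
            clip = render(shot["source_frame"], shot["motion_prompt"])
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
            continue
        return {**clip, "provider": name}   # provider recorded for lineage
    raise ProviderError("; ".join(errors))

def kling_stub(frame, motion):
    raise ProviderError("queue backed up")

def wan_stub(frame, motion):
    return {"source_frame": frame, "motion_prompt": motion}

shot = {"source_frame": "img_s1_v1",
        "motion_prompt": "slow dolly forward, rain falls in parallax"}
clip = render_clip(shot, [("kling-2.1", kling_stub), ("wan-2.2", wan_stub)])
print(clip["provider"])  # wan-2.2
```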
Assemble
Assets become a rendered MP4 via Remotion.
- Duration: ~5 min
- Cost (30s short): $0.00
- Skill: remotion-composer
- MCP tools: none
The remotion-composer skill maps every asset into the right Remotion composition slot, wires the beat timeline, mixes VO over music at -14 LUFS, renders headless, and writes the final MP4 to disk. No human touches this stage in the normal path.
Reads from previous stage
{
"script": { "id": "scr_...", "beats": [...] },
"assets": { "id": "ast_...", "frames": [...], "video_clips": [...] }
}
Writes for next stage + Grimoire
{
"id": "asm_2026_04_18_0001",
"assets_id": "ast_2026_04_18_0001",
"remotion_composition": "PowerScalingShort",
"composition_props": {
"mascot": "kage",
"beats": "[...]",
"video_clips": "[...]"
},
"output_path": "out/renders/kage_quiet_swordsman_2026_04_18.mp4",
"duration_s": 30,
"render_settings": {
"resolution": "1080x1920",
"fps": 30,
"crf": 18,
"audio_bitrate": "320k",
"audio_target_lufs": -14
},
"render_log": {
"frames_rendered": 900,
"wall_clock_s": 287,
"errors": []
}
}
Quality Gate
- Render completes with zero errors.
- Output duration within ±0.5s of script duration.
- Integrated loudness within -14 LUFS ± 1 dB.
- Output resolution matches platform target (1080x1920 for vertical).
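These four checks map directly onto fields of the assembly manifest. The sketch below assumes the integrated loudness of the rendered file has already been measured by some external tool and is passed in as `measured_lufs`; that parameter and the function name are not from the pipeline.

```python
def render_gate(assembly, script_duration_s, measured_lufs,
                target_resolution="1080x1920"):
    """Return the names of failed checks; an empty list means the gate passes."""
    settings = assembly["render_settings"]
    failures = []
    if assembly["render_log"]["errors"]:
        failures.append("render_errors")
    if abs(assembly["duration_s"] - script_duration_s) > 0.5:
        failures.append("duration")
    if abs(measured_lufs - settings["audio_target_lufs"]) > 1.0:
        failures.append("loudness")
    if settings["resolution"] != target_resolution:
        failures.append("resolution")
    return failures

assembly = {
    "duration_s": 30,
    "render_settings": {"resolution": "1080x1920", "fps": 30, "audio_target_lufs": -14},
    "render_log": {"frames_rendered": 900, "errors": []},
}
print(render_gate(assembly, script_duration_s=30, measured_lufs=-14.3))  # []
```

The sample log's 900 rendered frames is consistent with 30 fps over 30 seconds.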
What Can Break
Remotion render crashes mid-composition.
Fallback · Resume from the last successful frame via the Remotion cache. If cache is corrupt, full re-render with lower concurrency.
Asset file missing when composer tries to mount it.
Fallback · Re-run only the missing asset through Stage 5 (single-shot mode), then resume assembly.
Distribute
MP4 becomes posts on every platform that matters.
- Duration: ~3 min
- Cost (30s short): $0.00
- Skill: publish-orchestrator
- MCP tools: 4 tools
The publish-orchestrator skill crops and re-encodes for each platform's spec, writes the caption set (platform-specific hook, hashtag strategy, CTA), and schedules or publishes via each platform's API. Every post ID is logged back to Grimoire for performance retrieval.
MCP Tool Calls
- `publish_tiktok()`
- `publish_youtube_shorts()`
- `publish_instagram_reels()`
- `publish_x()`
Reads from previous stage
{
"id": "asm_2026_04_18_0001",
"output_path": "out/renders/kage_quiet_swordsman_2026_04_18.mp4",
"duration_s": 30
}
Writes for next stage + Grimoire
{
"id": "pst_2026_04_18_0001",
"assembly_id": "asm_2026_04_18_0001",
"platforms": ["tiktok", "youtube_shorts", "instagram_reels", "x"],
"post_ids": {
"tiktok": "7349928472018334",
"youtube_shorts": "dQw4w9W_legend",
"instagram_reels": "C8xYq2kLmPq",
"x": "1781409823741820"
},
"captions": {
"tiktok": "The strongest blade is the one never drawn. #anime #ai #kage",
"youtube_shorts": "When silence outranks steel. A legend in 30 seconds.",
"x": "He never drew. He didn't have to."
},
"scheduled_at": "2026-04-18T18:00:00Z",
"status": "scheduled"
}
Quality Gate
- Every target platform receives a post_id or an explicit skip reason.
- Captions pass the brand voice check (no hype, no cringe, no copy).
- Hashtag set matches the channel strategy in `content/channel-strategy.md`.
- Scheduled time falls inside the platform's audience-peak window.
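The first check in this gate can be sketched as a lookup over the post manifest. The sample manifest shows `post_ids` but no field for skip reasons, so the `skip_reasons` key used here is a hypothetical name.

```python
def unaccounted_platforms(post):
    """Platforms that have neither a post_id nor an explicit skip reason."""
    post_ids = post.get("post_ids", {})
    skip_reasons = post.get("skip_reasons", {})  # hypothetical field name
    return [p for p in post["platforms"]
            if not post_ids.get(p) and not skip_reasons.get(p)]

post = {
    "platforms": ["tiktok", "youtube_shorts", "instagram_reels", "x"],
    "post_ids": {"tiktok": "7349928472018334", "x": "1781409823741820"},
    "skip_reasons": {"instagram_reels": "auth error, queued for retry"},
}
print(unaccounted_platforms(post))  # ['youtube_shorts']
```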
What Can Break
Platform API returns auth error.
Fallback · Queue the post to the retry table; operator is paged via the Grimoire alert webhook.
Caption fails voice check.
Fallback · Regenerate using the stricter `brand/voice-compress.md` prompt. Max 2 retries, then human review.
The Grimoire
Every prompt, every seed, every metric, every quality-gate result — logged. This is the asset that compounds while everyone else ships and forgets.
Every prompt
Full text, model version, provider, seed, and the LoRA weights active at generation time.
Every gen
Input → output binding, wall-clock time, cost in USD, and the quality-gate result that followed.
Every lineage
Signal → Post, end-to-end, with every intermediate manifest hash so remixing is just a tree walk.
Every metric
Per-platform views, watch-time curves, retention spikes, and comment sentiment — all joined to assets.
12 tables. One pgvector index.
Postgres is the spine. The 12 tables mirror the pipeline manifests 1:1 — signals, concepts, scripts, storyboards, assets, assemblies, posts — plus lineage, metrics, quality_events, embeddings, experiments, and lora_sessions.
The embeddings table uses pgvector for semantic retrieval — so “find shots that looked like this one and scored above 90th percentile” is a single query.
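As an illustration of that kind of query, here is a sketch held as a Python string with psycopg-style named parameters. The column names (`embeddings.asset_id`, `embeddings.embedding`, `metrics.asset_id`, `metrics.retention`) are assumptions, since the real DDL lives in `grimoire/schema.sql`, and per-platform metrics would likely need aggregating first. The `<=>` operator is pgvector's cosine distance.

```python
# Hypothetical schema: embeddings(asset_id, embedding vector),
# metrics(asset_id, retention). Adjust to the actual DDL.
SIMILAR_TOP_SHOTS = """
WITH cutoff AS (
    SELECT percentile_cont(0.9) WITHIN GROUP (ORDER BY retention) AS p90
    FROM metrics
)
SELECT e.asset_id,
       m.retention,
       e.embedding <=> %(query_vec)s::vector AS cosine_distance
FROM embeddings AS e
JOIN metrics AS m ON m.asset_id = e.asset_id
CROSS JOIN cutoff
WHERE m.retention > cutoff.p90
ORDER BY cosine_distance
LIMIT %(k)s;
"""

def to_pgvector_literal(vec):
    """Format a Python sequence in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

params = {"query_vec": to_pgvector_literal([0.1, 0.2, 0.3]), "k": 10}
print(params["query_vec"])  # [0.1,0.2,0.3]
```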
See grimoire/schema.sql in the repo for the complete DDL.
One line per asset — append-only, full lineage preserved
{
"log_id": "grm_2026_04_18_0001_img_s1_v1",
"timestamp": "2026-04-18T09:31:47Z",
"stage": "generate",
"event": "asset.created",
"lineage": {
"signal_id": "sig_2026_04_18_0001",
"concept_id": "con_2026_04_18_0001",
"script_id": "scr_2026_04_18_0001",
"storyboard_id": "sb_2026_04_18_0001",
"asset_id": "img_s1_v1"
},
"provider": "flux-1.1-pro",
"model_version": "flux-1.1-pro@2026-03-11",
"lora": "kage_core_v3",
"seed": 1847203,
"prompt_hash": "sha256:9c4b…a102",
"prompt_embedding_id": "emb_0001_8f2a",
"wall_clock_s": 8.4,
"cost_usd": 0.038,
"quality_gate": {
"mascot_consistency": 0.94,
"copyright_scan": "pass",
"status": "pass"
},
"output_path": "r2://animelegends/frames/s1_v1.webp"
}
The dataset that retrains the tools.
LoRA retraining
Top-performing shots (by retention × sentiment) become the training set for the next version of each mascot's style LoRA.
Example · KAGE core LoRA v3 was trained on the 184 highest-retention KAGE shots from v2. v3 lifted mean watch-time by 14%.
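Selecting the training set reduces to ranking shots by the product of the two scores and keeping the top N. A minimal sketch, assuming each shot row carries `retention` and `sentiment` as numbers; the field names and the sample values below are made up for illustration.

```python
def select_training_shots(shots, top_n):
    """Rank shots by retention x sentiment and return the top asset ids."""
    ranked = sorted(shots,
                    key=lambda s: s["retention"] * s["sentiment"],
                    reverse=True)
    return [s["asset_id"] for s in ranked[:top_n]]

# Invented numbers, only to show the ordering.
shots = [
    {"asset_id": "img_a", "retention": 0.8, "sentiment": 0.5},   # score 0.40
    {"asset_id": "img_b", "retention": 0.6, "sentiment": 0.9},   # score 0.54
    {"asset_id": "img_c", "retention": 0.9, "sentiment": 0.3},   # score 0.27
]
print(select_training_shots(shots, top_n=2))  # ['img_b', 'img_a']
```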
Performance Q&A
Ask natural questions against the dataset. The pipeline itself answers because every artifact is joined to its outcome.
Example · 'What hooks outperform on TikTok for stoic archetypes?' runs as one SQL query over posts × embeddings × metrics.
Hook pattern learning
Clustering hook embeddings reveals which opening structures produce the retention spike. New scripts sample from winning clusters.
Example · The 'three {plural-noun}. three {plural-noun}.' pattern appeared in the top 5% of hooks — now a sampled template.
No Supabase key? Grimoire writes newline-delimited JSON to grimoire/local/*.jsonl instead. Same schema, same queries (via DuckDB) — the dataset never breaks because the infra isn't ready yet.
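The local fallback can be sketched with the standard library alone: append one JSON object per line, never rewrite. Writing one file per stage under `grimoire/local/` is an assumption about the layout, and both helper names are hypothetical.

```python
import json
from pathlib import Path

def log_event(event, root="grimoire/local"):
    """Append one event as a single JSON line; existing lines are never rewritten."""
    path = Path(root) / f"{event['stage']}.jsonl"   # assumed: one file per stage
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")
    return path

def read_events(path):
    """Load every event from a .jsonl file, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

A file written this way is plain newline-delimited JSON, so it can be loaded by any tool that reads that format.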
$2.52 per short. ~52 minutes per run.
No smoke, no black boxes. Here's where every cent and every second goes for a 30-second vertical short.
$2.52
- 01 Signal: $0.00
- 02 Concept: $0.02
- 03 Script: $0.05
- 04 Storyboard: $0.10
- 05 Generate: $2.35
- 06 Assemble: $0.00
- 07 Distribute: $0.00
Stage 5 (Generate) is 93% of the bill. Everything else is reasoning tokens, router calls, or self-hosted compute. That's the right shape: we pay for pixels, not thinking.
~52 min
- 01 Signal: ~2 min
- 02 Concept: ~5 min
- 03 Script: ~10 min
- 04 Storyboard: ~15 min
- 05 Generate: ~12 min (parallel)
- 06 Assemble: ~5 min
- 07 Distribute: ~3 min
Stage 5 runs four providers in parallel; sequential would double the clock. Stage 4 is the longest stage on the clock (model reasoning time, not billing), and it's worth every second: a bad storyboard poisons the full render.
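The totals follow directly from the two per-stage breakdowns above. A quick arithmetic check in Python, using `Decimal` so the dollar amounts add exactly:

```python
from decimal import Decimal

COST_USD = {
    "signal": Decimal("0.00"), "concept": Decimal("0.02"),
    "script": Decimal("0.05"), "storyboard": Decimal("0.10"),
    "generate": Decimal("2.35"), "assemble": Decimal("0.00"),
    "distribute": Decimal("0.00"),
}
MINUTES = {"signal": 2, "concept": 5, "script": 10, "storyboard": 15,
           "generate": 12, "assemble": 5, "distribute": 3}

total_cost = sum(COST_USD.values())
generate_share = round(COST_USD["generate"] / total_cost * 100)
total_minutes = sum(MINUTES.values())

print(total_cost, generate_share, total_minutes)  # 2.52 93 52
```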
Sum of all providers on a nominal run.
Assuming ~17k avg views/short. The flywheel is profitable at 3k views.
Operators review outputs; they don't touch timelines. That's the point.
Watch it run, or run it yourself.
Every stage you just read is open and reproducible. Pick a side: consume the output, or clone the engine.
The Forge ships SKILL.md files, slash commands, Remotion templates, and MCP servers — everything you need to reproduce the pipeline in your own repo. MIT-compatible license.
