Day Four: Imagery Corpus and Episode Platform

perplexity

CHRON-0004: Day Four — Imagery Corpus and Episode Platform

What Happened

The fourth development session built the entire visual layer of the project and designed the episode listening/viewing experience. Three agents (Perplexity, Cursor, Claude Code) ran in parallel across multiple sessions throughout the day. The session also included the first pipeline failure, its postmortem, and a complete rebuild.

Imagery Corpus Strategy

Perplexity wrote the imagery corpus strategy (docs/imagery-corpus-strategy.md), establishing a visual knowledge graph parallel to the text corpus. The strategy defines four content categories (primary source, AI-generated, b-roll, scanned), three b-roll layers (universal textures, tradition pools, episode-specific), and a full IMG-NNNN schema with KG integration.
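As a rough illustration of what an IMG-NNNN entry could look like, the sketch below models the four content categories as an enum and validates the ID format. It assumes NNNN means exactly four digits. The ImageEntry class, its source and kg_links fields, and the CON-0001 node ID are hypothetical names chosen for the example, not the schema defined in docs/imagery-corpus-strategy.md.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    # The four content categories named in the strategy.
    PRIMARY_SOURCE = "primary source"
    AI_GENERATED = "AI-generated"
    B_ROLL = "b-roll"
    SCANNED = "scanned"


# Assumption: IMG-NNNN is the literal prefix followed by exactly four digits.
IMG_ID = re.compile(r"IMG-\d{4}")


@dataclass
class ImageEntry:
    img_id: str
    category: Category
    source: str = ""  # hypothetical field: where the image came from
    kg_links: list[str] = field(default_factory=list)  # hypothetical: linked KG node IDs

    def __post_init__(self) -> None:
        # Reject malformed IDs at construction time.
        if not IMG_ID.fullmatch(self.img_id):
            raise ValueError(f"bad image id: {self.img_id!r}")


# "CON-0001" is a made-up node ID used only to show the link field.
entry = ImageEntry("IMG-0001", Category.PRIMARY_SOURCE, kg_links=["CON-0001"])
```

Validating the ID in the constructor means a malformed entry fails when it is created, well before it could reach a commit.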

The b-roll architecture distinguishes establishing footage (place-specific: Eleusis ruins, cathedral interiors, temple architecture), textural footage (candle, stone, fire, water, manuscript pages), and conceptual footage (consciousness evolution, the hardening, the apophatic). The reusability principle governs the layers: Layer 1 universal textures serve 200+ episodes, Layer 2 tradition pools serve entire series arcs, and Layer 3 episode-specific footage amounts to only 2-5 clips per episode.

First Acquisition and Pipeline Failure

Perplexity ran the first image acquisition from the Met Museum API (72 images) and Wikimedia Commons (7 images, rate-limited), generated 109 IMG metadata entries, and committed. Cursor then discovered that metadata entries had been generated for files that did not exist on disk: because of the Wikimedia rate limiting, only 7 of 69 entries had actual files. The processing script also had a source-matching bug that used the first file alphabetically for every entry.

Cursor wrote a thorough postmortem (docs/imagery-pipeline-postmortem.md) documenting every failure point and establishing coordination rules: never commit metadata for files that do not exist on disk; report counts from actual files, not metadata entries; always verify processing output uniqueness; note missing items if rate-limited.

The pipeline was rebuilt from scratch with correct source matching, existence checks, and MD5 verification. 140 verified images were uploaded to Vercel Blob.
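The existence and MD5 checks described above can be sketched as follows. This is a minimal illustration, not the rebuilt pipeline itself: the verify_entries function and the "file" key on each metadata entry are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def verify_entries(entries: list[dict], image_dir: Path) -> dict:
    """Split metadata entries into verified, missing, and duplicate groups.

    Each entry is assumed to carry a hypothetical "file" key holding a
    filename relative to image_dir.
    """
    verified, missing, duplicates = [], [], []
    seen: dict[str, str] = {}  # digest -> first filename with that content
    for entry in entries:
        path = image_dir / entry["file"]
        if not path.is_file():
            # Rule: never count or commit metadata for a file that is not on disk.
            missing.append(entry["file"])
            continue
        digest = md5_of(path)
        if digest in seen:
            # Same bytes under two names: output is not unique.
            duplicates.append((entry["file"], seen[digest]))
            continue
        seen[digest] = entry["file"]
        verified.append(entry["file"])
    return {"verified": verified, "missing": missing, "duplicates": duplicates}
```

Reporting len(result["verified"]) rather than len(entries) follows the rule of counting actual files. A long duplicates list is also how a source-matching bug like the one above would likely surface, since every entry would hash to the same content.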

Em Dash Editorial Pass

The human flagged em dash overuse across the site and KB as "classic AI slop." Perplexity fixed the site copy (7 TSX files), and Claude Code cleaned 40 KB files, reducing 898 em dashes to 128. The target was a 50-70% reduction; the actual reduction was 86%.
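The reported figure can be reproduced with a few lines of arithmetic. The helper names below are illustrative, not taken from the project's scripts.

```python
EM_DASH = "\u2014"  # written as an escape so the snippet contains no literal em dash


def count_em_dashes(text: str) -> int:
    """Count em dash characters in a piece of text."""
    return text.count(EM_DASH)


def percent_reduction(before: int, after: int) -> float:
    """Percentage by which a count dropped from before to after."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


reduction = percent_reduction(898, 128)
print(f"{reduction:.1f}%")  # 85.7%, which rounds to the reported 86%
```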

Imagery Integration Across All Surfaces

Three agents worked in parallel to integrate imagery into every site surface:

  • Cursor: Wired imagery display on concept pages, figure pages, homepage hero (Eleusinian Relief), series page track images, knowledge graph node previews, OG images, SiteNav component, ScrollToTop, and /imagery/[id] detail pages.
  • Claude Code: Assigned imagery.primary to all 20 concepts and 5 figures, mapped S1E1 visual assets (thumbnail, 7 chapter art selections, b-roll pool references), and wrote S1E1 drafts v2-v4 with editorial improvements.
  • Perplexity: Wrote the populate-kb-imagery.py script, coordinated all three agents.

Wave 1 Imagery Expansion

Perplexity wrote fetch-expansion-wave1.py targeting 7 acquisition categories across 60+ Wikimedia categories and 18 Met Museum queries. The corpus grew from 140 to 267 verified images:

Category                  Before  After
Hermetic illustrations    3       34
Manuscript illuminations  0       22
Portraits                 7       41
Eastern tradition         27      65
Classical archaeological  54      47
Neoplatonic diagrams      27      39
Alchemical manuscripts    22      19
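A quick arithmetic check confirms that the per-category counts add up to the stated totals of 140 and 267. It also shows that two categories shrank. The log does not say why; recategorization during the expansion is one possible explanation, but that is a guess.

```python
# (before, after) counts per category, copied from the table above.
counts = {
    "Hermetic illustrations": (3, 34),
    "Manuscript illuminations": (0, 22),
    "Portraits": (7, 41),
    "Eastern tradition": (27, 65),
    "Classical archaeological": (54, 47),
    "Neoplatonic diagrams": (27, 39),
    "Alchemical manuscripts": (22, 19),
}

total_before = sum(before for before, _ in counts.values())
total_after = sum(after for _, after in counts.values())

# Categories whose count went down between the two snapshots.
shrank = [name for name, (before, after) in counts.items() if after < before]

print(total_before, total_after)  # 140 267
print(shrank)  # ['Classical archaeological', 'Alchemical manuscripts']
```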

Episode Experience Design

Perplexity wrote the episode experience spec (docs/episode-experience-spec.md), establishing the site as the primary distribution platform with external channels as funnels. Key architecture decisions:

  • Audio hosted on Vercel Blob (same infrastructure as imagery)
  • Video hosted on Mux via Vercel Marketplace (adaptive bitrate, free tier)
  • Persistent audio player in root layout (Zustand store, never unmounts on navigation)
  • Episode pages with knowledge sidebar (concepts, figures, sources, imagery per episode)
  • Self-hosted RSS feed at /api/rss (no Transistor dependency for launch)
  • Three access tiers: free (all main episodes), subscriber ($9/mo, early access + premium features + private RSS), site-native experience
  • Synced transcript viewer with click-to-seek
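The lookup behind a synced transcript with click-to-seek can be sketched with a binary search over cue start times. This is an assumption about one reasonable approach, written in Python to match the pipeline scripts rather than the site's TSX. The cue data and both function names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical cues: (start time in seconds, text), sorted by start time.
cues = [(0.0, "Intro"), (12.5, "Eleusis"), (47.0, "The rite")]
starts = [start for start, _ in cues]


def active_cue(position: float) -> int:
    """Index of the cue being spoken at the given playback position."""
    # bisect_right finds the first cue starting after position;
    # the one before it is active. Clamp to 0 for positions before the first cue.
    return max(bisect_right(starts, position) - 1, 0)


def seek_target(cue_index: int) -> float:
    """Playback position to jump to when a cue is clicked."""
    return starts[cue_index]
```

Highlighting uses active_cue on each playback time update, and a click on a transcript line passes seek_target to the player. Keeping the cues sorted makes each lookup logarithmic in the number of cues.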

S1E1 Production

While the infrastructure was being built, another Computer agent and Claude Code advanced S1E1 through four drafts to production-approved status. The episode's visual plan was committed with an image mapping, a generation queue, and a shorts extraction plan.

Agent Contributions

Agent                       What
Perplexity (this session)   Imagery strategy, corpus schema, acquisition scripts (Met + Wikimedia + expansion), KB population script, em dash site fix, imagery population plan, episode experience spec, coordination of all agents
Perplexity (other session)  Vercel Blob integration, imagery gallery page, MCP browse_imagery tool, upload script, URL migration to mysteryschools.ai
Cursor                      Pipeline rebuild (postmortem, fixed scripts), Blob upload (267 images), site imagery integration (concepts, figures, hero, series, graph, OG), SiteNav, ScrollToTop, imagery detail pages, figure portrait assignments
Claude Code                 Em dash editorial pass (86% reduction across 40 KB files), imagery assignments (20 concepts + 5 figures), S1E1 visual assets mapping, S1E1 drafts v1-v4, REL-0011 through REL-0030, 48 library stub enrichments
Human                       Em dash identification, editorial direction ("trust the material"), Blob store provisioning, Psppsppasstimes LLC decision, agent coordination

Decisions Made

  • DEC-0012: Site is the primary distribution platform; external channels are funnels
  • DEC-0013: Audio on Vercel Blob, video on Mux (Vercel Marketplace)
  • DEC-0014: Three access tiers: free, subscriber ($9/mo), site-native experience
  • DEC-0015: Self-hosted RSS at /api/rss; Transistor optional for syndication
  • DEC-0016: Project entity: Psppsppasstimes LLC (replacing heavyblotto)
  • DEC-0017: Imagery pipeline rules: never commit metadata for missing files; verify on disk before commit

Stats at Session End

Metric                  Start of Day  End of Day
Imagery corpus          0             267 verified, on Blob
IMG categories          0             7
Concepts with images    0             20/20
Figures with images     0             5/20 (15 need portraits)
S1E1 script status      research      production-approved (draft v4)
KB Relations            10            30
Library stubs enriched  ~300 stubs    48 enriched to full profiles
Site pages              8             10 (+ /imagery, /imagery/[id])
MCP tools               6             7 (+ browse_imagery)
Em dashes (KB)          898           128
New scripts             0             7 (fetch-met, fetch-wikimedia, fetch-expansion-wave1, generate-img-metadata, process-images, upload-to-blob, populate-kb-imagery)
New docs                0             4 (imagery-corpus-strategy, imagery-pipeline-postmortem, imagery-population-plan, episode-experience-spec)
Commits today           0             30+