ChannelHelm understands a video through a four-layer pipeline (audio → visual → fusion → intelligence) and drafts every asset for every platform — locally, on your own Mac. This page is the picture-version of what's built and what happens behind that "drop or paste a link" button.
A Next.js dashboard, a custom job queue, a fleet of small workers, four Python ML CLIs, and a couple of external services for publishing — all running locally.
Every piece below is on main: typecheck/lint/build clean, with a real Postgres test suite (Vitest + Testcontainers).
Plan, Produce, Publish, and Channel surfaces now wrap the original Studio. Plan covers ideas/scripts/mining; Produce keeps ingest and package work; Publish covers review/dispatch/calendar; Channel covers linear/FAST programming, HLS, analytics, and restream targets.
Drop a file or paste a URL. YouTube links auto-detect the brand from the channel (yt-dlp + website fallback). Uploads stream to MEDIA_ROOT under a CSRF guard with a hard size cap.
Four layers: audio (transcribe + diarize), visual (frames + OCR + VLM), fusion (scene log), intelligence (LLM analysis). Visual phase: scene-cut-driven VLM keyframes (vs fps=1), 768 px downscale, OCR ∥ VLM, and profile-aware OCR fps (0.5 standard / 1 premium) — ~12–14× faster than the original. Workers claim jobs with SKIP LOCKED; stale locks self-reclaim.
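To make the scene-cut-driven sampling concrete, here is a minimal sketch comparing it with fixed fps=1 sampling. The midpoint rule and both function names are assumptions made for illustration; the page does not say which frame within a scene is chosen. Only the OCR fps values (0.5 standard, 1 premium) come from the text above.

```python
from typing import Dict, List

# OCR sampling rate per profile, as stated in the text above.
OCR_FPS: Dict[str, float] = {"standard": 0.5, "premium": 1.0}


def keyframes_from_scene_cuts(cuts: List[float], duration: float) -> List[float]:
    """Return one timestamp per scene, taken at the scene midpoint.

    The midpoint choice is an assumption for illustration only.
    """
    # Keep only cuts that fall strictly inside the video, then add both ends.
    bounds = [0.0] + sorted(c for c in cuts if 0.0 < c < duration) + [duration]
    # Skip zero-length scenes that duplicate cuts would create.
    return [(a + b) / 2 for a, b in zip(bounds, bounds[1:]) if b > a]


def keyframes_fixed_fps(duration: float, fps: float = 1.0) -> List[float]:
    """Return evenly spaced timestamps, the fps=1 baseline by default."""
    step = 1.0 / fps
    return [i * step for i in range(int(duration * fps))]
```

For a 60-second video with three cuts, the scene-driven variant yields four frames for the VLM where fps=1 yields sixty, which is the kind of reduction the speedup relies on.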
Custom ~150-line layer on Postgres. Idempotency keys de-dupe; per-kind lock timeouts requeue crashed jobs; RequeueLater defers rate-limited dispatches to the next UTC day instead of failing. Worker runs N concurrent claim slots (default 3, tune via WORKER_CONCURRENCY) — SKIP LOCKED is the only mutex; the 10 generate_asset jobs per package finish ~3× faster.
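The queue semantics above can be modeled in memory. This is a sketch only: the real layer runs on Postgres with SKIP LOCKED, and the class, field names, and 60-second default timeout here are hypothetical. It shows idempotency keys de-duplicating enqueues and a per-kind lock timeout making a crashed job claimable again.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Job:
    key: str                          # idempotency key, e.g. "dispatch:42"
    kind: str
    locked_at: Optional[float] = None
    done: bool = False


class MiniQueue:
    """In-memory stand-in for the Postgres queue (illustration only)."""

    def __init__(self, lock_timeouts: Dict[str, float], default_timeout: float = 60.0):
        self.jobs: Dict[str, Job] = {}
        self.lock_timeouts = lock_timeouts
        self.default_timeout = default_timeout  # assumed value, not from the source

    def enqueue(self, key: str, kind: str) -> bool:
        """Add a job; return False if the idempotency key already exists."""
        if key in self.jobs:
            return False
        self.jobs[key] = Job(key, kind)
        return True

    def claim(self, now: float) -> Optional[Job]:
        """Lock and return the first job that is unlocked or whose lock is stale."""
        for job in self.jobs.values():
            if job.done:
                continue
            timeout = self.lock_timeouts.get(job.kind, self.default_timeout)
            stale = job.locked_at is not None and now - job.locked_at >= timeout
            if job.locked_at is None or stale:
                job.locked_at = now
                return job
        return None
```

In the database version, the claim loop would be a single `SELECT ... FOR UPDATE SKIP LOCKED` statement, so concurrent claim slots never block each other.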
OpenAI · Anthropic · OpenRouter · Ollama · LM Studio · OpenClaw · Codex CLI. Quick Presets, Fetch Models, per-profile routing. API keys encrypted at rest + never serialized to the client.
Editorial → DojoClaw §8.2 (/api/v1/articles/from-brief). Social → Zernio §9.3 (per-account platforms[], ChannelHelm metadata, X threads, signed clip URLs §9.4, 24-h rate limit §9.6). YouTube → direct upload via per-brand OAuth (defaults to private).
Per-Short workspace: word-snap trim, captions, animated ASS subtitle styles, live overlay preview, dual-crop, active-speaker reframe, b-roll compositing, redaction, and per-clip publish options. The short_clip_plan is the editable truth; clip_render UPSERTs the build output and bumps render_rev.
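The UPSERT-and-bump behavior of clip_render can be illustrated with a dictionary standing in for the table. The row shape, the function name, and the starting revision of 1 are assumptions; only render_rev and the reuse of the asset id come from the text.

```python
from typing import Dict


def upsert_render(renders: Dict[str, dict], asset_id: str, mp4_path: str) -> dict:
    """Insert the render row, or overwrite it in place and bump render_rev."""
    existing = renders.get(asset_id)
    if existing is None:
        # First render for this clip. Starting at 1 is an assumption.
        row = {"asset_id": asset_id, "mp4_path": mp4_path, "render_rev": 1}
    else:
        # Same asset id, new build output, next revision.
        row = {**existing, "mp4_path": mp4_path, "render_rev": existing["render_rev"] + 1}
    renders[asset_id] = row
    return row
```

Because the key never changes, anything that references the asset id (publish history, for example) stays attached across re-renders.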
When a clip plan lands, generate_asset fans out one clip_render per clip — so you open the Shorts tab to ready-to-preview MP4s, not an empty "click Render" list. Renders still need separate approval before dispatch.
Inline Stage-1 cleanup removes pipeline scratch the moment its last consumer is done. An archive_package worker moves published media to ARCHIVE_ROOT after N days. Postgres stays the permanent audit trail.
HMAC-signed, fail-closed. Correlate by external id or metadata. Duplicate deliveries swallowed by a unique (source, source_event_id) index.
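A minimal sketch of fail-closed HMAC verification plus duplicate suppression, under stated assumptions: SHA-256 with a hex-encoded signature is a guess at the scheme, the function names are hypothetical, and the set stands in for the unique (source, source_event_id) index.

```python
import hashlib
import hmac
from typing import Optional, Set, Tuple


def verify_signature(secret: Optional[str], body: bytes, signature_hex: Optional[str]) -> bool:
    """Return True only when the signature matches.

    Fail-closed: a missing secret or a missing signature is a rejection.
    """
    if not secret or not signature_hex:
        return False
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes, so odd input cannot raise TypeError.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def accept_delivery(seen: Set[Tuple[str, str]], source: str, source_event_id: str) -> bool:
    """Return False for a repeat delivery of the same (source, event id)."""
    key = (source, source_event_id)
    if key in seen:
        return False
    seen.add(key)
    return True
```

Failing closed means an unconfigured secret rejects every webhook instead of silently accepting unsigned ones.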
Your media never leaves your Mac unless you publish. Dashboard pages use local sessions and brand-scoped roles; worker/API paths keep bearer-token auth. MEDIA_ROOT traversal guards, optional signed /media/* URLs, and package/source brand consistency remain enforced.
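A traversal guard of the kind mentioned above can be sketched as follows. The function name is hypothetical and this is not necessarily how ChannelHelm implements it; it shows the usual resolve-then-check pattern (requires Python 3.9+ for `Path.is_relative_to`).

```python
from pathlib import Path


def resolve_media_path(media_root: str, requested: str) -> Path:
    """Resolve a requested path and refuse anything outside media_root."""
    root = Path(media_root).resolve()
    # Resolving collapses ".." segments and symlinks before the check.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"path escapes media root: {requested!r}")
    return candidate
```

Checking the resolved path, not the raw string, is what catches inputs such as `../etc/passwd` or an absolute path.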
Static pages in public/ — landing (index.html), Privacy, Terms, Contact, Impressum — styled in the same identity, ready for any web host that serves index.html.
Each step shows the worker that runs it, the file/DB state it produces, and what the operator sees.
For a YouTube link, the brand is auto-detected from the channel (yt-dlp metadata → match by youtube_channel_id → by website domain → else auto-create). For a file, the browser streams the body to /api/uploads behind a same-origin guard with a hard size cap. Either way a sources row, a packages row, and an ingest job are created, and you land on /packages/[id].
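The brand fallback chain (channel id, then website domain, then auto-create) can be sketched like this. The `Brand` shape and the helper names are hypothetical; only the order of the three steps comes from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple
from urllib.parse import urlparse


@dataclass
class Brand:
    name: str
    youtube_channel_id: Optional[str] = None
    website: Optional[str] = None


def _domain(url: Optional[str]) -> Optional[str]:
    """Return the lowercase host without a leading "www.", or None."""
    if not url:
        return None
    host = urlparse(url if "://" in url else "https://" + url).hostname
    if not host:
        return None
    return host[4:] if host.startswith("www.") else host


def detect_brand(
    brands: List[Brand],
    channel_id: Optional[str],
    channel_name: str,
    channel_website: Optional[str],
) -> Tuple[Brand, str]:
    """Return the matched or created brand and which rule matched."""
    # 1. Exact match on the YouTube channel id.
    for brand in brands:
        if channel_id and brand.youtube_channel_id == channel_id:
            return brand, "channel_id"
    # 2. Match on website domain.
    wanted = _domain(channel_website)
    if wanted:
        for brand in brands:
            if _domain(brand.website) == wanted:
                return brand, "website"
    # 3. Nothing matched: auto-create a brand from the channel metadata.
    created = Brand(channel_name, channel_id, channel_website)
    brands.append(created)
    return created, "created"
```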
ingest downloads or probes the file, extracts audio.wav, detects scene cuts. transcribe_audio + analyze_visual run in parallel via uv run python ml/*. fuse stitches their outputs into a scene log. analyze_intelligence calls the LLM for topics/hooks/retention, then fans out one generate_asset per type (titles/description/tags/chapters/linkedin/x/article…) and a thumbnail_concepts job that generates AI thumbnail images (when an image provider is configured) — or falls back to frame extraction at hook timestamps.
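The ordering described above is a small dependency graph. The sketch below encodes it and computes which jobs are runnable at a given point; the job names come from the text, while the dictionary and function are illustrative and not the actual scheduler.

```python
from typing import Dict, List, Set

# Each job lists the jobs that must finish before it can start.
PIPELINE: Dict[str, List[str]] = {
    "ingest": [],
    "transcribe_audio": ["ingest"],
    "analyze_visual": ["ingest"],
    "fuse": ["transcribe_audio", "analyze_visual"],
    "analyze_intelligence": ["fuse"],
}


def ready_jobs(done: Set[str]) -> List[str]:
    """Return jobs not yet done whose dependencies are all done."""
    return [
        job
        for job, deps in PIPELINE.items()
        if job not in done and all(dep in done for dep in deps)
    ]
```

Once ingest completes, both transcribe_audio and analyze_visual become ready together, which is why they can run in parallel; fuse waits for both.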
The Console layout shows the platform rail, the video player, a live 4-layer pipeline indicator, and an asset stack: Titles (scored 0–100), Description (with chapters + hashtags), Tags (scored pills), Transcript, Thumbnails. Each card has Copy / Regenerate; empty sections offer Generate (works straight from the transcript). Right pane: approval.
Approving an asset enqueues dispatch:{assetId}. Approving the whole package approves every non-plan asset and enqueues a dispatch each; *_plan assets enqueue one clip_render per clip (plans never dispatch). Dispatch routes by type: DojoClaw for articles, Zernio for social/clips (per-account platforms + ChannelHelm metadata + signed media URLs + 24-h rate limit), and direct upload for YouTube (per-brand OAuth, default private). Daily limit hit → RequeueLater to next UTC midnight.
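The "next UTC midnight" deferral is simple date arithmetic. A minimal sketch, with a hypothetical function name, of the timestamp a RequeueLater would target:

```python
from datetime import datetime, timedelta, timezone


def next_utc_midnight(now: datetime) -> datetime:
    """Return the first UTC midnight strictly after `now`.

    `now` should be timezone-aware; a naive datetime would be
    interpreted as local time by astimezone().
    """
    now_utc = now.astimezone(timezone.utc)
    tomorrow = now_utc.date() + timedelta(days=1)
    return datetime(tomorrow.year, tomorrow.month, tomorrow.day, tzinfo=timezone.utc)
```

Converting to UTC before taking the date matters: a job that hits the limit at 01:00 in a UTC+2 zone is still on the previous UTC day, so it becomes runnable only an hour later.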
Signed callbacks land at /api/webhooks/{zernio,dojoclaw} (HMAC-verified, fail-closed). The processor correlates by external id or metadata, flips the asset to published / failed, records WordPress URL / post URL, and writes analytics into signals. The package status follows.
Each layer produces a concrete artifact that feeds the next. The same indicator runs across the dashboard — compact rows on the package list, an expanded card in the Studio.
The thumbnail_concepts worker now generates AI thumbnail images. The LLM turns the package analysis into N distinct visual concepts + ≤4-word headlines, your image provider renders each, and ffmpeg produces two variants per concept for you to pick from. With no image provider configured it falls back to the original frame-extraction at hook timestamps — so it still works zero-config. Audio-only profiles skip thumbnails entirely.
After review and approval, the dispatch worker routes each asset by type. Editorial goes to your local DojoClaw; social posts and rendered clips go to Zernio's cloud; YouTube uploads go direct via per-brand OAuth. *_plan assets never dispatch — they feed clip_render.
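The routing rule can be written as a small lookup. The destination labels and the exact asset-type strings are assumptions: `linkedin`, `x`, `article`, and `rendered_short_clip` appear on this page, while `youtube_video` is a hypothetical name for whatever type triggers the direct upload.

```python
from typing import Optional

EDITORIAL = {"article"}                           # goes to DojoClaw
SOCIAL = {"linkedin", "x", "rendered_short_clip"}  # goes to Zernio
YOUTUBE = {"youtube_video"}                        # hypothetical type name


def route_for(asset_type: str) -> Optional[str]:
    """Return the dispatch destination, or None for plans that never dispatch."""
    if asset_type.endswith("_plan"):
        return None  # plans feed clip_render instead of dispatching
    if asset_type in EDITORIAL:
        return "dojoclaw"
    if asset_type in SOCIAL:
        return "zernio"
    if asset_type in YOUTUBE:
        return "youtube"
    # Refuse to guess a destination for an unknown type.
    raise ValueError(f"no dispatch route for asset type {asset_type!r}")
```

Raising on an unknown type, instead of defaulting to one destination, keeps a new asset type from being published somewhere unintended.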
short_clip_plan never leaves the building: it renders into a rendered_short_clip, which does. Green = webhooks closing the loop to published.
Each clip in a short_clip_plan opens a full editor: scrub the timeline, trim to whole words, restyle subtitles live, write a description, and publish, all before a single frame is re-rendered. The plan is the source of truth; the render is a disposable build output.
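"Trim to whole words" can be illustrated with word timings from the transcript. This sketch snaps each trim point to the nearest word boundary; the nearest-boundary rule, the fallback, and the function name are assumptions, since the page does not specify how the editor snaps.

```python
from typing import List, Tuple

Word = Tuple[float, float]  # (start, end) in seconds


def snap_trim(words: List[Word], start: float, end: float) -> Tuple[float, float]:
    """Snap a trim range so it begins at a word start and stops at a word end."""
    if not words:
        raise ValueError("no word timings to snap to")
    # Nearest word start to the requested in-point.
    snapped_start = min((w[0] for w in words), key=lambda t: abs(t - start))
    # Nearest word end to the requested out-point.
    snapped_end = min((w[1] for w in words), key=lambda t: abs(t - end))
    if snapped_end <= snapped_start:
        # Degenerate range: keep at least the word that begins at the in-point.
        snapped_end = next(w[1] for w in words if w[0] == snapped_start)
    return snapped_start, snapped_end
```

With words at 0.0–0.4, 0.5–0.9, 1.0–1.6, and 1.7–2.2 seconds, a requested trim of 0.45 to 1.5 snaps to 0.5 and 1.6, so no word is cut in half.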
clip_render rebuilds the MP4 (bumping render_rev, reusing the asset id so publish history stays bound). The live overlay means most styling iterations need zero renders.
A package moves through these in a strict order (workers advance it; the dashboard renders these pills everywhere). Two terminals after dispatch: full success or partial.
ready_for_review.
Each of these is its own diagrammed deep-dive into one part of the system.
The full operator manual — every screen, asset type, and workflow.
⚙ Install on your Mac: Postgres, workers, Python ML CLIs, launchd.
✂ Word-snap trim, 6 subtitle animations, live preview, per-clip publish.
🔌 Per-purpose provider routing, presets, at-rest key encryption.
▶ Direct upload, per-brand OAuth, privacy defaults, metadata mapping.
🆎 Self-run title/thumbnail rotation, YouTube Analytics winner, feedback loop.
⌘ The SKIP LOCKED queue and N concurrent claim slots.
▦ Scene-cut VLM sampling: ~10–14× faster visual phase.
🗄 What's temporary, archivable, and permanent, plus cleanup options.
✅ How a package is judged ready to ship to YouTube.
The whole thing runs on your Mac. Drop a video and watch the cards fill in as the pipeline completes.
▶ Open the dashboard