How it works

Drop a video. Get a publishing kit.

ChannelHelm understands a video through a four-layer pipeline (audio → visual → fusion → intelligence) and drafts every asset for every platform, locally, on your own Mac. This page is the picture version of what's built and of what happens behind that "drop or paste a link" button.

System map

The pieces, and how they talk.

A Next.js dashboard, a custom job queue, a fleet of small workers, four Python ML CLIs, and a couple of external services for publishing. Everything except the external services runs locally.

[Figure: ChannelHelm system map]
Operator: you, in a browser with a local session.
Dashboard (Next.js 15): Plan · Studio · Channel; server components + actions; /api/uploads · /api/media; /api/webhooks/{zernio,dojoclaw}; users · roles · live events.
Postgres 16 (local, Drizzle ORM): brands · sources · packages · assets · jobs · dispatches · ideas · scripts · channels · users · signals · restream_targets.
Job queue (SKIP LOCKED): enqueue + claim · idempotency keys · stale-lock reclaim · RequeueLater (rate limits).
Workers (one Node proc per kind, tsx workers/runner.ts): ingest · transcribe_audio · analyze_visual · fuse · analyze_intelligence · generate_asset · thumbnail_concepts · clip_render · dispatch · collect_signal · channel_compile · trend_digest · restream_push. Provenance on every artifact: {provider, model, host, prompt_version, …}.
ML CLIs (Python, uv, ml/*.py): transcribe.py (MLX Whisper) · describe_frames.py (mlx-vlm) · ocr.py (Apple Vision) · diarize.py (pyannote + align).
External: LLM provider (OpenAI · Anthropic · LM Studio · Codex); Zernio (LATE), socials · 15 nets, posts.create §9.3; DojoClaw, articles (local LAN), from-brief §8.2.
Local storage: MEDIA_ROOT/{brand}/{src}/ holding original.mp4, audio.wav, clips/clip_NNN.mp4.
Webhooks: HMAC, fail-closed.
Browser → Next.js → queue → workers. Workers call out to your LLM provider, Zernio, DojoClaw, and the local ML CLIs. Webhooks come back through /api/webhooks (signed).

What's developed

The product, in a dozen bricks.

Every piece below is on main: typecheck/lint/build clean, with a real Postgres test suite (Vitest + Testcontainers).

Lifecycle dashboard

Plan, Produce, Publish, and Channel surfaces now wrap the original Studio. Plan covers ideas/scripts/mining; Produce keeps ingest and package work; Publish covers review/dispatch/calendar; Channel covers linear/FAST programming, HLS, analytics, and restream targets.

Ingest

Drop a file or paste a URL. YouTube links auto-detect the brand from the channel (yt-dlp + website fallback). Uploads stream to MEDIA_ROOT under a CSRF guard with a hard size cap.

Pipeline

Four layers: audio (transcribe + diarize), visual (frames + OCR + VLM), fusion (scene log), intelligence (LLM analysis). The visual phase samples VLM keyframes at scene cuts (instead of a fixed fps=1), downscales to 768 px, runs OCR and the VLM in parallel, and uses a profile-aware OCR rate (0.5 fps standard / 1 fps premium), which makes it ~12–14× faster than the original. Workers claim jobs with SKIP LOCKED; stale locks self-reclaim.
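
To make the sampling idea concrete, here is a minimal TypeScript sketch of scene-cut-driven keyframe selection next to a profile-aware OCR schedule. The function names, the `Profile` type, and the choice of the scene midpoint are assumptions for illustration; this page only states that VLM keyframes follow scene cuts and that OCR runs at 0.5 fps (standard) or 1 fps (premium).

```typescript
type Profile = "standard" | "premium";

// OCR sampling rate per profile in frames per second, as stated on this page.
const OCR_FPS: Record<Profile, number> = { standard: 0.5, premium: 1 };

/** One VLM keyframe per scene, taken at the scene midpoint (an assumption). */
export function keyframesFromCuts(cuts: number[], durationSec: number): number[] {
  // Ignore cuts outside the clip and make sure they are in order.
  const inRange = cuts.filter((c) => c > 0 && c < durationSec).sort((a, b) => a - b);
  const bounds = [0, ...inRange, durationSec];
  const frames: number[] = [];
  for (let i = 0; i < bounds.length - 1; i++) {
    frames.push((bounds[i] + bounds[i + 1]) / 2);
  }
  return frames;
}

/** Evenly spaced OCR timestamps at the profile's rate. */
export function ocrTimestamps(durationSec: number, profile: Profile): number[] {
  const step = 1 / OCR_FPS[profile]; // 2 s for standard, 1 s for premium
  const out: number[] = [];
  for (let i = 0; i * step < durationSec; i++) out.push(i * step);
  return out;
}
```

For a 40-second clip with cuts at 10 s and 25 s, this yields three VLM keyframes where a fixed fps=1 would produce 40.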

Job queue + worker concurrency

A custom ~150-line layer on Postgres. Idempotency keys de-duplicate enqueues; per-kind lock timeouts requeue crashed jobs; RequeueLater defers rate-limited dispatches to the next UTC day instead of failing them. Each worker runs N concurrent claim slots (default 3, tunable via WORKER_CONCURRENCY), with SKIP LOCKED as the only mutex, so the 10 generate_asset jobs per package finish ~3× faster.
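
As a rough illustration of the pattern, the sketch below shows what an idempotent enqueue and a SKIP LOCKED claim could look like in Postgres, plus how the slot count might be read. The table name `jobs`, every column name, and the helper `claimSlots` are hypothetical; the real schema is not shown on this page.

```typescript
// Hypothetical schema: jobs(id, kind, status, payload, idempotency_key UNIQUE, run_at, locked_at).

// Enqueue: a second insert with the same idempotency key is silently dropped.
export const ENQUEUE_SQL = `
  INSERT INTO jobs (kind, idempotency_key, payload)
  VALUES ($1, $2, $3)
  ON CONFLICT (idempotency_key) DO NOTHING
  RETURNING id`;

// Claim: take one runnable job of a kind. Rows locked by another claimer are
// skipped rather than waited on, and a lock older than the per-kind timeout
// ($2, in seconds) counts as stale and can be reclaimed.
export const CLAIM_SQL = `
  UPDATE jobs
     SET locked_at = now()
   WHERE id = (
     SELECT id
       FROM jobs
      WHERE kind = $1
        AND status = 'queued'
        AND run_at <= now()
        AND (locked_at IS NULL OR locked_at < now() - make_interval(secs => $2))
      ORDER BY run_at
      LIMIT 1
        FOR UPDATE SKIP LOCKED
   )
  RETURNING id, kind, payload`;

/** Concurrent claim slots: WORKER_CONCURRENCY if valid, else the documented default of 3. */
export function claimSlots(env: Record<string, string | undefined>): number {
  const n = Number.parseInt(env.WORKER_CONCURRENCY ?? "", 10);
  return Number.isInteger(n) && n > 0 ? n : 3;
}
```

Because each claim locks only the row it takes, N slots can poll the same table without any application-level mutex.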

🔌

Pluggable LLM providers

OpenAI · Anthropic · OpenRouter · Ollama · LM Studio · OpenClaw · Codex CLI. Quick Presets, Fetch Models, per-profile routing. API keys are encrypted at rest and never serialized to the client.

Dispatch

Editorial → DojoClaw §8.2 (/api/v1/articles/from-brief). Social → Zernio §9.3 (per-account platforms[], ChannelHelm metadata, X threads, signed clip URLs §9.4, 24-h rate limit §9.6). YouTube → direct upload via per-brand OAuth (defaults to private).

Shorts editor

Per-Short workspace: word-snap trim, captions, animated ASS subtitle styles, live overlay preview, dual-crop, active-speaker reframe, b-roll compositing, redaction, and per-clip publish options. The short_clip_plan is the editable truth; clip_render UPSERTs the build output and bumps render_rev.

Auto-render

When a clip plan lands, generate_asset fans out one clip_render per clip — so you open the Shorts tab to ready-to-preview MP4s, not an empty "click Render" list. Renders still need separate approval before dispatch.

🗄

Storage lifecycle

Inline Stage-1 cleanup removes pipeline scratch the moment its last consumer is done. An archive_package worker moves published media to ARCHIVE_ROOT after N days. Postgres stays the permanent audit trail. Full map →

Webhooks

HMAC-signed, fail-closed. Correlate by external id or metadata. Duplicate deliveries swallowed by a unique (source, source_event_id) index.
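
A minimal sketch of what "HMAC-signed, fail-closed" can mean in practice, using Node's built-in crypto module. The SHA-256 algorithm, the hex encoding, and the function names are assumptions; the actual signature scheme used by Zernio and DojoClaw is not described here.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

/** Hex HMAC-SHA256 of the raw request body (algorithm and encoding are assumptions). */
export function sign(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

/**
 * Fail-closed verification: a missing secret, a missing signature, or a
 * malformed signature all reject the request instead of letting it through.
 */
export function verifyWebhook(
  rawBody: string,
  signatureHex: string | undefined,
  secret: string | undefined,
): boolean {
  if (!secret || !signatureHex) return false;
  const expected = Buffer.from(sign(rawBody, secret), "hex");
  const given = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check the length first.
  if (given.length !== expected.length) return false;
  return timingSafeEqual(given, expected);
}
```

The signature has to be computed over the raw body bytes, before any JSON parsing, or a re-serialized payload may no longer match.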

🔒

Local-first + secure

Your media never leaves your Mac unless you publish. Dashboard pages use local sessions and brand-scoped roles; worker/API paths keep bearer-token auth. MEDIA_ROOT traversal guards, optional signed /media/* URLs, and package/source brand consistency remain enforced.
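
The traversal guard mentioned above can be sketched as a pure path check. This is an illustrative version built on Node's path module, not the project's actual guard; `resolveUnderRoot` is a hypothetical name.

```typescript
import * as path from "node:path";

/**
 * Resolve a requested path against the media root and return it only if it
 * stays strictly inside the root; otherwise return null.
 */
export function resolveUnderRoot(mediaRoot: string, requested: string): string | null {
  const root = path.resolve(mediaRoot);
  const target = path.resolve(root, requested);
  const rel = path.relative(root, target);
  // A relative path that climbs out ("..", "../x") or is absolute has escaped the root.
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return rel !== "" && !escapes ? target : null;
}
```

A check like this covers `..` segments and absolute paths; it does not follow symlinks, so a real guard might also need to compare real paths on disk.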

🌐

Marketing + legal

Static pages in public/ — landing (index.html), Privacy, Terms, Contact, Impressum — styled in the same identity, ready for any web host that serves index.html.

When you load a video

From drop to dispatched, in five moves.

Each step shows the worker that runs it, the file/DB state it produces, and what the operator sees.

[Figure: video flow swimlanes]
Browser: 1 · Drop / paste (POST /api/uploads or URL); 3 · Review in Studio (/packages/[id]).
Workers + ML: ingest (audio.wav, scene_cuts) → transcribe (MLX Whisper, transcript.json) and analyze_visual (mlx-vlm + OCR, frame_index) → fuse (scene_log, windows) → analyze_intelligence (LLM analysis: topics · hooks) → generate_asset ×N (per type, ready_for_review) → dispatch (on approval → external).
External: LLM provider (analyze + generate); DojoClaw / Zernio (/articles · /posts); webhooks → published.
Solid arrows = job hand-offs in the queue. Dashed arrows = external calls (LLM/Zernio/DojoClaw + webhooks back).

You drop a file or paste a URL

For a YouTube link, the brand is auto-detected from the channel (yt-dlp metadata → match by youtube_channel_id → by website domain → else auto-create). For a file, the browser streams the body to /api/uploads behind a same-origin guard with a hard size cap. Either way a sources row, a packages row, and an ingest job are created, and you land on /packages/[id].

created: sources, packages · enqueued: ingest:{sourceId}
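
The brand-detection fallback chain for YouTube links (channel id, then website domain, then auto-create) can be sketched as follows. The `Brand` and `VideoMeta` shapes, the generated id, and the `www.` normalization are assumptions for illustration only.

```typescript
// Hypothetical shapes; the real brands table and yt-dlp metadata mapping are not shown here.
interface Brand { id: string; name: string; youtubeChannelId?: string; website?: string }
interface VideoMeta { channelId: string; channelName: string; channelWebsite?: string }

/** Bare hostname without a leading "www.", or null if the URL is missing or invalid. */
function domainOf(url: string | undefined): string | null {
  if (!url) return null;
  try {
    return new URL(url).hostname.replace(/^www\./, "");
  } catch {
    return null;
  }
}

/** Match by channel id, then by website domain, else describe a brand to create. */
export function detectBrand(meta: VideoMeta, brands: Brand[]): { brand: Brand; created: boolean } {
  const byChannel = brands.find((b) => b.youtubeChannelId === meta.channelId);
  if (byChannel) return { brand: byChannel, created: false };

  const domain = domainOf(meta.channelWebsite);
  const byDomain = domain ? brands.find((b) => domainOf(b.website) === domain) : undefined;
  if (byDomain) return { brand: byDomain, created: false };

  // No match: auto-create a brand from the channel metadata (id scheme is made up).
  const brand: Brand = {
    id: `brand_${brands.length + 1}`,
    name: meta.channelName,
    youtubeChannelId: meta.channelId,
    website: meta.channelWebsite,
  };
  return { brand, created: true };
}
```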

The pipeline runs in the background

ingest downloads or probes the file, extracts audio.wav, detects scene cuts. transcribe_audio + analyze_visual run in parallel via uv run python ml/*. fuse stitches their outputs into a scene log. analyze_intelligence calls the LLM for topics/hooks/retention, then fans out one generate_asset per type (titles/description/tags/chapters/linkedin/x/article…) and a thumbnail_concepts job that generates AI thumbnail images (when an image provider is configured) — or falls back to frame extraction at hook timestamps.

package: draft → ingested → fused → analyzed → ready_for_review

You review in the Studio

The Console layout shows the platform rail, the video player, a live 4-layer pipeline indicator, and an asset stack: Titles (scored 0–100), Description (with chapters + hashtags), Tags (scored pills), Transcript, Thumbnails. Each card has Copy / Regenerate; empty sections offer Generate (works straight from the transcript). Right pane: approval.

actions: selectTitle · saveAssetPayload · regenerateAsset · generateSection · generateThumbnails

You approve → dispatch fires

Approving an asset enqueues dispatch:{assetId}. Approving the whole package approves every non-plan asset and enqueues a dispatch each; *_plan assets enqueue one clip_render per clip (plans never dispatch). Dispatch routes by type: DojoClaw for articles, Zernio for social/clips (per-account platforms + ChannelHelm metadata + signed media URLs + 24-h rate limit), and direct upload for YouTube (per-brand OAuth, default private). Daily limit hit → RequeueLater to next UTC midnight.

package: ready_for_review → approved → dispatching → dispatched / partially_dispatched
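
The "next UTC midnight" deferral is simple date arithmetic. A minimal sketch, assuming a hypothetical helper name (`nextUtcMidnight`); how RequeueLater stores or applies the timestamp is not described on this page.

```typescript
/** The first instant of the next UTC day after `now`. */
export function nextUtcMidnight(now: Date): Date {
  // Date.UTC normalizes day overflow, so month and year boundaries roll over correctly.
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1));
}
```

A dispatch that hits the daily limit at 2024-12-31T23:59:59Z would be deferred to 2025-01-01T00:00:00Z, regardless of the Mac's local time zone.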

Webhooks close the loop

Signed callbacks land at /api/webhooks/{zernio,dojoclaw} (HMAC-verified, fail-closed). The processor correlates by external id or metadata, flips the asset to published / failed, records WordPress URL / post URL, and writes analytics into signals. The package status follows.

asset: dispatched → published / failed · signals: impressions, engagement, ctr

The four-layer pipeline

What "understanding the video" actually means.

Each layer produces a concrete artifact that feeds the next. The same indicator runs across the dashboard — compact rows on the package list, an expanded card in the Studio.

[Figure: four-layer pipeline]
Audio: MLX Whisper large-v3 → transcript.json (word-level timing).
Visual: mlx-vlm + Apple Vision → frame_index (descriptions + OCR).
Fusion: pure TS (composeSceneLog) → scene_log (timestamped windows).
Intelligence: LLM (your provider) → analysis (topics · hooks · retention).
Audio + Visual run in parallel; Fusion merges them; Intelligence is the brief everything else is drafted from.
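
To show what the fusion step does conceptually, here is a small sketch that merges word-level transcript timing with frame descriptions into timestamped windows. It is not the project's composeSceneLog: the input shapes, the fixed window length, and the name `composeSceneLogSketch` are all assumptions.

```typescript
// Hypothetical input/output shapes for illustration.
interface Word { start: number; end: number; text: string }
interface FrameNote { t: number; description: string }
interface SceneWindow { start: number; end: number; speech: string; visuals: string[] }

/** Bucket words (by start time) and frame notes (by timestamp) into fixed-length windows. */
export function composeSceneLogSketch(words: Word[], frames: FrameNote[], windowSec = 10): SceneWindow[] {
  let duration = 0;
  for (const w of words) duration = Math.max(duration, w.end);
  for (const f of frames) duration = Math.max(duration, f.t);

  const count = Math.floor(duration / windowSec) + 1;
  const log: SceneWindow[] = [];
  for (let i = 0; i < count; i++) {
    const start = i * windowSec;
    const end = start + windowSec;
    const speech = words
      .filter((w) => w.start >= start && w.start < end)
      .map((w) => w.text)
      .join(" ");
    const visuals = frames.filter((f) => f.t >= start && f.t < end).map((f) => f.description);
    // Skip windows in which nothing was said or seen.
    if (speech !== "" || visuals.length > 0) log.push({ start, end, speech, visuals });
  }
  return log;
}
```

Each window pairs what was said with what was on screen at that time, which is the kind of aligned context the intelligence layer can then analyze.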

Thumbnails

Generated, not just grabbed.

The thumbnail_concepts worker now generates AI thumbnail images. The LLM turns the package analysis into N distinct visual concepts + ≤4-word headlines, your image provider renders each, and ffmpeg produces two variants per concept for you to pick from. With no image provider configured it falls back to the original frame-extraction at hook timestamps — so it still works zero-config. Audio-only profiles skip thumbnails entirely.

[Figure: thumbnail generation flow]
From package analysis (topics · hooks) → LLM: N visual concepts, distinct scenes + ≤4-word headlines (2 standard · 3 premium) → image provider (Runware / Flux-Z): renders each concept, downloads to disk (MEDIA_ROOT/thumbs) → ffmpeg, per concept: two variants, plain image and headline overlay (drawtext) → Studio: operator picks one variant.
No image provider? ffmpeg frame-extract at hook timestamps (zero-config fallback). Audio-only profiles skip thumbnails.
Primary path is AI image generation (Runware / Flux-Z-Image) → plain + headline variants per concept. Dashed = the frame-extraction fallback when no image provider is set.

Where every asset goes

One package, three destinations.

After review and approval, the dispatch worker routes each asset by type. Editorial goes to your local DojoClaw; social posts and rendered clips go to Zernio's cloud; YouTube uploads go direct via per-brand OAuth. *_plan assets never dispatch — they feed clip_render.

[Figure: asset and dispatch routing]
The package (after generate_asset): YouTube metadata (title · description · tags · chapters); Thumbnails (AI image · plain + headline variants); Social posts (linkedin · x · instagram · …); Editorial (article_brief · blog_draft); short_clip_plan ✎ (editable truth, NEVER dispatched → clip_render → rendered_short_clip); rendered_short_clip ▶ (build output · signed media URL).
The router: dispatch routes by type on approval, rate-limit aware.
Destinations: YouTube (direct): per-brand OAuth · default private · video + metadata + thumbnail. Zernio (cloud): 15 social destinations · §9.3 posts.create · X threads · signed clip URLs · 24-h rate limit. DojoClaw (local LAN): articles/from-brief · §8.2 · m4max.local:8788.
Webhooks back → published / failed · signals.
Blue paths = dispatchable now (rendered clips, social). Grey = editorial / metadata. The short_clip_plan never leaves the building — it renders into a rendered_short_clip, which does. Green = webhooks closing the loop to published.
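
The routing rule reads naturally as a small lookup. The sketch below uses asset type names that appear on this page (article_brief, blog_draft, linkedin, x, instagram, short_clip_plan, rendered_short_clip); the `youtube_upload` type, the `Destination` union, and the function name are hypothetical.

```typescript
type Destination = "youtube" | "zernio" | "dojoclaw" | "clip_render";

const EDITORIAL = new Set(["article_brief", "blog_draft"]);
const SOCIAL = new Set(["linkedin", "x", "instagram", "rendered_short_clip"]);
const YOUTUBE = new Set(["youtube_upload"]); // hypothetical type name

/** Where an approved asset goes next, or null for a type this sketch does not know. */
export function routeAsset(assetType: string): Destination | null {
  // Plans never dispatch: they only feed clip_render.
  if (assetType.endsWith("_plan")) return "clip_render";
  if (EDITORIAL.has(assetType)) return "dojoclaw";
  if (YOUTUBE.has(assetType)) return "youtube";
  if (SOCIAL.has(assetType)) return "zernio";
  return null;
}
```

Checking the `_plan` suffix first keeps the "plans never dispatch" rule from being bypassed by a later, more permissive branch.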

The Shorts editor

Cut a vertical clip without leaving the app.

Each clip in a short_clip_plan opens a full editor: scrub the timeline, trim to whole words, restyle subtitles live, write a description, and publish — all before a single frame is re-rendered. The plan is the source of truth; the render is a disposable build output.

[Figure: Shorts editor flow]
Live preview (9:16): word ▮ highlight via SubtitleOverlay.tsx, no render needed.
Timeline (word-snap): snaps trim to whole words.
Editable plan fields: title · caption · subtitle style (×6 animations) · description (auto-written ✨) · tags · description links · publish options → Zernio.
Source of truth: short_clip_plan ✎ (debounced auto-save; clips[i] = editable) → clip_render (ffmpeg + ASS subs; UPSERT · render_rev++) → rendered_short_clip ▶ (build output, dispatchable).
Operator edits the plan; clip_render rebuilds the MP4 (bumping render_rev, reusing the asset id so publish history stays bound). The live overlay means most styling iterations need zero renders. Shorts editor guide →
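
Trimming to whole words, as the timeline does, comes down to snapping the in and out points to word boundaries from the transcript. A minimal sketch, assuming word-level timing in seconds; the `Word` shape, the nearest-boundary rule, and the name `snapTrim` are illustrative, not the editor's actual logic.

```typescript
interface Word { start: number; end: number; text: string }

/**
 * Snap a requested [inSec, outSec] range to whole words: the in point moves to
 * the nearest word start, the out point to the nearest word end after it.
 */
export function snapTrim(words: Word[], inSec: number, outSec: number): { start: number; end: number } | null {
  if (words.length === 0) return null;

  let first = words[0];
  for (const w of words) {
    if (Math.abs(w.start - inSec) < Math.abs(first.start - inSec)) first = w;
  }

  // Only consider words ending after the snapped start, so the clip never inverts.
  let last = first;
  for (const w of words) {
    if (w.end > first.start && Math.abs(w.end - outSec) < Math.abs(last.end - outSec)) last = w;
  }
  return { start: first.start, end: last.end };
}
```

Because the snapped range always covers at least one full word, captions and the cut line up without clipping a word in half.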

Package lifecycle

Twelve well-defined states.

A package moves through these in a strict order (workers advance it; the dashboard renders these pills everywhere). Two terminals after dispatch: full success or partial.

draft · ingested · transcribing · analyzing_visual · fused · analyzed · ready_for_review · approved · dispatching · dispatched · partially_dispatched · failed
Status enums match contract §10 exactly. Assets use §2.2 plus the documented internal marker ready_for_review.
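
One way to picture the ordering is a forward-only transition check over the twelve states. The state names come from this page; the specific rules (any forward move is allowed, `failed` is reachable from every non-terminal state, terminals never advance) are assumptions, since contract §10 is not reproduced here.

```typescript
const ORDER = [
  "draft", "ingested", "transcribing", "analyzing_visual", "fused",
  "analyzed", "ready_for_review", "approved", "dispatching",
] as const;
const TERMINAL = ["dispatched", "partially_dispatched", "failed"] as const;

type Status = (typeof ORDER)[number] | (typeof TERMINAL)[number];

/** Whether a package may move from one status to another under the assumed rules. */
export function canAdvance(from: Status, to: Status): boolean {
  const steps: readonly string[] = ORDER;
  const i = steps.indexOf(from);
  if (i === -1) return false; // terminal states never advance
  if (to === "failed") return true; // assumption: any in-flight state can fail
  if (to === "dispatched" || to === "partially_dispatched") return from === "dispatching";
  return steps.indexOf(to) > i; // forward-only among in-flight states
}
```

Forward moves may skip intermediate states here, mirroring the abbreviated status lines earlier on this page (for example ingested → fused).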

Go deeper

The rest of the documentation.

Each of these is its own diagrammed deep-dive into one part of the system.

📖

Handbook

The full operator manual — every screen, asset type, and workflow.

Setup guide

Install on your Mac: Postgres, workers, Python ML CLIs, launchd.

Shorts editor

Word-snap trim, 6 subtitle animations, live preview, per-clip publish.

🔌

LLM routing

Per-purpose provider routing, presets, at-rest key encryption.

YouTube publishing

Direct upload, per-brand OAuth, privacy defaults, metadata mapping.

🆎

A/B testing

Self-run title/thumbnail rotation, YouTube Analytics winner, feedback loop.

Worker concurrency

The SKIP LOCKED queue and N concurrent claim slots.

Visual optimization

Scene-cut VLM sampling — ~10–14× faster visual phase.

🗄

Storage lifecycle

What's temporary, archivable, and permanent — plus cleanup options.

YouTube readiness

How a package is judged ready to ship to YouTube.

Try it.

The whole thing runs on your Mac. Drop a video and watch the cards fill in as the pipeline completes.

▶ Open the dashboard