The complete operator manual: install, run, configure, ship. Eighteen chapters that read top-to-bottom or jump-to-the-bit-you-need.
ChannelHelm is your video-to-publishing command center. You drop a video in one end; you get a complete publishing kit out the other — title, description, chapters, tags, social posts, article brief, thumbnails. Then you approve what you want to ship and it goes to YouTube, LinkedIn, X, and the other platforms you've connected.
Think of ChannelHelm as the place where one video becomes everything it needs to become. Before ChannelHelm, that workflow lived across ten browser tabs, three apps, and an hour of copy-paste. After ChannelHelm, it's one page, a few clicks, and the same content goes out everywhere you publish — drafted in your voice, scored for hook strength, ready to review.
On your Mac. ChannelHelm is local-first: the dashboard runs at http://localhost:3000, the database is a Postgres on your machine, your video files never leave your hard drive unless you publish them. The only external services ChannelHelm talks to are the ones you connect (an LLM provider, optionally Zernio for social, optionally DojoClaw for editorial, your YouTube channel via OAuth).
Each chapter assumes you've read the previous one, but you can jump in anywhere. The Glossary at the end defines every ChannelHelm-specific term in one place.
The 5-minute path from "nothing installed" to "I just shipped a video." If you read one chapter, read this one. Each step links to the deeper coverage later in the handbook.
/brands/new. A brand maps 1-to-1 to a YouTube channel and is the unit of multi-channel publishing. See Chapter 5 · Brands.
/providers. Pick a preset (Anthropic Claude is the easiest first choice), paste your API key, save. See Chapter 10 · LLM providers.
ChannelHelm needs a Mac with Apple Silicon (M1 or newer), ~16 GB of RAM, and Homebrew. The whole install takes about 10 minutes including downloads.
MEDIA_ROOT until you delete it.

    # 1. Install Homebrew if you haven't already
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

    # 2. Install the runtime stack
    brew install node@22 pnpm postgresql@16 ffmpeg yt-dlp uv

    # 3. Start Postgres + create the database
    brew services start postgresql@16
    createdb channelhelm

    # 4. Install Python deps for the ML CLIs
    cd ml && uv sync && cd ..

    # 4b. Headline thumbnails (drawtext) + burned-in captions (ass) need libfreetype
    #     + libass. If `ffmpeg -filters | grep drawtext` is empty, use the full build:
    brew install ffmpeg-full && brew unlink ffmpeg && brew link --force --overwrite ffmpeg-full

    # 5. Install the app + apply migrations
    pnpm install
    cp .env.example .env   # open .env, set DATABASE_URL and MEDIA_ROOT (absolute paths)
    pnpm db:migrate

    # 6. Run web + workers together
    pnpm dev:all
Open http://localhost:3000. You should see the empty dashboard.
You should see two banners in your terminal output:
▶ web : http://localhost:3000
▶ worker: ingest,transcribe_audio,… (concurrency=3)
And shortly after: [settings] subscribed to chs_settings and [runner] started lockedBy=…. If you see those, both the web server and the worker fleet are alive.
| Tool | Why ChannelHelm needs it |
|---|---|
| node@22 + pnpm | Runtime + package manager for the Next.js app and the worker fleet. |
| postgresql@16 | Single Postgres instance holds everything: brands, packages, assets, jobs, settings, providers, OAuth tokens. No Redis, no separate queue server. |
| ffmpeg | Audio extraction, scene-cut detection, frame sampling, clip rendering. Headline-overlay thumbnails (drawtext) and burned-in Shorts captions (ass) need a build with libfreetype + libass; install ffmpeg-full if your ffmpeg lacks them. |
| yt-dlp | Pulls YouTube videos when you paste a URL into the dashboard. |
| uv | Python venv/runtime manager for the four ML CLIs in ml/ (Whisper, mlx-vlm, OCR, diarization). |
If the /settings page later shows "Settings table not migrated yet", you missed pnpm db:migrate. Run it and reload.
When you open ChannelHelm, you land on the dashboard at /. This chapter is a guided tour of every element you see.
Each row is a video you've dropped into ChannelHelm. Click a row to open the Studio for that video.
Each row shows:
| Pill | What it means |
|---|---|
| Analyzing visual | Pipeline is running. Other "in-progress" states: Ingested · Transcribing · Fused · Analyzed. |
| Ready | Pipeline complete; all asset cards filled; awaiting your review & approval. |
| Dispatched | You approved & the dispatch worker shipped everything to its target. |
| Partial | Some assets dispatched, some failed. Usually means missing config (e.g. no Zernio account for a social network). |
A brand is your publishing identity for one YouTube channel. Everything else in ChannelHelm (videos, generated assets, dispatched posts) is brand-scoped. Brand scoping sits at the root of the data model: nothing reads across brands outside admin views.
If you run multiple YouTube channels (your main channel, a side project, a client's channel), each gets its own brand. The brand carries:
Top nav → + New → Brand (or open /brands/new).
One brand = one publishing identity.
After creation you land on /brands/[id]. The page has three sections:
This is a per-platform map: { x: "acc_…", linkedin: "acc_…", tiktok: "acc_…" }. You only need to fill in the platforms you plan to publish to via Zernio. If you don't use Zernio at all, leave it empty — the dispatch worker will skip those platforms and mark the social assets as "no account configured" (which is harmless, just shows in the panel).
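A minimal sketch of how a dispatch step could consult that map. The names `ZernioAccounts` and `resolveZernioAccount` are hypothetical, not ChannelHelm's actual code; only the map shape and the "no account configured" outcome come from the description above.

```typescript
// Hypothetical shape of a brand's per-platform Zernio account map.
type ZernioAccounts = Partial<Record<string, string>>;

type AccountLookup =
  | { kind: "ok"; accountId: string }
  | { kind: "skipped"; reason: "no account configured" };

// Returns the account id for a platform, or a harmless "skipped" marker
// when the operator left that platform out of the map.
function resolveZernioAccount(accounts: ZernioAccounts, platform: string): AccountLookup {
  const accountId = accounts[platform];
  if (accountId === undefined || accountId.trim() === "") {
    return { kind: "skipped", reason: "no account configured" };
  }
  return { kind: "ok", accountId };
}

// Placeholder ids for illustration only.
const brandAccounts: ZernioAccounts = { x: "acc_x_example", linkedin: "acc_li_example" };
console.log(resolveZernioAccount(brandAccounts, "x"));
console.log(resolveZernioAccount(brandAccounts, "tiktok"));
```

Returning a tagged result instead of throwing mirrors the handbook's point that a missing account is harmless: the caller can record the skip and move on.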
MEDIA_ROOT/your-brand-name/src_…/original.mp4. Renaming the slug after videos are ingested is supported (the "Normalize slug" button moves the folder and rewrites every stored path), but it's easier to pick a clean slug at creation time.
Two ways: drop a local file, or paste a YouTube URL. Both create a Source row and a Package row, then enqueue the Pipeline. Within seconds, the Studio page for the new package opens.
/sources/new).
.mp4, .mov, .webm, .m4v, or .mkv file onto the upload zone. Files stream to MEDIA_ROOT as they upload.
youtube.com/watch?v=…, youtu.be/…, even Shorts URLs.
MEDIA_ROOT; the pipeline starts.
Each package runs under one of four profiles. The brand has a default; you can override per package on upload.
| Profile | What runs | Best for |
|---|---|---|
| transcription_only | Audio transcription only. No visual phase, no diarization, no thumbnails; the cheapest tier. | Re-mining old material via Backlog Revival; bulk transcribing a back catalogue when you only need text. |
| fast_audio_only | Audio + intelligence + full asset kit. Skips visual entirely (but still drafts titles, descriptions, social posts, etc.). | Podcasts, webinars, audio interviews. Fast (~1 min for a 30-min episode). |
| standard_audio_visual | Full 4-layer pipeline at default model tiers. | Your YouTube uploads. The default for almost everyone. |
| premium_multimodal | Same layers, larger VLM (32B), denser OCR (1 fps). | High-stakes content where descriptions matter (course modules, marquee episodes). |
transcription_only is the bare-minimum re-mine; it exists mainly so Backlog Revival can cheaply re-run an old video through today's prompts. fast_audio_only is the everyday audio profile for content you're actively publishing.
The default cap on /api/uploads body size is 2 GB (MAX_UPLOAD_BYTES=2000000000 in your settings). Larger? Bump the setting on /settings.
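The upload rules in this chapter (five accepted extensions, a byte cap) can be sketched as a pre-flight check. `checkUpload` is a hypothetical helper written for illustration; the real /api/uploads handler may validate differently.

```typescript
// Extensions the upload zone accepts, per this chapter.
const ACCEPTED_EXTENSIONS = [".mp4", ".mov", ".webm", ".m4v", ".mkv"];

// Default cap quoted in this chapter: MAX_UPLOAD_BYTES=2000000000 (2 GB).
const DEFAULT_MAX_UPLOAD_BYTES = 2000000000;

// Hypothetical pre-flight check. Returns null when the upload is acceptable,
// otherwise a human-readable reason.
function checkUpload(
  fileName: string,
  sizeBytes: number,
  maxBytes: number = DEFAULT_MAX_UPLOAD_BYTES
): string | null {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  if (ACCEPTED_EXTENSIONS.indexOf(ext) === -1) {
    return `unsupported file type: ${ext || "(none)"}`;
  }
  if (sizeBytes > maxBytes) {
    return `file is ${sizeBytes} bytes; cap is ${maxBytes}`;
  }
  return null;
}

console.log(checkUpload("episode-12.MOV", 750000000)); // null
console.log(checkUpload("notes.txt", 1024));
```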
When you drop a video, four layers run in the background: audio, visual, fusion, intelligence. Each layer produces a concrete artifact the next consumes. Plain-English: ChannelHelm listens to the video, watches it, combines what it learned, then thinks about what to make of it.
Listens to the video. Extracts the audio track with ffmpeg, then runs MLX Whisper large-v3 (Apple Silicon optimised) to produce a word-level transcript with timestamps. If you've set HF_TOKEN and accepted the pyannote license, it also labels speakers.
Output: transcript.json — array of word entries with start/end times, plus a flat text field.
Watches the video. Samples frames (dense for OCR, sparse for the VLM), runs Apple Vision OCR on each, runs mlx-vlm (Qwen2.5-VL by default) over the sparse keyframes for image descriptions, merges into a single index.
Output: frame_index.json — per-second entries with {timestamp, description, on_screen_text}.
Performance: this used to be the slowest layer (~10–15 min for an 8-min video). After the optimisation pass, it's ~65 s. See Chapter 15.
Combines what it learned. Pure-TypeScript step. Stitches the transcript + frame index into a "scene log" — windows of time, each carrying the spoken text + visual descriptions + on-screen text for that interval.
Output: scene_log on packages.intelligence — array of windows, each describing what happens during that span.
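To make the fusion step concrete, here is a minimal sketch that stitches word entries and frame entries into time windows. It assumes fixed-width windows and simplified field names (`spoken_text`, `descriptions`); the real fuse step may cut windows differently, for example on scene boundaries, and its output schema isn't documented here.

```typescript
interface TranscriptWord { start: number; end: number; text: string }
interface FrameEntry { timestamp: number; description: string; on_screen_text: string }
interface SceneWindow {
  start: number;
  end: number;
  spoken_text: string;
  descriptions: string[];
  on_screen_text: string[];
}

// Assumption: fixed-width windows of `windowSeconds`. Words are assigned to
// the window containing their start time; frames by their timestamp.
function buildSceneLog(
  words: TranscriptWord[],
  frames: FrameEntry[],
  windowSeconds: number
): SceneWindow[] {
  if (windowSeconds <= 0) throw new Error("windowSeconds must be positive");
  const lastWordEnd = words.reduce((max, w) => Math.max(max, w.end), 0);
  const lastFrame = frames.reduce((max, f) => Math.max(max, f.timestamp), 0);
  const duration = Math.max(lastWordEnd, lastFrame);
  const count = Math.floor(duration / windowSeconds) + 1;

  const windows: SceneWindow[] = [];
  for (let i = 0; i < count; i++) {
    const start = i * windowSeconds;
    const end = start + windowSeconds;
    const spoken = words.filter((w) => w.start >= start && w.start < end).map((w) => w.text);
    const seen = frames.filter((f) => f.timestamp >= start && f.timestamp < end);
    if (spoken.length === 0 && seen.length === 0) continue; // skip empty spans
    windows.push({
      start,
      end,
      spoken_text: spoken.join(" "),
      descriptions: seen.map((f) => f.description),
      on_screen_text: seen.map((f) => f.on_screen_text).filter((t) => t !== ""),
    });
  }
  return windows;
}
```

Each window carries the three signals the handbook names (spoken text, visual descriptions, on-screen text), which is what lets a single LLM call reason over the whole video afterward.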
Thinks about it. Single LLM call. The model reads the scene log and produces a brief: list of topics, candidate hooks, retention notes, structural beats.
Output: analysis on packages.intelligence. Triggers the fan-out to generate-asset jobs.
The analyze_intelligence handler enqueues ~12 generate_asset jobs (one per asset type — title, description, chapters, tags, LinkedIn, X, article brief, short_clip_plan, etc.) plus thumbnail_concepts. Each generate_asset is one LLM call. With worker concurrency at 3, these all finish in ~35 s. The package moves to ready_for_review.
When a generate_asset job writes a *_plan asset (e.g. short_clip_plan or long_clip_plan), it fans out one clip_render job per clip automatically — so the MP4s are usually rendered by the time you open the Shorts/Clips tab (short plans → vertical, long plans → horizontal). Plans themselves are never dispatched; only the rendered_* outputs are. See Chapter 09 · The Shorts editor.
youtube_pinned_comment) generate for every package.
Thumbnails are AI-generated images, not stills pulled from the video. After intelligence, the thumbnail_concepts worker turns the package analysis into visual concepts and renders each one with your configured image provider:
/providers with category image (Runware is the first / default type). Without one, ChannelHelm falls back to extracting still frames at the high-retention hook timestamps (the original behaviour), so you always get something with zero extra config or cost.
transcription_only and fast_audio_only produce no thumbnails.
PROVIDER_SECRET_KEY).
When you open a package (/packages/[id]) you're in the Studio. It's your editor for a single video: review, edit, regenerate, approve, dispatch. This is where the operator-time is spent.
Three columns on a desktop:
From left to right:
Shows live status of all 4 layers. Each layer's status text updates as it progresses:
Running layers render in teal. Done layers in grey (lighter). Pending layers in dim grey.
Each card represents one generated asset. The shape depends on the platform tab you're viewing. The YouTube tab shows:
While the pipeline is still running, asset cards that haven't been filled in yet show a quiet teal pulsing indicator: "titles — generates automatically when analysis completes". There's no button to click — these will fill in by themselves. Once the pipeline is done, if any asset is still genuinely missing (a generate_asset job failed for some reason), a "Generate" button appears as a manual recovery path.
Three layout modes for the Studio (toggle in the bottom-right):
| Layout | Best for |
|---|---|
| Console (default) | The 3-column layout described above. Best for end-to-end review and approval. |
| Editor | Side-by-side compare view. Best when you're heavily editing copy and want context from the transcript + scene log. |
| Atlas | All platforms at once. Best for "what does this video look like across every channel?" |
See Chapter 14 · Approve & dispatch for the full walkthrough.
The left rail has a dedicated ✂ Shorts tab. Instead of a flat list of plan + rendered assets, it collapses to one row per clip — each row carrying the clip's editable metadata and a thumbnail of the rendered MP4. Click any row to open the per-Short editor. That whole editor gets its own chapter next.
Vertical short-form is where ChannelHelm earns its keep. The Shorts editor at /packages/[id]/shorts/[clipIndex] is a full per-clip studio: trim on a word-snapped timeline, restyle burned-in subtitles with a live preview, auto-draft a per-clip description, and publish each clip on its own schedule. This chapter explains the one mental model that makes all of it click.
There are two assets per clip and they have very different jobs:
| Asset | Role |
|---|---|
| short_clip_plan | Editable source of truth. Holds every operator decision (title, description, tags, trim window, subtitle styling, description links, publish options) under payload.clips[clipIndex]. You edit this. |
| rendered_short_clip | Build output. The actual vertical MP4 with burned-in subtitles. The clip_render worker produces it and copies the plan's editorial fields onto it. You never edit this directly; your edits would be lost on the next re-render. |
short_clip_plan only. The renderer rebuilds rendered_short_clip from the plan on every render and overwrites its editorial fields, so anything written straight to the rendered asset is discarded.
The style panel offers six burned-in caption animations. The live overlay previews each; clip_render emits an ASS subtitle file that ffmpeg burns into the MP4.
saveClipEdits() (debounced). Edits are appended to payload.edits_log[] for audit.
renderClip() bumps the clip's render_rev, sets pending_render, and enqueues a clip_render job keyed clip_render:<plan>:<i>:rev<n>.
rendered_short_clip keyed by (plan_asset_id, clip_index): re-renders update the same asset id, so any dispatch/publish history stays attached. The render_rev guard skips a re-encode when the rev hasn't moved (idempotent crash-recovery).
You usually don't click Render at all for the first pass. When the pipeline finishes and the generate_asset worker writes a short_clip_plan, it fans out one clip_render job per clip automatically. By the time you open the Shorts tab, the rendered MP4s are usually already there. You only click Render after you've changed something, such as a new trim or a different subtitle style.
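The render bookkeeping described above can be sketched like this. The job-key format is quoted from this chapter; `ClipRenderState`, `last_rendered_rev`, `requestRender`, and `shouldEncode` are hypothetical names used only to illustrate the idea of a revision guard.

```typescript
// Job key format quoted in this chapter: clip_render:<plan>:<i>:rev<n>.
function clipRenderJobKey(planAssetId: string, clipIndex: number, renderRev: number): string {
  return `clip_render:${planAssetId}:${clipIndex}:rev${renderRev}`;
}

// Hypothetical per-clip bookkeeping. Only render_rev and pending_render are
// named in the handbook; last_rendered_rev is an assumption.
interface ClipRenderState {
  render_rev: number;               // bumped on every Render click
  pending_render: boolean;
  last_rendered_rev: number | null; // rev of the MP4 currently on disk
}

// Models what renderClip() is described as doing: bump the rev, flag the
// clip as pending, and produce the key for the job to enqueue.
function requestRender(planAssetId: string, clipIndex: number, state: ClipRenderState) {
  const next: ClipRenderState = {
    render_rev: state.render_rev + 1,
    pending_render: true,
    last_rendered_rev: state.last_rendered_rev,
  };
  return { state: next, jobKey: clipRenderJobKey(planAssetId, clipIndex, next.render_rev) };
}

// The guard: a job that re-runs after a crash skips the encode when the
// rev on disk already matches the requested rev.
function shouldEncode(state: ClipRenderState): boolean {
  return state.last_rendered_rev !== state.render_rev;
}
```

Because the rev is part of the job key, clicking Render twice without the rev moving can't enqueue two distinct jobs for the same revision, assuming the queue deduplicates on key.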
generateClipDescription() once: a bounded, text-only LLM call (≤ 280 chars) over the clip's transcript window, in your brand voice. It stamps description_generated_at so it never retries on subsequent opens; you can always hand-edit afterward.
Trim handles never cut mid-word. The snap runs client-side in the Timeline (so you see the snapped position as you drag) and server-side defensively in clip_render before ffmpeg's -ss (so an LLM-picked or stale trim is corrected at build time too). Both share src/lib/word-snap.ts.
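A minimal sketch of word-snapping, to show the idea. This is not the contents of src/lib/word-snap.ts: the function names and the nearest-boundary policy are assumptions, and the real module may snap outward or use different tie-breaking.

```typescript
interface TimedWord { start: number; end: number }

// Returns the candidate closest to target; ties keep the earlier candidate.
function nearestBoundary(target: number, candidates: number[]): number {
  if (candidates.length === 0) return target; // nothing to snap to
  return candidates.reduce((best, c) =>
    Math.abs(c - target) < Math.abs(best - target) ? c : best
  );
}

// Snap the trim-in point to a word start and the trim-out point to a word end,
// so neither handle lands inside a spoken word.
function snapTrimToWords(
  words: TimedWord[],
  trimStart: number,
  trimEnd: number
): { start: number; end: number } {
  const start = nearestBoundary(trimStart, words.map((w) => w.start));
  // Only consider word ends after the snapped start, so the clip keeps at least one word.
  const laterEnds = words.map((w) => w.end).filter((e) => e > start);
  const end = nearestBoundary(trimEnd, laterEnds);
  return { start, end };
}

const sampleWords: TimedWord[] = [
  { start: 0, end: 0.4 },
  { start: 0.5, end: 0.9 },
  { start: 1.0, end: 1.6 },
];
console.log(snapTrimToWords(sampleWords, 0.62, 1.3)); // { start: 0.5, end: 1.6 }
```

Keeping the snap in one pure function is what makes it practical to run the same logic in the browser while dragging and again on the server before the encode.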
Packages ingested before auto-render existed won't have rendered clips. The backfill script enqueues clip_render for every plan with missing renders:
    # every package, only the clips that are missing a render
    tsx scripts/render-shorts.ts

    # scope to one package or brand
    tsx scripts/render-shorts.ts --package-id pkg_xxx
    tsx scripts/render-shorts.ts --brand-id brd_xxx

    # preview without enqueuing
    tsx scripts/render-shorts.ts --dry-run
long_clip_plan → rendered_long_clip. Everything in this chapter applies; the only differences are dimensions and the output asset type.
ChannelHelm uses an LLM for two things: pipeline analysis (one call per package) and asset generation (one call per asset type, ~10 per package). You configure which provider serves which call at /providers. Pluggable: OpenAI, Anthropic, OpenRouter, Ollama, LM Studio, OpenClaw, Codex CLI.
| Class | What it is | Examples |
|---|---|---|
| openai-compatible | Speaks the OpenAI HTTP shape (POST /v1/chat/completions). Works with any provider that emulates this; most cloud + local LLM stacks do. | OpenAI · OpenRouter · Ollama · LM Studio · vLLM · OpenClaw |
| anthropic | Native Claude Messages API. Best instruction following and prose quality at the premium tier. | Claude Opus / Sonnet / Haiku |
| codex-cli | Spawns the local codex CLI subprocess. No HTTP, no API key; uses your ChatGPT subscription OAuth. | Codex (ChatGPT subscription) |
all for a single-provider setup, or to a specific profile to dedicate that provider to one tier.
Every LLM call passes through complete({ profile, ... }). The selection rule:
purpose=standard_audio_visual serves standard_audio_visual packages.
premium_multimodal-tagged provider won't serve standard_audio_visual calls.
API keys are encrypted at rest using AES-256-GCM via your PROVIDER_SECRET_KEY (set in /settings). Keys are NEVER serialized back to the browser. When you edit a provider, the API key field shows the placeholder "•••••••• saved"; leaving it blank preserves the saved key, and typing replaces it.
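The selection rule above (a provider serves a call only when its purpose matches the package profile, or is all) might look like the sketch below. `ProviderRow` and `selectProvider` are hypothetical names, and the precedence of a dedicated provider over an all provider is an assumption; the handbook doesn't state the tie-break.

```typescript
// Hypothetical provider row: purpose is either a profile name or "all".
interface ProviderRow { id: string; purpose: string }

// Assumed precedence: a provider dedicated to the profile wins over an
// "all" provider. A provider tagged for a different profile never serves.
function selectProvider(rows: ProviderRow[], profile: string): ProviderRow | undefined {
  for (const row of rows) {
    if (row.purpose === profile) return row;
  }
  for (const row of rows) {
    if (row.purpose === "all") return row;
  }
  return undefined; // no eligible provider configured
}

const configuredProviders: ProviderRow[] = [
  { id: "general", purpose: "all" },
  { id: "premium", purpose: "premium_multimodal" },
];
console.log(selectProvider(configuredProviders, "standard_audio_visual")?.id); // "general"
console.log(selectProvider(configuredProviders, "premium_multimodal")?.id);    // "premium"
```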
The same /providers page also holds image providers — rows with category image (Runware is the first / default type). These power AI thumbnail generation: the pipeline turns the package analysis into visual concepts and the image provider renders them. Add one the same way you add an LLM provider; keys are encrypted at rest identically. If you don't configure one, ChannelHelm falls back to ffmpeg frame extraction for thumbnails — see Chapter 7 · Thumbnail generation.
| Setup | Setup time | Best for |
|---|---|---|
| All Anthropic | ~2 min | One row, purpose=all, Sonnet 4.6. Best default for "I don't want to think about it". |
| Tiered Anthropic | ~5 min | Haiku → fast, Sonnet → standard, Opus → premium. Quality scales with package profile. |
| Local + cloud | ~15 min | LM Studio (Qwen3) for batch pipeline + Anthropic Haiku for Studio actions. Cheapest steady state. |
The /settings page holds every runtime-configurable knob. Changes propagate live to every running process (the Next.js server + the worker fleet) via Postgres pg_notify — no restart needed. Below: every key, what it does, when to change it.
Settings live in two places:
settings table: runtime-editable. Edits via /settings persist here and override .env at runtime. Changes propagate to every running process within the LISTEN round-trip (~50 ms).

| Key | What it does | When to set |
|---|---|---|
| ZERNIO_API_KEY | Authenticates outbound POSTs to Zernio (LATE). | Before publishing to LinkedIn/X/etc. via Zernio. |
| ZERNIO_WEBHOOK_SECRET | HMAC secret for inbound webhooks from Zernio. | Only when exposing /api/webhooks/zernio publicly via Cloudflare Tunnel. Local-only setups don't need this. |
| DOJOCLAW_API_URL | LAN URL for DojoClaw (article publishing). | If you use DojoClaw for blog posts. |
| DOJOCLAW_API_KEY | Authenticates outbound POSTs to DojoClaw. | Same as above. |
| DOJOCLAW_WEBHOOK_SECRET | HMAC secret for inbound webhooks from DojoClaw. | Same as Zernio: only if exposing publicly. |
| HF_TOKEN | HuggingFace token for the pyannote speaker-diarization model. | Optional; only if you want speaker labels in transcripts. Token must have the pyannote/speaker-diarization-3.1 license accepted on HF. |
| Key | What it does | When to set |
|---|---|---|
| CLOUDFLARE_TUNNEL_HOSTNAME | Public base URL for webhooks + signed /media/* URLs. | Only if you've set up a Cloudflare Tunnel. |
| MEDIA_URL_SECRET | HMAC key for signing /media/* URLs. | When using MEDIA_REQUIRE_SIGNATURE=1. |
| MEDIA_REQUIRE_SIGNATURE | If 1, /media/* requires a signed URL. Default 0. | Before exposing the tunnel publicly. |
| ALLOW_UNSIGNED_WEBHOOKS | If 1, webhook receiver accepts requests with no signature. Default 0. | Local smoke tests only. Never set to 1 on a publicly reachable host. |
| MAX_UPLOAD_BYTES | Hard cap on /api/uploads body size. Default 2 GB. | If you upload larger files. |
| Key | What it does | When to set |
|---|---|---|
| GOOGLE_OAUTH_CLIENT_ID | OAuth 2.0 Client ID from Google Cloud Console. One client supports all brands. | Once, per ChannelHelm instance. See Chapter 13. |
| GOOGLE_OAUTH_CLIENT_SECRET | OAuth 2.0 Client secret paired with the ID above. Encrypted at rest. | Same time as the client ID. |
| Key | What it does | Default |
|---|---|---|
| ARCHIVE_AFTER_DAYS | Days since the package's latest dispatch before it becomes eligible for the archive_package worker. Eligibility also requires archived_at IS NULL. | 14 |
| ARCHIVE_DELETE_CLIPS | When true, the archive worker DELETES rendered clip MP4s instead of moving them to the archive. The source original.mp4 always moves (clip_render needs it for re-renders). | false |
| KEEP_PIPELINE_ARTIFACTS | When 1, disables the inline Stage-1 cleanup in transcribe_audio / analyze_visual / fuse / analyze_intelligence. This is the debugging escape hatch that keeps frames_*/ and intermediate JSONs on disk. | unset |
Full detail in Chapter 16 · Storage & the file lifecycle. ARCHIVE_ROOT (the destination path) is boot-only — see the table below.
These can't change mid-flight without breaking running workers — they're shown read-only on /settings with an "edit .env and restart" hint.
| Key | Why boot-only |
|---|---|
| DATABASE_URL | The Postgres pool initializes at boot; you can't swap the database connection from inside the app. |
| MEDIA_ROOT | Workers spawn ffmpeg/ml subprocesses that resolve paths against this at startup. Mid-flight change would desync running jobs. |
| ARCHIVE_ROOT | External-drive path for the post-publish archive worker. Unset = archiving disabled. Boot-only: workers cache the absolute path at startup and refuse to operate when it changes mid-run (e.g. drive unmounted). |
| LOCAL_BEARER_TOKEN | API bearer token. Rotation invalidates every in-flight worker request. Use the Rotate bearer token button on /settings to mint a new value; paste into .env; restart. |
| PROVIDER_SECRET_KEY | AES-256-GCM key that encrypts all sensitive setting values + provider API keys + OAuth refresh tokens. Rotating mid-flight locks every saved secret out until re-encryption. Set once at install, leave alone. |
If you open /settings on a freshly migrated install, you may see a teal "Restart needed for live propagation" banner. That's because the worker boot hook (which opens the LISTEN connection) only runs at process start. Restart pnpm dev:all once and the banner clears.
There are four ways to get a video from ChannelHelm onto YouTube. Each suits a different setup. Pick one.
| Path | Operator effort | Best for |
|---|---|---|
| Manual paste (default) | Copy 4 fields into YouTube Studio | First-time use; no setup beyond the ChannelHelm install. |
| YouTube Direct (Data API) ⭐ | One-time Google Cloud OAuth, then automated | Local-first setups. Recommended for most operators. |
| Zernio (LATE) → YouTube | Cloudflare Tunnel + Zernio account | Operators who also publish to LinkedIn / X / TikTok and want one unified pipeline. |
| ngrok + Zernio | One ngrok command (temporary) | Smoke-testing the Zernio flow without committing to Cloudflare. |
Approve the YouTube assets in the Studio. ChannelHelm marks them dispatched (manual) — that's an audit record meaning "operator confirmed they'll paste this manually." Open YouTube Studio, upload your original.mp4, paste each field. Done.
If you also want to record the resulting YouTube URL back into ChannelHelm (so the package header shows the red ▶ youtu.be/… chip), paste the URL into the "+ Paste YouTube URL" pill that appears next to the status pill.
The cleanest path for a local-first setup. After a one-time Google Cloud OAuth setup, approving a video automatically uploads it via the YouTube Data API v3. No third party in the loop except Google itself.
Full walkthrough: Chapter 13 · YouTube Direct setup.
Zernio is a third-party social publishing service that supports 15+ platforms including YouTube. If you're already paying for Zernio for LinkedIn/X/TikTok, you can also use it for YouTube. Setup requires a Cloudflare Tunnel so Zernio can download your video file.
ngrok exposes localhost:3000 via a temporary public URL. Use it to verify the Zernio → YouTube flow works before committing to a permanent Cloudflare Tunnel. Free tier limits: URL rotates every restart, 2-hour sessions.
The longest single chapter in the book. Worth the read because once it's done, every future video publishes with a single click. ~15 minutes including Google Cloud Console.
The whole flow works smoothest signed in as the Google account that owns your YouTube channel. Two recommended approaches:
accounts.google.com/Logout, then sign back in as your channel-owning account.
ChannelHelm → Create.
ChannelHelm; support email + dev contact: your email
youtube.upload AND youtube
ChannelHelm local
http://localhost:3000/api/youtube/oauth/callback (exact)
/settings on ChannelHelm.
…apps.googleusercontent.com string → Save.
GOCSPX-… string → Save (encrypted at rest immediately).
Open /brands/[your-brand-id] in the same browser context that's signed in as the channel-owning Google account.
One-time consent on Google. Tokens encrypted at rest; never sent back to the browser.
login_hint for the chooser.
original.mp4 to YouTube's resumable endpoint (~1–3 min depending on your upstream).
myaccount.google.com/permissions).
YouTube Data API free tier: 10,000 units/day. Each upload costs 1,600 units = ~6 uploads/day. Fine for a single creator. If you need more, request a quota bump in the Google Cloud Console.
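The quota arithmetic is simple enough to check yourself. A small sketch using the two figures quoted above; the function name is illustrative only.

```typescript
// Whole uploads that fit in a daily quota; partial uploads don't count.
function uploadsPerDay(dailyQuotaUnits: number, unitsPerUpload: number): number {
  return Math.floor(dailyQuotaUnits / unitsPerUpload);
}

const DAILY_QUOTA_UNITS = 10000; // free-tier figure quoted above
const UNITS_PER_UPLOAD = 1600;   // per-upload cost quoted above

const maxUploads = uploadsPerDay(DAILY_QUOTA_UNITS, UNITS_PER_UPLOAD);
const unitsLeftOver = DAILY_QUOTA_UNITS - maxUploads * UNITS_PER_UPLOAD;

console.log(maxUploads);    // 6  (10,000 / 1,600 = 6.25, rounded down)
console.log(unitsLeftOver); // 400
```

The 400 leftover units are not enough for a seventh upload, which is why the practical ceiling is six per day.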
The Approve & Publish panel on the right side of the Studio is where you decide what ships. This chapter is a guided tour of that panel.
Every asset moves through a small state machine. The pipeline fills it to ready_for_review; you approve it; the dispatch worker ships it; a publish webhook can confirm it live.
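One way to picture that state machine is as a table of allowed transitions. This is a sketch, not ChannelHelm's schema: ready_for_review, approved, dispatched, and failed appear in this handbook, while the published state name is an assumption standing in for "confirmed live by a publish webhook". Retries are not modeled.

```typescript
// "published" is an assumed name for the webhook-confirmed state.
type AssetStatus = "ready_for_review" | "approved" | "dispatched" | "failed" | "published";

const ALLOWED_TRANSITIONS: Record<AssetStatus, AssetStatus[]> = {
  ready_for_review: ["approved"],   // you approve it
  approved: ["dispatched", "failed"], // the dispatch worker ships it, or errors
  dispatched: ["published"],        // a publish webhook can confirm it live
  failed: [],                       // retry flow intentionally left out of this sketch
  published: [],
};

// Returns the next status, or throws when the move isn't in the table.
function advanceAsset(current: AssetStatus, next: AssetStatus): AssetStatus {
  if (ALLOWED_TRANSITIONS[current].indexOf(next) === -1) {
    throw new Error(`illegal transition: ${current} -> ${next}`);
  }
  return next;
}

let assetStatus: AssetStatus = "ready_for_review";
assetStatus = advanceAsset(assetStatus, "approved");
assetStatus = advanceAsset(assetStatus, "dispatched");
console.log(assetStatus); // "dispatched"
```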
Appears at the top of the panel only when the brand has YouTube Direct configured. A dropdown with four options:
Choices commit immediately (no Save button). Saved on the package, so each video can have its own setting.
The dispatch worker reads the asset type and sends it to exactly one of three destinations (or records a manual audit entry). Editorial goes to DojoClaw on your LAN; social + rendered clips go to Zernio in the cloud; YouTube uploads go Direct via the Data API when the brand is configured for it.
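The routing just described can be sketched as one pure function. The asset-type names are the ones this chapter uses; `routeAsset`, the `Destination` labels, and the `brandUsesYouTubeDirect` flag are hypothetical and chosen for illustration.

```typescript
type Destination = "dojoclaw" | "zernio" | "youtube_direct" | "manual";

const ZERNIO_SOCIAL_TYPES = ["linkedin_post", "x_post", "x_thread"];

// Maps an asset type to exactly one destination, falling back to a manual
// audit entry when nothing else applies.
function routeAsset(assetType: string, brandUsesYouTubeDirect: boolean): Destination {
  if (assetType === "article_brief") return "dojoclaw";            // editorial
  if (ZERNIO_SOCIAL_TYPES.indexOf(assetType) !== -1) return "zernio"; // social
  if (/^rendered_.*_clip$/.test(assetType)) return "zernio";       // rendered clips
  if (assetType === "youtube_title_set" && brandUsesYouTubeDirect) {
    return "youtube_direct";                                       // Data API upload
  }
  return "manual"; // recorded as dispatched; operator handles posting
}

console.log(routeAsset("rendered_short_clip", false)); // "zernio"
console.log(routeAsset("youtube_title_set", false));   // "manual"
```

Keeping the fallback as the last line means an unrecognised asset type degrades to a manual audit record instead of an error.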
publishAsset server action runs for each selected asset: flips status to approved, enqueues a dispatch:<assetId> job.
article_brief → DojoClaw
linkedin_post / x_post / x_thread / rendered_*_clip → Zernio
youtube_title_set (with brand on YouTube Direct) → Direct upload via Data API
manual (recorded as dispatched; operator handles posting)
dispatched on success or failed with an inline error message on failure.
dispatched (all good), partially_dispatched (some failed), or failed (all failed).
Click the row to see the worker error inline. Most common causes:
acc_… id to the brand's zernio_accounts.
/settings.
/performance dashboard gives one cross-surface view of how dispatched/published assets actually did (collected signals + title/thumbnail A/B results); collect_signal pulls YouTube + Zernio metrics and now combines DojoClaw article analytics with Search Console clicks, impressions, CTR, and position when GSC is connected. After a video is live, Mine comments in the Studio turns its top YouTube comments into content_ideas + faq assets for the next upload. Seed a brand's voice from day one at /brands/[id]/voice (paste samples or pull existing published assets into voice_examples). In the Shorts editor you can translate a clip's subtitles into other languages: per-language SRT + ASS sidecar files (TTS dubbing and a burned-in per-language re-render are deferred).
On a typical 8-minute video, ChannelHelm's pipeline completes in ~2–2.5 minutes. That includes everything from drop to "ready for review." This chapter explains the performance budget and how to tune it.
| Phase | Wall | What it is |
|---|---|---|
| ingest | ~12 s | Audio extract + scene-cut detection (ffmpeg) |
| transcribe_audio | ~45 s | MLX Whisper large-v3 over the audio track |
| analyze_visual | ~65 s | OCR ∥ VLM keyframe descriptions (max of the two) |
| fuse | ~1 s | TypeScript scene log composition |
| analyze_intelligence | ~10 s | One LLM call for topics/hooks/retention |
| generate_asset × 10 | ~35 s | 10 LLM calls running 3-way parallel |
| thumbnail_concepts | ~7–25 s | AI image generation (LLM concepts → image provider) — or ~7 s ffmpeg frame extraction when no image provider is set |
| Total | ~2–2.5 min | End-to-end on an 8-min video |
Default: 3 concurrent claim slots. Set via:
    # in scripts/dev.sh or shell env
    WORKER_CONCURRENCY=3   # default

    # bump if generate_asset jobs queue up
    WORKER_CONCURRENCY=6 pnpm dev:all

    # drop if your LLM provider rate-limits you
    WORKER_CONCURRENCY=1 pnpm dev:all
Reasonable maxes per LLM provider:
| Provider | Reasonable concurrency |
|---|---|
| Anthropic (Claude) | 5–10 |
| OpenAI | 3–8 (depends on tier) |
| Codex CLI (local) | 2–4 (bounded by CPU) |
| LM Studio (local) | 1–2 (local server queues internally) |
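To get a feel for what a concurrency change buys you, here is a back-of-envelope model: jobs run in waves of WORKER_CONCURRENCY, and each wave takes about one LLM call. The 9-second per-call latency below is an assumed figure chosen so the default lines up with the ~35 s in the timing table; your provider will differ.

```typescript
// Rough model only: ignores queue overhead, rate limits, and uneven call times.
function estimateFanOutSeconds(
  jobCount: number,
  concurrency: number,
  secondsPerCall: number
): number {
  if (concurrency < 1) throw new Error("concurrency must be at least 1");
  const waves = Math.ceil(jobCount / concurrency);
  return waves * secondsPerCall;
}

const ASSUMED_SECONDS_PER_CALL = 9; // assumption, not a measured value

console.log(estimateFanOutSeconds(10, 3, ASSUMED_SECONDS_PER_CALL)); // 36
console.log(estimateFanOutSeconds(10, 6, ASSUMED_SECONDS_PER_CALL)); // 18
console.log(estimateFanOutSeconds(10, 1, ASSUMED_SECONDS_PER_CALL)); // 90
```

Under this model, doubling concurrency from 3 to 6 halves the asset-generation phase, but only if your provider accepts that many parallel calls.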
| Profile | Pipeline wall | Quality |
|---|---|---|
| transcription_only | <1 min (transcript only) | Just the transcript: no visual, no diarization, no thumbnails. The cheapest tier, built for Backlog Revival re-mines. |
| fast_audio_only | ~1 min (skips visual) | Audio-only metadata + full asset kit. Fine for podcasts. |
| standard_audio_visual | ~2–2.5 min | Full pipeline at sensible defaults. Best ratio. |
| premium_multimodal | ~4–6 min | 32B VLM, dense OCR. Use for content where descriptions really matter. |
Every file ChannelHelm writes has exactly one lifecycle. Most are throwaway by the time the pipeline finishes; a few are permanent. Left alone, a typical 8-minute video would sit at ~85 MB on disk forever. Two automatic mechanisms — inline cleanup and the archive worker — plus two operator actions (Revive & Delete video) keep that lean. Here's the whole map.
An artifact's stage is set by its last legitimate consumer. After that read, it's either re-readable later (a later stage) or never touched again (delete).
| Artifact | Last consumer | Stage |
|---|---|---|
| original.mp4 | clip_render (future re-renders) | 3 · post-publish |
| audio.wav · frames_ocr/ · frames_vlm/ | the step that produced them (one read) | 1 · pipeline |
| scene_log.json · frame_index.json | fuse / analyze_intelligence (also mirrored in Postgres) | 1 · pipeline |
| transcript.json | Shorts editor word-snap (also mirrored in Postgres) | 2 · review |
| thumbs/concept_*.jpg | thumbnail selection + dispatch upload | 2 · review |
| clips/clip_NNN.mp4 · .ass | dispatch + Studio preview | 3 · post-publish |
| Postgres rows | everything — the audit trail | 4 · permanent |
Each pipeline worker deletes its single-consumer inputs the moment the next step starts. transcribe_audio drops audio.wav; analyze_visual drops the frame folders + intermediate JSONs; fuse and analyze_intelligence drop their Postgres-mirrored JSONs. No DB changes, no policy — a freshly ingested video averages ~40 MB instead of ~85 MB.
    # keep everything on disk for debugging (e.g. ls the VLM frames)
    KEEP_PIPELINE_ARTIFACTS=1 pnpm dev:all
The archive_package worker runs on a schedule (enqueued by scripts/enqueue-recurring.ts, fired by launchd). For each package whose latest dispatch is older than ARCHIVE_AFTER_DAYS and that hasn't been archived yet, it moves original.mp4 (and clips/, unless ARCHIVE_DELETE_CLIPS=true) from MEDIA_ROOT to ARCHIVE_ROOT, then records the move so re-renders still resolve.
ARCHIVE_ROOT (boot-only, in .env) to your external-drive path. Unset = archiving disabled.
ARCHIVE_AFTER_DAYS (default 14) AND packages.archived_at IS NULL.
archive_package:<packageId>, so at most one archive per package per cycle.
clip_render reads sources.archive_path as a fallback when the local file is gone, so Shorts re-renders still work as long as the drive is mounted.
Migration 0007 added the two columns this relies on: packages.archived_at and sources.archive_path.
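The eligibility rule (latest dispatch older than ARCHIVE_AFTER_DAYS, and not yet archived) can be sketched as a single predicate. `PackageArchiveInfo` and `isArchiveEligible` are hypothetical names; treating a never-dispatched package as ineligible is an assumption that follows from the rule being keyed on the latest dispatch.

```typescript
interface PackageArchiveInfo {
  latestDispatchAt: Date | null; // null = never dispatched
  archivedAt: Date | null;       // mirrors packages.archived_at
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function isArchiveEligible(
  pkg: PackageArchiveInfo,
  archiveAfterDays: number,
  now: Date
): boolean {
  if (pkg.archivedAt !== null) return false;       // already archived
  if (pkg.latestDispatchAt === null) return false; // assumption: nothing dispatched yet
  const ageMs = now.getTime() - pkg.latestDispatchAt.getTime();
  return ageMs > archiveAfterDays * MS_PER_DAY;
}

const checkTime = new Date("2025-01-31T00:00:00Z");
const oldPackage: PackageArchiveInfo = {
  latestDispatchAt: new Date("2025-01-01T00:00:00Z"),
  archivedAt: null,
};
console.log(isArchiveEligible(oldPackage, 14, checkTime)); // true (30 days > 14)
```

Passing `now` in explicitly keeps the predicate deterministic, which makes it easy to test a schedule-driven worker without waiting two weeks.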
scene_log.json, frame_index.json, transcript.json) are also mirrored into the packages.intelligence JSONB column. The disk copies are pure duplication (every downstream reader can use the Postgres version), which is exactly why inline cleanup can delete them safely.
Two buttons in the Studio header act on a source's lifecycle directly. Both live next to Retry.
Revive re-runs the whole pipeline in place on an existing source using today's prompts. Use it when you've improved a prompt and want to refresh the kit on an old video, or to re-mine a back catalogue. By default it runs under the cheapest profile, transcription_only.
generate_asset UPSERTs by (package, type), so regenerated assets overwrite the old ones in place: fresh kit, same package id, dispatch/publish history preserved.
The default profile is transcription_only; you can revive under a richer profile if you want the visual phase too.
Hard delete permanently removes the source video from disk, both the local copy and the archived copy, to free space, while keeping all Postgres history (assets, dispatches, signals). This is the third storage-lifecycle option, completing the set: A inline cleanup, B the archive worker, C operator hard-delete.
sources.local_media_path and sources.archive_path.
The nine problems you're most likely to hit and the exact fix for each.
Symptom: Red banner on /settings; saves fail.
Cause: pnpm db:migrate hasn't been run on this database.
Fix: pnpm db:migrate. Reload.
Symptom: Yellow banner on /settings right after first install or after a code update.
Cause: The worker boot hook (which opens the LISTEN connection) only runs at process start.
Fix: Stop pnpm dev:all with Ctrl-C, then run it again. Needed once per code update.
Symptom: Titles/Description/Tags cards show "Generate" buttons instead of content, but the pipeline indicator shows all 4 layers as done.
Cause: The generate_asset jobs for those types either failed or weren't enqueued.
Fix: Click the manual Generate button (it's the fallback recovery affordance), OR open /jobs to see which generate_asset jobs failed and why.
Symptom: Worker log shows ENOENT: no such file or directory, stat '…/your-brand/src_…/original.mp4'.
Cause: A brand slug rename moved the media folder but didn't rewrite all stored paths. Should be auto-fixed by the renorm action; if you hit this, the file probably lives at a different path.
Fix: Check MEDIA_ROOT/<brand-slug>/src_…/ on disk. If the file is there, the issue is a stale stored path, which needs a backfill script to rewrite it.
Symptom: YouTube Connect flow lands back on the brand page with a red error banner mentioning refresh_token.
Cause: Google only issues a refresh_token on the FIRST grant. If you've granted ChannelHelm access before, subsequent re-grants don't include one.
Fix: Visit myaccount.google.com/permissions as the channel-owning account, find ChannelHelm, click Remove access. Then click Connect on the brand page again.
Symptom: Worker log: The user has exceeded the number of videos they may upload.
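Cause: You've hit the YouTube Data API daily quota (10,000 units/day at 1,600 units per upload allows 6 uploads).
Fix: Wait for the daily quota reset, which happens at midnight Pacific Time rather than midnight UTC. To raise the quota: Google Cloud Console → APIs & Services → Quotas → request bump.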
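The six-upload figure follows from integer division of the daily quota by the per-upload cost. A minimal sketch of that arithmetic; the function name uploadsPerDay is hypothetical, and the calculation ignores any other API calls that also spend quota:

```typescript
// Whole uploads that fit in a daily quota, using the figures from the text.
function uploadsPerDay(dailyQuotaUnits: number, unitsPerUpload: number): number {
  return Math.floor(dailyQuotaUnits / unitsPerUpload);
}

const uploads = uploadsPerDay(10000, 1600);  // 6
const leftover = 10000 - uploads * 1600;     // 400 units remain unused
console.log(uploads, leftover);
```

The 400 leftover units are not enough for a seventh upload, which is why the limit is 6 and not 6.25.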
Symptom: Drop a video, package created, status stuck at draft or ingested indefinitely.
Cause: Worker process not running, OR worker is running but has no handler for the queued job's kind.
Fix: Check tail -f /tmp/channelhelm-dev.log for [runner] started lockedBy=…. If absent, restart pnpm dev:all. If present but no [ingest] lines after dropping a video, the queue lookup might be off — check /jobs for pending rows.
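The two log checks in this fix can be expressed as a small decision function. This is a sketch that assumes the log markers appear verbatim as quoted above ([runner] started lockedBy= and [ingest]); the function name diagnoseWorkerLog and the three result labels are hypothetical.

```typescript
// Illustrative sketch only; result labels and function name are assumptions.
type WorkerDiagnosis = "worker_not_running" | "no_ingest_activity" | "ingest_seen";

// Applies the two checks from the fix, in order:
// 1. no runner start line  -> the worker process is probably not running
// 2. runner started, but no [ingest] lines -> look at /jobs for pending rows
function diagnoseWorkerLog(logText: string): WorkerDiagnosis {
  const lines = logText.split("\n");
  const runnerStarted = lines.some(
    (line) => line.indexOf("[runner] started lockedBy=") !== -1,
  );
  if (!runnerStarted) return "worker_not_running";
  const ingestSeen = lines.some((line) => line.indexOf("[ingest]") !== -1);
  return ingestSeen ? "ingest_seen" : "no_ingest_activity";
}
```

Feed it the contents of /tmp/channelhelm-dev.log captured after dropping a video; a no_ingest_activity result corresponds to the "check /jobs for pending rows" branch.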
Symptom: Worker log: tokens to keep > context length (LM Studio) or similar from another provider.
Cause: Your model's context window is too small for the prompt + scene log + response. Common with Qwen3-32B loaded at 4096-token default.
Fix: For LM Studio: lms load <model> --context-length 16384. For cloud providers: switch to a longer-context model (most cloud models default to 200k+ tokens).
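The failure is a budget problem: prompt, scene log, and response together must fit inside the context window. A minimal sketch of that check, with hypothetical token counts chosen only to illustrate the 4096 versus 16384 case; the names TokenBudget and fitsContext are assumptions, and real counts depend on the model's tokenizer.

```typescript
// Illustrative sketch only; token counts below are made-up examples.
interface TokenBudget {
  promptTokens: number;
  sceneLogTokens: number;
  responseTokens: number; // room reserved for the model's answer
}

// True when everything the request needs fits inside the context window.
function fitsContext(budget: TokenBudget, contextLength: number): boolean {
  const needed =
    budget.promptTokens + budget.sceneLogTokens + budget.responseTokens;
  return needed <= contextLength;
}

const exampleBudget: TokenBudget = {
  promptTokens: 1500,
  sceneLogTokens: 6000,
  responseTokens: 2000,
};

console.log(fitsContext(exampleBudget, 4096));  // false: 9500 tokens needed
console.log(fitsContext(exampleBudget, 16384)); // true
```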
Symptom: thumbnail_concepts fails the headline overlay with No such filter: 'drawtext', or clip_render produces an MP4 with no burned-in captions.
Cause: Your ffmpeg was built without libfreetype (drawtext) and/or libass (ass/subtitles). A minimal Homebrew ffmpeg can lack both.
Fix: brew install ffmpeg-full && brew unlink ffmpeg && brew link --force --overwrite ffmpeg-full. Verify with ffmpeg -filters | grep -wE 'drawtext|ass' (the -w flag matches whole words, so filters such as lowpass are not mistaken for ass). The plain AI thumbnail (no headline) and frame-extraction fallback work either way; only the overlay + caption burn-in need these filters.
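The same verification can be done programmatically by parsing the text that ffmpeg -filters prints. The sketch below assumes each filter row is laid out as flags, filter name, input->output, description; the function name missingFilters is hypothetical. It compares exact filter names, so lowpass does not count as ass.

```typescript
// Illustrative sketch only; assumes rows of the form
// "<flags> <name> <in->out> <description>" in the `ffmpeg -filters` output.
function missingFilters(
  filtersOutput: string,
  required: string[] = ["drawtext", "ass"],
): string[] {
  const available: Record<string, boolean> = {};
  for (const line of filtersOutput.split("\n")) {
    const columns = line.trim().split(/\s+/);
    const name = columns[1]; // second column is the filter name
    if (name !== undefined) available[name] = true;
  }
  // Keep only the required filters that never appeared as a name.
  return required.filter((name) => available[name] !== true);
}
```

An empty result means both filters are present; a result containing ass or drawtext points to the ffmpeg-full install above.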
Every ChannelHelm-specific term, defined once.
MEDIA_ROOT/<brand-slug>/src_…/.
transcription_only · fast_audio_only · standard_audio_visual · premium_multimodal. Set on the brand as default; per-package override allowed.
dispatches table for audit.
dispatch:<assetId>) that prevents duplicate job rows. Postgres unique index enforces.
payload.clips[i]. Operators edit this; the renderer reads it.
rendered_long_clip.
src/lib/word-snap.ts.
*_plan, it fans out one clip_render job per clip so rendered MP4s exist before you open the Shorts tab. Backfill older packages with scripts/render-shorts.ts.
packages.archived_at and sources.archive_path.
experiment_tick worker rotates variants, reads each one's performance from the YouTube Analytics API, applies the winner, and feeds it into voice_examples. Needs the yt-analytics.readonly scope (reconnect pre-v1.5 brands). Full walkthrough: the A/B testing guide.
transcription_only). Clears all the source's jobs so every stage re-runs; generate_asset UPSERTs so assets refresh without losing dispatch history. Requires the source media still on disk. Server action: reviveSource(packageId, profile?).
deleteSourceVideo(packageId).
/providers with category image (Runware today) used to generate AI thumbnail images. When none is configured, the thumbnail worker falls back to ffmpeg frame extraction. Keys encrypted at rest like LLM providers.
/packages/[id]. Where you review and approve.
workers/runner.ts. Runner = the main loop in that process. Slot = one concurrent claim loop inside the runner (default 3 per worker).
/api/v1/articles/from-brief. Optional.
Seven companion docs go deeper than this handbook can: