Roadmap detail · ChannelHelm

v1.5 — Signal & Intelligence (shipped)

Close the Helm Signal loop.

Stop generating-and-forgetting: measure what performs and feed it back into generation. A/B routing was the first slice; these are the rest.

🖼️

Thumbnail feedback loop

✓ Shipped · ~hours

What it is: Extends the A/B work to thumbnails. Today, when a title experiment decides, the winner is written to voice_examples so future titles drift toward what won. Thumbnails currently record nothing.
Why it matters: The thumbnail is half the click decision. Without this, thumbnail experiments identify a winner but don't make the next video's thumbnails any better.
How it'd work: On a decided thumbnail (or title+thumbnail) experiment, capture the winning concept's visual_prompt + style traits and inject them as positive guidance into prompts/thumbnail_image.v1.md context for subsequent packages (a per-brand "what wins" exemplar set, mirroring voice examples).
Effort: Small — the experiment, decision, and feedback plumbing already exist; this adds a thumbnail-side writer + prompt wiring.
Unlocks: Thumbnails that compound the same way titles now do — the other half of the packaging loop.
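
To make the exemplar-set idea concrete, here is a minimal sketch of how winning concepts could be turned into a guidance block for the thumbnail prompt context. Only `visual_prompt` comes from the description above; the `ThumbnailWinner` shape, the `buildWinnerGuidance` helper, and the three-exemplar limit are assumptions for illustration, not ChannelHelm's actual code.

```typescript
// Hypothetical shape: only `visual_prompt` is named in the roadmap text.
interface ThumbnailWinner {
  brandId: string;
  visual_prompt: string;
  styleTraits: string[];
  decidedAt: string; // ISO-8601 UTC timestamp of the experiment decision
}

// Build the per-brand "what wins" block, newest winners first.
function buildWinnerGuidance(
  winners: ThumbnailWinner[],
  brandId: string,
  limit: number = 3,
): string {
  const recent = winners
    .filter((w) => w.brandId === brandId)
    .sort((a, b) => b.decidedAt.localeCompare(a.decidedAt))
    .slice(0, limit);
  if (recent.length === 0) return ""; // nothing decided yet: inject nothing

  const lines = recent.map(
    (w, i) => `${i + 1}. ${w.visual_prompt} (traits: ${w.styleTraits.join(", ")})`,
  );
  return ["Concepts that won past experiments for this brand:", ...lines].join("\n");
}
```

The returned string would be appended to the prompt context as positive guidance; an empty string leaves the prompt untouched for brands with no decided experiments.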

📈

Sentiment-over-time curves

✓ Shipped · ~low

What it is: An emotion curve across the video, derived from the fused scene log (the timestamped text + visual descriptions ChannelHelm already produces) — no new model inference.
Why it matters: The best Shorts come from emotional spikes, not arbitrary timestamps. An explicit curve makes "where's the energy" a first-class signal.
How it'd work: A light pass (lexicon or a single cheap LLM call) over each scene-log window scores valence/arousal; the curve is stored on the package. The clip planner prefers high-arousal windows; the Studio shows an emotion sparkline.
Effort: Low — reuses data already on disk; no new ML dependency.
Unlocks: Better-selected Shorts moments and an at-a-glance emotional map of every video.
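
The lexicon variant of the light pass could look roughly like the sketch below. The `SceneWindow` and `CurvePoint` shapes, the five-word toy lexicon, and the helper names are all assumptions made for illustration; the real scene-log schema and scoring method are not specified here.

```typescript
// Hypothetical shapes and a toy lexicon; a real lexicon would be far larger.
interface SceneWindow {
  startSec: number;
  endSec: number;
  text: string; // fused transcript + visual description for the window
}

interface CurvePoint {
  startSec: number;
  endSec: number;
  valence: number; // -1 (negative) .. 1 (positive)
  arousal: number; // 0 (calm) .. 1 (intense)
}

const LEXICON = new Map<string, { valence: number; arousal: number }>([
  ["amazing", { valence: 0.8, arousal: 0.8 }],
  ["shocked", { valence: -0.2, arousal: 0.9 }],
  ["laughing", { valence: 0.7, arousal: 0.7 }],
  ["calm", { valence: 0.3, arousal: 0.1 }],
  ["boring", { valence: -0.5, arousal: 0.1 }],
]);

// Average the lexicon scores of every matched word; unmatched windows score 0.
function scoreWindow(w: SceneWindow): CurvePoint {
  const tokens = w.text.toLowerCase().match(/[a-z']+/g) ?? [];
  let valence = 0;
  let arousal = 0;
  let hits = 0;
  for (const token of tokens) {
    const entry = LEXICON.get(token);
    if (!entry) continue;
    valence += entry.valence;
    arousal += entry.arousal;
    hits++;
  }
  return {
    startSec: w.startSec,
    endSec: w.endSec,
    valence: hits ? valence / hits : 0,
    arousal: hits ? arousal / hits : 0,
  };
}

function emotionCurve(windows: SceneWindow[]): CurvePoint[] {
  return windows.map(scoreWindow);
}

// What a clip planner might do: rank windows by arousal, highest first.
function topArousalWindows(curve: CurvePoint[], n: number): CurvePoint[] {
  return [...curve].sort((a, b) => b.arousal - a.arousal).slice(0, n);
}
```

The same `CurvePoint[]` array could feed both consumers mentioned above: the planner reads the ranking, and a sparkline plots `arousal` in window order.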

🎯

Retention calibration model

✓ Shipped · ~days

What it is: Replaces the current LLM-only retention guess with a small model calibrated against your channel's real retention curves.
Why it matters: Hook scoring and clip selection lean on predicted retention. Grounding those predictions in actual audience behavior makes every downstream choice sharper — and it improves as more videos accrue.
How it'd work: Use the YouTube Analytics scope (just added for A/B) to pull per-video retention + average-view-percentage into signals, accumulate, then fit a lightweight calibration that corrects the LLM's predicted-retention score toward measured truth.
Effort: Days — needs a data-accumulation window plus a modeling pass. Highest long-term payoff.
Unlocks: Retention scores you can trust; a flywheel that gets more accurate with every upload.
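
One possible form of the lightweight calibration is an ordinary least-squares line mapping the LLM's predicted retention onto measured retention. The sketch below uses that technique as an assumption; the text above does not say which calibration method is used, and the `RetentionSample` and `Calibration` names are hypothetical.

```typescript
// Hypothetical shapes; both values are fractions in the range 0..1.
interface RetentionSample {
  predicted: number; // the LLM's predicted retention
  measured: number; // measured average-view-percentage, as a fraction
}

interface Calibration {
  slope: number;
  intercept: number;
}

const IDENTITY: Calibration = { slope: 1, intercept: 0 };

// Ordinary least squares: measured ≈ slope * predicted + intercept.
function fitCalibration(samples: RetentionSample[]): Calibration {
  const n = samples.length;
  if (n < 2) return IDENTITY; // not enough data yet: leave predictions as-is

  const meanX = samples.reduce((sum, s) => sum + s.predicted, 0) / n;
  const meanY = samples.reduce((sum, s) => sum + s.measured, 0) / n;
  let sxx = 0;
  let sxy = 0;
  for (const s of samples) {
    const dx = s.predicted - meanX;
    sxx += dx * dx;
    sxy += dx * (s.measured - meanY);
  }
  if (sxx === 0) return IDENTITY; // all predictions identical: slope undefined

  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Correct a raw prediction and clamp it to a valid retention fraction.
function calibrate(model: Calibration, predicted: number): number {
  return Math.min(1, Math.max(0, model.slope * predicted + model.intercept));
}
```

Falling back to the identity mapping until enough videos accrue keeps early scores unchanged, which matches the data-accumulation window noted under Effort.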

⚙️

Per-provider concurrency limits

✓ Shipped · ~quick

What it is: A max_concurrent cap per row in llm_providers, enforced by the worker pool.
Why it matters: The queue runs N slots against whatever provider a job resolves to. A rate-limited provider (OpenAI/OpenRouter tiers) gets hammered and starts returning 429s when you raise WORKER_CONCURRENCY.
How it'd work: Add the column to the schema + /providers editor; the provider resolver holds a per-provider semaphore so in-flight requests never exceed the cap, independent of total worker slots.
Effort: Quick — one column, one semaphore.
Unlocks: Safely cranking worker concurrency without tripping provider limits.
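
A per-provider semaphore of the kind described above might be sketched as follows. `max_concurrent` is the column named in the text; the `ProviderRow`, `Semaphore`, and `ProviderLimiter` names and the "no row means no cap" behavior are assumptions for illustration.

```typescript
// Hypothetical row shape; `max_concurrent` is the column described above.
interface ProviderRow {
  id: string;
  max_concurrent: number;
}

// Counting semaphore: at most `limit` holders at once, FIFO waiters.
class Semaphore {
  private limit: number;
  private inFlight = 0;
  private waiters: Array<() => void> = [];

  constructor(limit: number) {
    this.limit = limit;
  }

  async acquire(): Promise<void> {
    if (this.inFlight < this.limit) {
      this.inFlight++;
      return;
    }
    // Wait until release() hands this caller a slot directly.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // slot passes to the next waiter, so inFlight is unchanged
    } else {
      this.inFlight--;
    }
  }
}

class ProviderLimiter {
  private semaphores = new Map<string, Semaphore>();

  constructor(providers: ProviderRow[]) {
    for (const p of providers) {
      this.semaphores.set(p.id, new Semaphore(Math.max(1, p.max_concurrent)));
    }
  }

  // Run `task` while holding one of the provider's slots.
  async run<T>(providerId: string, task: () => Promise<T>): Promise<T> {
    const sem = this.semaphores.get(providerId);
    if (!sem) return task(); // assumption: unknown provider means no cap
    await sem.acquire();
    try {
      return await task();
    } finally {
      sem.release();
    }
  }
}
```

Because the cap lives in the limiter rather than in the worker pool, total worker slots can exceed any single provider's limit: extra jobs for a capped provider simply wait their turn.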

v2 — Scale & Identity

Bigger structural moves.

For when single-operator throughput is no longer the constraint. Larger efforts, taken on when they unblock real volume.

📲

YouTube Direct for Shorts

v2 · ~medium

What it is: Upload Shorts per-clip via the YouTube Data API directly, instead of only routing them through Zernio.
Why it matters: Native uploads give finer control (privacy, scheduling, metadata) and remove a dependency for the YouTube destination specifically.
How it'd work: The dispatch worker fires two dispatches per rendered Short — YouTube Direct for the YouTube target, Zernio for TikTok/Instagram — reusing the per-clip publish options.
Effort: Medium — dual-dispatch logic + quota considerations.
Unlocks: First-class YouTube Shorts publishing under your own OAuth.
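
The dual-dispatch split could be planned with a small pure function like the sketch below. The route names, the `PublishOptions` fields, and `planShortDispatches` are hypothetical; only the routing rule (YouTube Direct for YouTube, Zernio for TikTok/Instagram, shared per-clip options) is taken from the description.

```typescript
// Hypothetical types; only the routing rule comes from the roadmap text.
type Target = "youtube" | "tiktok" | "instagram";
type Route = "youtube_direct" | "zernio";

interface PublishOptions {
  title: string;
  privacy: "public" | "unlisted" | "private";
  scheduledAt?: string; // ISO-8601, optional
}

interface Dispatch {
  route: Route;
  clipId: string;
  targets: Target[];
  options: PublishOptions;
}

// One rendered Short yields up to two dispatches that share publish options.
function planShortDispatches(
  clipId: string,
  targets: Target[],
  options: PublishOptions,
): Dispatch[] {
  const dispatches: Dispatch[] = [];
  if (targets.some((t) => t === "youtube")) {
    dispatches.push({ route: "youtube_direct", clipId, targets: ["youtube"], options });
  }
  const zernioTargets = targets.filter((t) => t !== "youtube");
  if (zernioTargets.length > 0) {
    dispatches.push({ route: "zernio", clipId, targets: zernioTargets, options });
  }
  return dispatches;
}
```

Keeping the plan separate from the upload calls makes the quota question easier to reason about: the number of `youtube_direct` entries is the number of Data API uploads a batch would spend.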

🎬

B-roll insertion

✓ Shipped · done

What it is: Honours the existing b_roll_enabled flag by compositing b-roll into rendered clips.
Why it matters: B-roll lifts retention on talking-head clips, and the existing UI flag now has a real rendered output path.
How it works: The word-snap/b-roll planner resolves clip-local cutaway segments and clip_render composites them through the ffmpeg filter graph.
Effort: Shipped in the vNext batch.
Unlocks: Visually richer clips without manual editing.
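
As a rough illustration of compositing cutaways through an ffmpeg filter graph, the sketch below builds a `-filter_complex` string from clip-local segments using ffmpeg's `setpts` and `overlay` filters. It is not the actual `clip_render` implementation: the `Cutaway` shape and `buildBrollFilterGraph` are hypothetical, and the sketch assumes each b-roll input is already scaled to the clip's frame size and ignores audio.

```typescript
// Hypothetical shape for a resolved, clip-local cutaway segment.
interface Cutaway {
  inputIndex: number; // position of the b-roll file among ffmpeg inputs (main clip is 0)
  startSec: number; // clip-local time at which the cutaway appears
  endSec: number; // clip-local time at which it disappears
}

// Chain one overlay per cutaway; each overlay feeds the next.
function buildBrollFilterGraph(cutaways: Cutaway[]): { filter: string; outputLabel: string } {
  let current = "0:v";
  const parts: string[] = [];
  cutaways.forEach((c, i) => {
    const shifted = `b${i}`;
    const out = `v${i}`;
    // Shift the b-roll's timestamps so its first frame lands at startSec.
    parts.push(`[${c.inputIndex}:v]setpts=PTS-STARTPTS+${c.startSec}/TB[${shifted}]`);
    // Draw it over the running video only between startSec and endSec.
    parts.push(
      `[${current}][${shifted}]overlay=enable='between(t,${c.startSec},${c.endSec})'[${out}]`,
    );
    current = out;
  });
  return { filter: parts.join(";"), outputLabel: current };
}
```

A caller would pass `filter` to `-filter_complex` and map `[outputLabel]` as the video stream; with no cutaways the filter is empty and the main video stream `0:v` is used unchanged.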

🗄️

Object storage (S3 / R2)

v2 · ~medium

What it is: An optional cloud object-storage backend for media, beyond the local NAS / archive export.
Why it matters: Local storage is plenty for v1 throughput; at higher volume an offload tier keeps the master lean.
How it'd work: A storage adapter behind the existing path helpers; the archive worker targets the bucket; mediaUrlFor resolves signed URLs.
Effort: Medium — adapter + path-resolution changes.
Unlocks: Effectively unbounded media retention without local disk pressure.
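
The adapter seam might look something like this sketch. `mediaUrlFor` is named above, but its signature here is an assumption, as are `StorageAdapter`, `LocalAdapter`, `BucketAdapter`, and the injected `Signer`. No real S3 or R2 SDK call is shown; real presigners are typically asynchronous, which this synchronous sketch glosses over.

```typescript
// Hypothetical adapter seam; all names except mediaUrlFor are illustrative.
interface StorageAdapter {
  urlFor(key: string, ttlSec: number): string;
}

// Local NAS / disk: a plain URL under a base path, no signing needed.
class LocalAdapter implements StorageAdapter {
  private baseUrl: string;

  constructor(baseUrl: string) {
    this.baseUrl = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  }

  urlFor(key: string): string {
    const encoded = key.split("/").map((part) => encodeURIComponent(part)).join("/");
    return `${this.baseUrl}/${encoded}`;
  }
}

// The signer is injected so this sketch does not depend on any cloud SDK.
type Signer = (bucket: string, key: string, ttlSec: number) => string;

class BucketAdapter implements StorageAdapter {
  private bucket: string;
  private sign: Signer;

  constructor(bucket: string, sign: Signer) {
    this.bucket = bucket;
    this.sign = sign;
  }

  urlFor(key: string, ttlSec: number): string {
    return this.sign(this.bucket, key, ttlSec);
  }
}

// Assumed signature: callers stay unaware of which backend is active.
function mediaUrlFor(adapter: StorageAdapter, key: string, ttlSec: number = 900): string {
  return adapter.urlFor(key, ttlSec);
}
```

With this shape, switching a brand from local storage to a bucket would be a matter of constructing a different adapter; call sites of `mediaUrlFor` would not change.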

🗣️

Speaker ID by name

v2 · ~large

What it is: Replace generic speaker_01 labels with named identification via a per-brand face/voice index.
Why it matters: Named speakers make transcripts, chapters, and clip captions far more useful for multi-person content.
How it'd work: A per-brand enrollment index keyed off the existing diarization output; match voice/face embeddings to known identities.
Effort: Large — plus storage and privacy considerations.
Unlocks: Named diarization across the whole pipeline.
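
Matching an embedding against an enrollment index is commonly done with cosine similarity and a confidence threshold. The sketch below shows that approach under stated assumptions: the `EnrolledSpeaker` shape, the `nameSpeaker` helper, and the 0.75 threshold are illustrative, and how embeddings are produced is out of scope.

```typescript
// Hypothetical enrollment entry: one reference embedding per known identity.
interface EnrolledSpeaker {
  name: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Return the best-matching enrolled name, or keep the generic diarization
// label (e.g. "speaker_01") when nothing clears the threshold.
function nameSpeaker(
  label: string,
  embedding: number[],
  index: EnrolledSpeaker[],
  threshold: number = 0.75,
): string {
  let bestName = label;
  let bestScore = threshold;
  for (const speaker of index) {
    const score = cosineSimilarity(embedding, speaker.embedding);
    if (score >= bestScore) {
      bestName = speaker.name;
      bestScore = score;
    }
  }
  return bestName;
}
```

Falling back to the generic label rather than guessing keeps a wrong name out of transcripts and captions, which matters given the privacy considerations noted under Effort.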

🔎

GSC article signals

✓ Shipped · done

What it is: Pulls Google Search Console position + page metrics for DojoClaw-published articles into the signals table.
Why it matters: Completes cross-surface performance data — the editorial half of the Helm Signal loop, alongside YouTube/social.
How it works: Brands connect GSC through local OAuth; collect_signal combines DojoClaw article data with clicks, impressions, CTR, and position.
Effort: Shipped in the vNext batch.
Unlocks: Article generation that learns from search performance.
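
When Search Console returns several rows for one article URL (for example one per query), they have to be rolled up before being stored as a signal. The sketch below shows one reasonable roll-up, assuming CTR is recomputed from totals and position is averaged weighted by impressions; `GscRow`, `ArticleSearchSignal`, and `combineGscRows` are hypothetical names, and this is not the actual `collect_signal` code.

```typescript
// Hypothetical shapes mirroring the four metrics listed above.
interface GscRow {
  clicks: number;
  impressions: number;
  position: number; // average position for this row
}

interface ArticleSearchSignal {
  clicks: number;
  impressions: number;
  ctr: number; // clicks / impressions, as a fraction
  position: number; // impression-weighted average position
}

// Roll several rows for one article URL up into a single signal.
function combineGscRows(rows: GscRow[]): ArticleSearchSignal | null {
  let clicks = 0;
  let impressions = 0;
  let weightedPosition = 0;
  for (const row of rows) {
    clicks += row.clicks;
    impressions += row.impressions;
    weightedPosition += row.position * row.impressions;
  }
  if (impressions === 0) return null; // nothing was shown: no usable signal

  return {
    clicks,
    impressions,
    ctr: clicks / impressions,
    position: weightedPosition / impressions,
  };
}
```

Recomputing CTR from the totals avoids averaging per-row ratios, which would overweight rows with very few impressions.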

👥

Local users and brand roles

✓ Shipped · done

What it is: Local dashboard users, signed sessions, and brand-scoped owner/editor/reviewer memberships.
Why it matters: Reviewers and collaborators can participate without weakening worker/API bearer-token control paths.
How it works: Dashboard pages use session auth; protected server actions assert brand roles before approving, dispatching, or changing member state.
Effort: Shipped in the vNext batch.
Unlocks: More than one person running the command center while staying local-first.
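
A role assertion inside a protected server action might be sketched like this. The three role names come from the description above; the ranking reviewer < editor < owner, the `Membership` shape, and `assertBrandRole` are assumptions, since the text does not say which role is required for which action.

```typescript
type BrandRole = "owner" | "editor" | "reviewer";

// Hypothetical membership row linking a user to a brand.
interface Membership {
  userId: string;
  brandId: string;
  role: BrandRole;
}

// Assumed ranking; the roadmap does not define the role hierarchy.
const ROLE_RANK: Record<BrandRole, number> = { reviewer: 1, editor: 2, owner: 3 };

// Throw unless the user holds at least `minimum` on the given brand.
function assertBrandRole(
  memberships: Membership[],
  userId: string,
  brandId: string,
  minimum: BrandRole,
): BrandRole {
  const membership = memberships.find(
    (m) => m.userId === userId && m.brandId === brandId,
  );
  if (!membership) {
    throw new Error(`user ${userId} is not a member of brand ${brandId}`);
  }
  if (ROLE_RANK[membership.role] < ROLE_RANK[minimum]) {
    throw new Error(`role ${membership.role} is below required ${minimum}`);
  }
  return membership.role;
}
```

Calling the assertion at the top of each action, before any state changes, means a missing or insufficient membership fails the request before approval, dispatch, or member edits can run.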

Ideas — unscheduled

Worth doing eventually.

A themed backlog of candidates. Each is tagged grounded (scaffolding already exists in the codebase) or bet (a new product direction). ✅ shipped marks items already built straight from this backlog (extended-network generation, long-clip planning, pinned comments, the unified performance dashboard, DojoClaw article signals, comment mining, brand-voice bootstrap, multi-language subtitles).

Reach multipliers — more output from one video

Idea	What it is	Type	Effort
Generate for the 8 un-wired networks	✅ Shipped. Per-network post generation for Facebook, Pinterest, Bluesky, Threads, Reddit, Telegram, Discord & Google Business — gated by the brand's connected Zernio accounts so nothing un-shippable gets drafted.	✅ shipped	—
Long-clip planning	✅ Shipped. `long_clip_plan` generates horizontal highlight segments; the renderer produces `rendered_long_clip` and dispatch routes it to YouTube via Zernio.	✅ shipped	—
Multi-language subtitles	✅ Shipped. Translate a Short's subtitles to other languages → per-language SRT + ASS sidecar files, reusing the transcript + ASS pipeline. TTS dubbing and a burned-in per-language re-render are deferred.	✅ shipped	—
Quote cards / carousels	✅ Shipped (v1.8). See the v1.8 summary.	✅ shipped	—
Per-platform Short captions	Tailored caption + hashtags per destination. Deferred — captions belong to clips, so better built as a `short_clip_plan` enhancement than standalone asset types.	bet	M

Deeper feedback loop — extend Helm Signal

Idea	What it is	Type	Effort
Comment mining → content loop	✅ Shipped. Post-publish, on-demand: mine a video's top YouTube comments → `content_ideas` + `faq` assets, from the Studio's "Mine comments" panel. (`youtube_pinned_comment` already generates from the analysis; this makes follow-up content audience-driven.)	✅ shipped	—
Best-time-to-post	Learn per-platform optimal windows from the signals already collected; pre-fill the scheduler.	bet	S–M
Unified performance dashboard	✅ Shipped. A new `/performance` route — one cross-surface view of how dispatched/published assets performed (signals + A/B results).	✅ shipped	—
DojoClaw + GSC article signals	✅ Shipped. `collect_signal` combines DojoClaw article analytics with Search Console clicks, impressions, CTR, and position when a GSC connection and published article URL are available.	✅ shipped	—
Prompt-version A/B	✅ Shipped (v1.8). Reported winner on /performance; never auto-pinned.	✅ shipped	—

Quality & trust

Idea	What it is	Type	Effort
Prosodic analysis	✅ Shipped (v1.8). Pure-TS energy/emphasis pass at transcription time fills the scene log for real.	✅ shipped	—
Audio-event detection	Laughter / music / applause (YAMNet on the Neural Engine) — good for podcasts and a cheap music-presence signal.	grounded	M
Brand glossary	✅ Shipped (v1.8). Canonical spellings in transcripts + prompts.	✅ shipped	—
Fact-check / claim guard	✅ Shipped (v1.8). Per-claim verdicts + Studio badge.	✅ shipped	—
Music / copyright detection	Flag clips likely to carry copyrighted audio before non-YouTube syndication (worked through below).	bet	M

Operator & business

Idea	What it is	Type	Effort
Cost tracking & budgets	✅ Shipped (v1.7). Helm Ledger — see the Command Deck guide.	✅ shipped	—
Brand-voice bootstrap	✅ Shipped. `/brands/[id]/voice` seeds `voice_examples` from pasted samples or the brand's existing published assets, so voice is good from upload #1.	✅ shipped	—
Bulk / batch ingest	✅ Shipped (v1.7). /sources/new bulk panel — URLs or local folder, per-line outcomes.	✅ shipped	—
Auto-approve rules	✅ Shipped (v1.7). Per-brand thresholds with a `payload.auto_approved` audit trail.	✅ shipped	—

One idea, worked through in full — the depth each entry above reaches once it's picked up:

🎵

Music / copyright detection

Idea · ~medium

What it is: Flag clips that likely contain copyrighted audio before they syndicate to TikTok / Instagram, and give an early warning before a render + dispatch slot is spent (most useful on ingested third-party sources such as podcasts and webinars).
Why it's only an idea: It can only ever be a risk predictor, not a verdict. The authoritative judge is YouTube Content ID / each platform's fingerprinting, and none is queryable before publishing.
Accuracy ceiling: A fully-local build can detect music presence well (~90%) but not copyright status. Genuine identification needs an opt-in commercial fingerprint API (e.g. ACRCloud, Audible Magic) — accurate (~95%+ on catalog music) but an external-cloud dependency that breaks local-first, and still misses long-tail rights while over-flagging royalty-free.
Already covered: For the YouTube destination, YouTube's own pre-publish Checks run Content ID for free — so the value is narrow (non-YouTube syndication + early warning).
If built: Local music-presence flag by default (advisory "⚠ music at 0:12–0:34"), with an optional fingerprint provider for real identification. Always a risk score, never a green light.
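
To show what the advisory flag could look like in practice, the sketch below merges per-frame music probabilities into spans and formats each one in the style of the example above. The `MusicFrame` input, the 0.5 threshold, and the helper names are assumptions; how the probabilities are produced is not covered, and frames are assumed to be sorted and contiguous.

```typescript
// Hypothetical detector output: one music probability per audio frame.
interface MusicFrame {
  startSec: number;
  endSec: number;
  musicProb: number; // 0..1
}

interface MusicSpan {
  startSec: number;
  endSec: number;
  risk: number; // peak probability inside the span: a risk score, not a verdict
}

// Merge consecutive frames at or above the threshold into spans.
function musicSpans(frames: MusicFrame[], threshold: number = 0.5): MusicSpan[] {
  const spans: MusicSpan[] = [];
  let previousWasMusic = false;
  for (const f of frames) {
    const isMusic = f.musicProb >= threshold;
    const last = spans[spans.length - 1];
    if (isMusic && previousWasMusic && last) {
      last.endSec = f.endSec;
      last.risk = Math.max(last.risk, f.musicProb);
    } else if (isMusic) {
      spans.push({ startSec: f.startSec, endSec: f.endSec, risk: f.musicProb });
    }
    previousWasMusic = isMusic;
  }
  return spans;
}

// Format seconds as m:ss.
function formatClock(sec: number): string {
  const whole = Math.floor(sec);
  const minutes = Math.floor(whole / 60);
  const seconds = whole % 60;
  return `${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}

function formatAdvisory(span: MusicSpan): string {
  return `⚠ music at ${formatClock(span.startSec)}–${formatClock(span.endSec)}`;
}
```

The span keeps its peak probability as `risk`, so the Studio could show the warning with a score attached and never imply that an unflagged clip is cleared.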
