Which LLM for which task

Three provider classes

Everything plugs in the same way.

Each row in the llm_providers table is one of these three types. The provider abstraction (a single chat() method) is identical across all of them — DojoClaw-style.
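
As a rough illustration of what "a single chat() method" can look like, here is a minimal sketch. The source only states that the abstraction is one chat() method; the interface name, the message shape, and the EchoProvider stub are all hypothetical.

```typescript
// Hypothetical shapes: the source only says every provider exposes one chat() method.
type ProviderType = "openai-compatible" | "anthropic" | "codex-cli";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatProvider {
  readonly type: ProviderType;
  chat(messages: ChatMessage[]): Promise<string>;
}

// Stub backend that echoes the last user message; it stands in for a real provider.
class EchoProvider implements ChatProvider {
  readonly type: ProviderType;

  constructor(type: ProviderType) {
    this.type = type;
  }

  async chat(messages: ChatMessage[]): Promise<string> {
    const lastUser = [...messages].reverse().find((m) => m.role === "user");
    return lastUser ? lastUser.content : "";
  }
}

// Call sites depend only on the interface, never on the concrete class.
async function summarize(provider: ChatProvider, text: string): Promise<string> {
  return provider.chat([
    { role: "system", content: "Summarize the input." },
    { role: "user", content: text },
  ]);
}
```

Because summarize() only sees ChatProvider, swapping a local model for a cloud one is a change to the row, not to the call site.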

openai-compatible

The HTTP shape POST /v1/chat/completions spoken by OpenAI, OpenRouter, Ollama, LM Studio, OpenClaw, vLLM, and most local stacks. Set a base URL + (optional) key + model name — done.

Cloud: OpenAI · OpenRouter
Local: LM Studio · Ollama · vLLM · OpenClaw
Best for: the bulk of the pipeline — fast iteration, cheap local runs, easy model swaps.
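
To make "base URL + (optional) key + model name" concrete, the sketch below builds the HTTP request without sending it. It assumes the base URL already ends in /v1 (so only /chat/completions is appended) and that the key, when present, goes in a Bearer Authorization header. The config field names and the localhost URL in the usage note are illustrative, not taken from the codebase.

```typescript
// Hypothetical config shape for one openai-compatible row.
interface OpenAICompatConfig {
  baseUrl: string; // assumed to include the /v1 prefix
  apiKey?: string; // optional: most local servers ignore it
  model: string;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request only; sending it (e.g. with fetch) is left to the caller.
function buildChatRequest(cfg: OpenAICompatConfig, messages: ChatMessage[]): HttpRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (cfg.apiKey) {
    headers["Authorization"] = `Bearer ${cfg.apiKey}`;
  }
  return {
    // Strip trailing slashes so "…/v1/" and "…/v1" produce the same URL.
    url: `${cfg.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers,
    body: JSON.stringify({ model: cfg.model, messages }),
  };
}
```

For a local server, something like buildChatRequest({ baseUrl: "http://localhost:1234/v1", model: "your-local-model" }, messages) would omit the Authorization header entirely. A cloud row would differ only in base URL and key.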

anthropic

Native Claude Messages API. Used directly (not via the OpenAI shape) so you get the full native feature set of Sonnet/Opus: long context, thinking budgets where applicable, top-tier instruction following.

Models: Opus 4.7 · Sonnet 4.6 · Haiku 4.5
Best for: headline copy, article briefs, anything where prose quality matters most.
Premium tier: the obvious pick for premium_multimodal.
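
One visible difference from the OpenAI shape: in the Messages API the system prompt is a top-level field rather than a message, and max_tokens is required. The sketch below shows that translation. The function name, the 1024 default, and the model id passed by the caller are assumptions; no real model identifier is implied.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface AnthropicRequest {
  url: string;
  headers: Record<string, string>;
  body: {
    model: string;
    max_tokens: number;
    system?: string;
    messages: { role: "user" | "assistant"; content: string }[];
  };
}

// Builds a Messages API request from OpenAI-style messages (hypothetical helper).
function buildAnthropicRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  maxTokens = 1024, // assumed default; the API requires some value
): AnthropicRequest {
  const systemParts: string[] = [];
  const turns: { role: "user" | "assistant"; content: string }[] = [];
  for (const m of messages) {
    if (m.role === "system") {
      systemParts.push(m.content); // system text is lifted out of the turn list
    } else {
      turns.push({ role: m.role, content: m.content });
    }
  }
  const body: AnthropicRequest["body"] = { model, max_tokens: maxTokens, messages: turns };
  if (systemParts.length > 0) {
    body.system = systemParts.join("\n\n");
  }
  return {
    url: "https://api.anthropic.com/v1/messages",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
    },
    body,
  };
}
```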

codex-cli

Spawns the local codex CLI as a subprocess (no HTTP, no API key). Useful as a free local fallback or when you want offline operation. Slower than HTTP — best for non-interactive jobs.

Model: whatever Codex CLI is configured for
Best for: background pipeline runs on a Mac with no cloud quota.
Avoid for: Studio Regenerate (the operator is waiting on it).
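
A subprocess-backed provider can be sketched with Node's child_process. This version uses spawnSync, which blocks until the process exits; that suits a background worker and would be a poor fit for an operator-waiting action. The runCli helper, its result shape, and the timeout value are hypothetical, and the exact Codex CLI invocation is left to the caller because it depends on the installed version.

```typescript
import { spawnSync } from "node:child_process";

interface CliResult {
  ok: boolean;
  output: string; // stdout on success, error detail on failure
}

// Runs a CLI to completion and captures its output (hypothetical helper).
function runCli(command: string, args: string[], timeoutMs = 120_000): CliResult {
  const result = spawnSync(command, args, { encoding: "utf8", timeout: timeoutMs });
  if (result.error) {
    // Covers "binary not found" and timeouts.
    return { ok: false, output: result.error.message };
  }
  if (result.status !== 0) {
    return { ok: false, output: result.stderr };
  }
  return { ok: true, output: result.stdout.trim() };
}

// Assumed usage: something like runCli("codex", ["exec", prompt]), if your
// installed Codex CLI offers that non-interactive subcommand.
```

Returning a result object instead of throwing lets the caller decide whether a failed local run should fall through to another provider.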

How the pick happens

One resolver, every call site.

Every LLM call — wherever it originates — goes through complete({ profile, … }) in workers/integrations/lm_studio.ts, which calls getProvider(profile) in workers/integrations/llm/get_provider.ts. The selection rule (#17, locked) is below — eligibility is strict, scoring is deterministic.

Where the calls come from

Six call sites, two execution contexts, one resolver. The background pipeline enqueues jobs; the Studio runs the LLM synchronously inside a Server Action (the documented carve-out in CLAUDE.md) so the operator isn't waiting on a worker to be alive.

Six call sites · one complete() · one getProvider() · three provider classes. The Studio's three sync Server Actions are the documented "LLM in a Server Action" carve-out — bounded, text-only, operator-waiting.

selectProvider is a pure, exported, unit-tested function. The env fallback (and one-time seed) is what makes the "zero-config first run" work before you have added any providers.

Practical implication: if you want a single model to handle everything, set its purpose to all. If you want a different model per tier, create three rows with purposes fast_audio_only / standard_audio_visual / premium_multimodal. A row tagged with one specific profile will only serve that profile — it won't quietly bleed into others.

Two categories, one table

Text providers for chat. Image providers for thumbnails.

The llm_providers table carries a category column: text (chat/LLM — everything above, the default) and image (text-to-image, for AI thumbnail generation). Same table, same /providers editor (a Text/Image toggle), encrypted keys at rest — but two disjoint resolvers that never cross. An image provider is never picked for chat; an LLM is never picked to paint a thumbnail.

Two resolvers over the same table, filtered by category. getImageProvider reuses the exact same selectProvider() scoring as the LLM path — only the candidate rows differ. is_default is scoped per category.

How image resolution works

getImageProvider(purpose) loads only category='image' rows, then runs the same selectProvider() used for LLMs: exact purpose match (3) → all (2) → is_default (1), tiebreak is_default DESC then id ASC. So a per-profile image provider works identically to a per-profile LLM. Returns null when no image row exists.

Resolver: workers/integrations/image/get_image_provider.ts
Scoring: the shared selectProvider() from the LLM path
Never crosses: getProvider filters category='text'; getImageProvider filters category='image'
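
The scoring above can be sketched as a pure function. This is an illustration, not the real selectProvider: the row shape is hypothetical, and it assumes the is_default score (1) applies only to rows with no specific purpose, so a row tagged for one profile is never eligible for another.

```typescript
// Hypothetical subset of an llm_providers row.
interface ProviderRow {
  id: number;
  category: "text" | "image";
  purpose: string | null; // null = untagged (assumption)
  is_default: boolean;
}

// 3 = exact purpose, 2 = "all", 1 = untagged default, 0 = not eligible.
function scoreRow(row: ProviderRow, purpose: string): number {
  if (row.purpose === purpose) return 3;
  if (row.purpose === "all") return 2;
  if (row.purpose === null && row.is_default) return 1;
  return 0;
}

function selectProvider(rows: ProviderRow[], purpose: string): ProviderRow | null {
  const eligible = rows
    .map((row) => ({ row, score: scoreRow(row, purpose) }))
    .filter((c) => c.score > 0);
  eligible.sort(
    (a, b) =>
      b.score - a.score || // highest score first
      Number(b.row.is_default) - Number(a.row.is_default) || // is_default DESC
      a.row.id - b.row.id, // id ASC
  );
  return eligible[0]?.row ?? null;
}

// The two resolvers differ only in which rows they pass in.
function selectForCategory(
  rows: ProviderRow[],
  category: "text" | "image",
  purpose: string,
): ProviderRow | null {
  return selectProvider(rows.filter((r) => r.category === category), purpose);
}
```

Filtering by category before scoring is what keeps an image row from ever winning a chat call, whatever its purpose or default flag.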

runware

The first (and today only) image provider type. Text-to-image over Runware's HTTP REST API (fetch, no SDK), serving Flux / Z-Image models like runware:z-image@turbo. Add it at /providers with the Image-category Runware preset.

Base URL: https://api.runware.ai/v1
Model: runware:z-image@turbo (or runware:100@1, …)
Call site: the thumbnail_concepts worker — the only image-provider consumer

Optional by design: with no image provider configured, thumbnail_concepts falls back to frame extraction — stills pulled from the video at the best hook timestamps. Configure Runware only when you want AI-painted thumbnails. The LLM still writes the image-gen prompt either way (that's a category='text' call); the image provider only renders it.
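
The fallback reads naturally as a branch on a nullable provider. A minimal sketch, with hypothetical type and function names; the real thumbnail_concepts worker may structure this differently.

```typescript
// Hypothetical result of resolving an image provider: null when no image row exists.
interface ImageProvider {
  model: string;
}

type ThumbnailPlan =
  | { mode: "ai_render"; model: string; prompt: string }
  | { mode: "frame_extraction"; timestamps: number[] };

// The prompt is always written by the text LLM; the image provider only decides
// whether it gets rendered or whether stills are pulled from the video instead.
function planThumbnails(
  provider: ImageProvider | null,
  prompt: string,
  hookTimestamps: number[],
): ThumbnailPlan {
  if (provider === null) {
    return { mode: "frame_extraction", timestamps: hookTimestamps };
  }
  return { mode: "ai_render", model: provider.model, prompt };
}
```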

Task × provider matrix

What each task actually needs.

All LLM calls go through complete() from one of a handful of entry points: analyze_intelligence (the pipeline brief), generate.ts::generateAssetContent (every asset + the Studio Regenerate / per-section Generate buttons), and generateClipDescription (the Shorts editor's per-clip description generator). The ratings show fit per provider class: ○ and ★★ mark what works well enough, ★★★ marks what is worth paying for.

Task	Where it runs	Profile in	OpenAI-compatible (local)	OpenAI-compatible (cloud)	Anthropic	Codex CLI
analyze_intelligence (topics · hooks · retention from scene log)	worker	package.profile	★★ Qwen3-32B / Llama-3.3-70B	★★ gpt-4o-mini · Sonnet 4.6 via OR	★★★ Claude Sonnet 4.6	○ slow but works
generate_asset: youtube_title_set (5 hook variants, scored 0–100)	worker + Studio	package.profile	○ Qwen3-32B usable	★★★ gpt-4o · Sonnet	★★★ Claude Sonnet/Opus	— too slow for taste-driven copy
generate_asset: youtube_description (hook → body → chapters → CTA → hashtags)	worker + Studio	package.profile	★★ Qwen3-32B	★★ gpt-4o-mini	★★★ Sonnet (best chapter timing)	○ ok for batch backfill
generate_asset: youtube_tags (scored pills)	worker + Studio	package.profile	★★ any local 7B+ works	★★ gpt-4o-mini	★★ Haiku 4.5 (fast + cheap)	○ fine
generate_asset: linkedin_post · x_thread (platform-shaped copy)	worker	package.profile	○ local can sound generic	★★ gpt-4o	★★★ Sonnet 4.6	— skip
generate_asset: article_brief (→ DojoClaw)	worker	package.profile	○ depends on local model	★★ gpt-4o · OR/Sonnet	★★★ Opus 4.7 if budget allows	○ workable
generate_asset: short_clip_plan / long_clip_plan (structured JSON: captions + cuts)	worker	package.profile	★★ Qwen3-32B (great at JSON)	★★ gpt-4o-mini	★★ Haiku 4.5 (fast)	○ ok
thumbnail_concepts (concept + image-gen prompt as text; rendered via the image provider)	worker	package.profile	★★ Qwen3-32B	★★ gpt-4o-mini	★★★ Sonnet (best at visual ideation)	○
Studio · Regenerate (single asset, ~10–30s, operator waiting)	server action	package.profile	○ only if local model is loaded + warm	★★★ gpt-4o (low latency)	★★★ Haiku 4.5 or Sonnet 4.6	— too slow, kills the UX
Studio · Generate, per-section (title / description / tags from transcript)	server action	package.profile	○ same as Regenerate	★★★ gpt-4o-mini · Haiku	★★★ Haiku 4.5 (best latency/price)	— skip
Shorts · generateClipDescription (per-clip post body, ≤280 chars, auto-fired)	server action	package.profile	★★ short + bounded, local is fine	★★★ gpt-4o-mini (cheap, snappy)	★★★ Haiku 4.5 (best latency/price)	— too slow for an editor auto-fire

★★★ = the obvious pick · ★★ = solid · ○ = workable but not first choice · — = don't.

Pick one model per profile

Three tiers, three opinions.

Profiles are how you trade quality for cost and latency. Set the profile on the brand (default) or per package — every LLM call in that package then follows it.
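
The brand-default-with-package-override rule is a one-line fallback. A minimal sketch, assuming hypothetical field names (defaultProfile on the brand, an optional profile on the package); the real schema may differ.

```typescript
type Profile =
  | "transcription_only"
  | "fast_audio_only"
  | "standard_audio_visual"
  | "premium_multimodal";

// Hypothetical shapes: only the fields needed for profile resolution.
interface Brand {
  defaultProfile: Profile;
}

interface ContentPackage {
  profile?: Profile; // unset means "inherit from the brand"
}

// A package-level profile wins; otherwise the brand default applies.
function resolveProfile(brand: Brand, pkg: ContentPackage): Profile {
  return pkg.profile ?? brand.defaultProfile;
}
```

Every LLM call in the package would then pass the resolved value as its profile, which is why changing it in one place shifts the whole package to a different tier.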

Solid arrows: exact purpose match (score 3). Dashed: fallthrough to a purpose="all" / is_default row when no exact row exists. A row tagged with one profile never bleeds into another.

transcription_only

Cheapest tier — Backlog Revival re-mining

Audio-only, no visual phase at all. Built for re-mining old material under current prompts (Backlog Revival). Same shape as fast_audio_only today but kept distinct so the two can diverge. Pick the cheapest fast model.

Anthropic · Haiku 4.5
OpenAI · gpt-4o-mini
LM Studio · Qwen3-8B/14B

fast_audio_only

Audio-only, batch volume, cost-sensitive

Podcasts, webinars, daily uploads where you just need decent titles + description + tags. No visual pipeline. Aim for cheap + fast over polish.

Anthropic · Haiku 4.5
OpenAI · gpt-4o-mini
LM Studio · Qwen3-8B/14B

standard_audio_visual

The default for YouTube long-form

Full pipeline: audio + visual + fusion + intelligence + clip plans + thumbnails. You want this to feel like a thoughtful human draft. Latency: minutes per package is fine, but Studio actions need to respond in seconds.

Anthropic · Sonnet 4.6 (recommended)
OpenAI · gpt-4o
LM Studio · Qwen3-32B at 16k+ ctx

premium_multimodal

Marquee episodes, deep multi-platform plans

Big interviews, paid sponsorships, the videos you want featured in every channel. Maximum reasoning + nuance per call; budget is not the constraint.

Anthropic · Opus 4.7 (recommended)
OpenAI · gpt-4o (long ctx)
LM Studio · Qwen3-32B (if offline-required)

all

One model for everything (simplest setup)

Skip the tiering. A single purpose="all" provider handles every pipeline run + every Studio action. Best while you're still finding what you like.

Anthropic · Sonnet 4.6 (best all-rounder)
OpenAI · gpt-4o

Mixed setup pattern: a lot of operators run local for the pipeline, cloud for the Studio. Tag the local LM Studio row with standard_audio_visual (cheap batch drafts), and add a Haiku/Sonnet row tagged all — the Studio's profile="all" calls then go to Anthropic for snappy responses, while bulk pipeline jobs stay local.

Quick presets

Three setups, ranked by setup time.

All three are valid. Pick the one that matches what you've already paid for and how much fiddling you want to do.

① All Anthropic

~2 minutes

One row, purpose all, Sonnet 4.6, API key. Done. Pipeline + Studio all go to Claude. The best default if you don't want to think about it.

② Tiered Anthropic

~5 minutes

Three rows: Haiku → fast_audio_only, Sonnet → standard_audio_visual, Opus → premium_multimodal. Quality scales with package profile; cost stays in check.

③ Local + cloud

~15 minutes

LM Studio (Qwen3-32B at 16k ctx) for the pipeline, Anthropic Haiku for Studio actions. Cheapest steady state on a Mac Studio; needs LM Studio loaded with enough context.

Configure it.

The Providers settings page lets you add, test, and tag rows per purpose. API keys are encrypted at rest and never sent back to the browser.

⚙ Open /providers
