Script import
Paste or type a script in a serif editor with generous spacing. An empty-state format guide shows the syntax inline so new users learn it without leaving the screen.
flexVox ships free with the complete script-to-audio workflow. Studio unlocks the professional edge — Auto-Cast, dialogue batching, background music, all export presets, AI script writing, cast versioning, and Shows.
A script-first workflow. The parser sees speakers, SFX, music, and scene tags from your raw paste — and tells you which lines it wasn't sure about.
Recognizes HOST: dialogue, [Host] dialogue, (Host) dialogue, standalone name lines, [SFX], [Music], [SCENE], [CHAPTER], and inline modifiers like @underlay, volume=, loop, and influence=.
Every detected attribution has a confidence score. Low-confidence turns are highlighted so you only review what needs review.
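As a rough illustration of the syntax and confidence scoring described above, here is a minimal Python sketch. It is not flexVox's parser: the regular expressions, the confidence values, the `Turn` type, the review threshold, and the placement of modifiers inside the bracket are all assumptions made for the example, and standalone name lines are not handled.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Turn:
    kind: str                 # "speech", "sfx", "music", "scene", "chapter", ...
    speaker: Optional[str]
    text: str
    confidence: float         # 0.0 to 1.0; the values below are illustrative
    modifiers: dict = field(default_factory=dict)


# Reserved bracket tags such as [SFX: ...] or [SCENE: ...].
TAG = re.compile(r"^\[(SFX|MUSIC|SCENE|CHAPTER|ACT|PART)\b:?\s*(.*?)\]\s*$", re.I)

# Speaker patterns, ordered from most to least certain (hypothetical scores).
SPEAKER_PATTERNS = [
    (re.compile(r"^([A-Z][A-Z0-9 _.'-]{0,30}):\s+(.+)$"), 0.95),   # HOST: dialogue
    (re.compile(r"^\[([^\]]+)\]\s+(.+)$"), 0.90),                  # [Host] dialogue
    (re.compile(r"^\(([^)]+)\)\s+(.+)$"), 0.85),                   # (Host) dialogue
    (re.compile(r"^([A-Za-z][\w .'-]{0,30}):\s+(.+)$"), 0.60),     # Host: dialogue
]

VALUE_MOD = re.compile(r"(volume|influence)=(\d+(?:\.\d+)?)")
REVIEW_THRESHOLD = 0.7  # assumed cutoff for highlighting a turn


def split_modifiers(body: str):
    """Separate inline modifiers from the descriptive text of a tag."""
    words, mods = [], {}
    for token in body.split():
        if token == "@underlay":
            mods["underlay"] = True
        elif token == "loop":
            mods["loop"] = True
        elif (m := VALUE_MOD.fullmatch(token)):
            mods[m.group(1)] = float(m.group(2))
        else:
            words.append(token)
    return " ".join(words), mods


def parse_line(line: str) -> Turn:
    line = line.strip()
    tag = TAG.match(line)
    if tag:
        text, mods = split_modifiers(tag.group(2))
        return Turn(tag.group(1).lower(), None, text, 1.0, mods)
    for pattern, confidence in SPEAKER_PATTERNS:
        m = pattern.match(line)
        if m:
            return Turn("speech", m.group(1).strip(), m.group(2).strip(), confidence)
    # No recognizable attribution: keep the text and flag it for review.
    return Turn("speech", None, line, 0.3)


def needs_review(turn: Turn) -> bool:
    return turn.confidence < REVIEW_THRESHOLD
```

Ordering the patterns by certainty means an ambiguous line such as `Note: bring snacks` still parses, but lands below the threshold and gets flagged rather than silently accepted.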
Batch-assign unreviewed turns, merge duplicate speakers, and give each speaker a color badge for visual distinction.
Group turns under [SCENE: ...] headers and chapters with [CHAPTER:], [ACT:], or [PART:]. Move turns between scenes; deleting a scene preserves its turns.
Add new dialogue, SFX, or music turns anywhere — assigned to any speaker or scene from the review screen.
A keyboard accessory toolbar one-taps speaker prefixes, SFX/music tags, scene markers, and the full expression-tag menu organized by category.
Browse, audition, and assign distinct voices to every character — then tune them per-speaker.
Search and filter the ElevenLabs catalog by name, category, or type. Paginated, with inline play buttons on every row.
Stays open while you cast. Shows progress (e.g., "3 of 5 cast") and auto-advances to the next unvoiced speaker.
Auto-Cast picks voices for every unvoiced speaker in one tap. It analyzes each speaker's role to choose a best-match voice and leaves manual assignments untouched.
Sliders for stability, similarity boost, style exaggeration, and speed. Speaker boost toggle for supported models. v3 models get a simplified panel.
Inline play/stop on every voice row. Audition any voice instantly before assigning.
On any voice, "Find Similar Voices" analyzes the preview audio and returns related voices from the catalog.
Cast versioning auto-snapshots the cast before every generation. Take manual snapshots, restore with one tap, and prune to the latest 10.
Every audio asset remembers which voice produced it. Recasting a speaker shows a "RECAST" badge on every affected turn — and "Regenerate Changed" updates them all at once.
Per-project text aliases or phoneme overrides (IPA / Arpabet). Syncs to ElevenLabs before generation.
flexVox calls ElevenLabs to produce every line — and keeps going when individual turns fail.
Each speech turn uses its assigned voice with bidirectional context (previous and next request IDs) for natural voice continuity. Forced alignment runs automatically to produce word-level timing.
Dialogue batching groups up to 10 voices and 2,000 characters into a single multi-speaker call for more natural conversational flow. Larger scripts auto-split across batches.
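The splitting rule can be pictured with a small greedy sketch in Python. The two limits come from the text above; everything else is an assumption for illustration, including the `batch_turns` name, the `(voice, text)` tuple shape, and the choice to let a single oversized turn occupy a batch on its own.

```python
def batch_turns(turns, max_voices=10, max_chars=2000):
    """Greedily pack consecutive (voice, text) turns into batches.

    A new batch starts whenever adding the next turn would exceed either
    the distinct-voice limit or the character limit.
    """
    batches, current, voices, chars = [], [], set(), 0
    for voice, text in turns:
        candidate_voices = voices | {voice}
        over_limit = (
            len(candidate_voices) > max_voices
            or chars + len(text) > max_chars
        )
        if current and over_limit:
            batches.append(current)
            current, candidate_voices, chars = [], {voice}, 0
        current.append((voice, text))
        voices = candidate_voices
        chars += len(text)
    if current:
        batches.append(current)
    return batches
```

Keeping turns in script order, rather than repacking them for the tightest fit, preserves the conversational sequence within each call.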
Text-prompt SFX with duration, looping, and prompt-influence controls. Mark an effect as a seamless ambient loop via the context menu or the loop modifier.
Text-prompt music up to 600 seconds. Optional instrumental-only flag.
Generate a background music track sized to your finished episode in a second pass, so its duration always matches.
Generation continues when individual turns fail. Network errors and rate limits are retried automatically with exponential backoff. Failed turns get a "FAILED" badge with error detail.
Cancel anytime — already-generated audio is preserved. Resume picks up where you left off, skipping completed turns.
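A minimal Python sketch of that failure handling, under stated assumptions: `generate` stands in for whatever call produces audio, `TransientError` is a hypothetical marker for network and rate-limit errors, and the attempt count and one-second base delay are invented for the example.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (network, rate limit)."""


def generate_all(turns, generate, completed=frozenset(),
                 max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Generate every (turn_id, text) pair, continuing past failures.

    Turns whose id is in `completed` are skipped, which models resuming
    a cancelled run. Returns (results, failed) dictionaries keyed by id.
    """
    results, failed = {}, {}
    for turn_id, text in turns:
        if turn_id in completed:
            continue
        for attempt in range(max_attempts):
            try:
                results[turn_id] = generate(text)
                break
            except TransientError as exc:
                if attempt == max_attempts - 1:
                    failed[turn_id] = str(exc)      # retries exhausted
                else:
                    sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            except Exception as exc:
                failed[turn_id] = str(exc)          # not retryable
                break
    return results, failed
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; the `failed` dictionary is what a "FAILED" badge with error detail could be driven from.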
Every generated file is checked for corruption, silence, and minimum duration. Issues get a warning badge in post-production.
No API key? flexVox returns silent placeholder audio with realistic durations so you can explore every screen and feature first.
Fix one line. Keep everything else. The whole point of working in a tool that thinks in turns.
Play each turn individually. Simple and Advanced view modes. Mini timeline with an interactive scrub bar at the top.
Swipe or use the context menu on any turn to regenerate it. The new take saves as an additional variant, and bidirectional voice continuity is preserved.
Multiple takes per turn. Browse, play, pick the best, delete the rest.
Project-wide default pause plus per-turn custom pauses (0 to 10 seconds). Toolbar button works in Simple and Advanced modes.
Mute any turn out of the final mix without deleting it. Excluded turns show with strikethrough and can be toggled back in.
Mark any music or SFX to play under dialogue with per-turn volume. Toggle from the context menu or write @underlay directly in the script.
Underlay and background music duck under dialogue and ramp back during pauses. Depth, attack, and release controls — plus LUFS-aware mode that picks ideal levels automatically.
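To make the depth, attack, and release controls concrete, here is a small Python sketch of a ducking gain envelope. It is an illustration only: the frame-based model, the linear-in-decibel ramp, and the default values are assumptions, not a description of how flexVox mixes audio.

```python
def duck_envelope(dialogue_active, depth_db=-12.0,
                  attack_frames=5, release_frames=20):
    """Return one linear gain per frame for the music track.

    dialogue_active: iterable of booleans, True while someone is speaking.
    depth_db:        how far the music is lowered under dialogue.
    attack_frames:   frames taken to reach full depth once speech starts.
    release_frames:  frames taken to return to full level once it stops.
    """
    attack_step = abs(depth_db) / attack_frames
    release_step = abs(depth_db) / release_frames
    gain_db, gains = 0.0, []
    for active in dialogue_active:
        if active:
            gain_db = max(depth_db, gain_db - attack_step)   # duck down
        else:
            gain_db = min(0.0, gain_db + release_step)        # ramp back
        gains.append(10 ** (gain_db / 20))                    # dB to linear
    return gains
```

A short attack and a longer release is the usual pairing: the music gets out of the way quickly when speech begins and returns gradually in the pause.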
Flag turns for later review from post-production or playback. Press F on a keyboard or tap the flag icon. Persists across sessions.
Edit a turn's dialogue text directly. Save and optionally regenerate immediately.
Hear it. Watch it. Send it.
Concatenate all active turns into M4A/AAC. Four tracks: dialogue, crossfade overlay, underlay (ducked), and background music. Peak normalization per track.
The Export tab is a live teleprompter. Active turn auto-scrolls into view; individual words highlight in real time when alignment data is available. Tap any line to seek there.
A compact play bar appears on the Script and Production tabs when audio is ready. Tap to jump to Export.
Audition the first ~2,000 characters of dialogue with assigned voices before committing to a full run. Segment timeline shows colored bars per speaker.
Spotify, Apple Podcasts, YouTube, Broadcast, or Custom LUFS (-30 to -6). Each preset has its own target loudness and use-case description.
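Loudness presets boil down to one subtraction: the gain to apply is the target loudness minus the measured loudness. The Python sketch below shows that arithmetic. The preset numbers are commonly cited platform targets and are an assumption here, since the values flexVox actually uses are not stated; only the Custom range of -30 to -6 LUFS comes from the text above.

```python
# Commonly cited targets; NOT confirmed as flexVox's preset values.
PRESETS_LUFS = {
    "Spotify": -14.0,
    "Apple Podcasts": -16.0,
    "YouTube": -14.0,
    "Broadcast": -23.0,   # EBU R 128 programme loudness
}
CUSTOM_RANGE = (-30.0, -6.0)  # from the documented Custom LUFS range


def target_lufs(preset, custom=None):
    """Resolve a preset name (or a clamped custom value) to a target."""
    if preset == "Custom":
        if custom is None:
            raise ValueError("Custom preset needs a LUFS value")
        low, high = CUSTOM_RANGE
        return min(high, max(low, custom))
    return PRESETS_LUFS[preset]


def gain_db(measured_lufs, preset, custom=None):
    """Gain in dB that moves the measured loudness onto the target."""
    return target_lufs(preset, custom) - measured_lufs
```

For example, a mix measured at -20 LUFS needs +6 dB to reach a -14 LUFS target.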
SRT, VTT, JSON, and plain-text exports from the same sheet as audio export.
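For readers unfamiliar with the subtitle formats, this Python sketch shows how timed turns map onto SRT. The `(start, end, text)` cue shape and function names are assumptions for the example; the SRT layout itself (a counter, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, the text, a blank line) is standard.

```python
def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm (SRT uses a comma before millis)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{millis:03}"


def to_srt(cues):
    """Render (start_seconds, end_seconds, text) cues as an SRT document."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


print(to_srt([(0.0, 2.4, "HOST: Welcome back."),
              (2.4, 5.0, "GUEST: Thanks for having me.")]))
```

WebVTT is close to the same shape: the file starts with a `WEBVTT` header line and timestamps use a period instead of a comma before the milliseconds.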
Define the cast once. Use it in every episode.
Create shows with a persistent cast, format, tone, and narrator mode that carry across episodes. Episodes track season and episode numbers.
Recurring and guest cast members with character profiles (role, age, bio, personality, speaking style) and optional voice assignments. New episodes inherit it all.
Define a show's segment structure (Intro, Main Topic, Q&A, Outro). Segments are reorderable with descriptions.
Intro music, outro music, and transitions from the Sound Library carry across every episode.
Long-press any project → "Create Show from This." Speakers, voice assignments, and character profiles carry forward to a new production.
Describe an episode. Get a production-ready draft. flexVox knows the v3 expression vocabulary.
Configure both AI providers (OpenAI and Claude) and switch between them at generation time from the inline picker. The cost estimate updates live.
Pass per-speaker bio, role, personality, and speaking style as context so generated scripts produce distinct, consistent voices.
Upload text references (articles, research, outlines) as system context — used to inform without being copied verbatim.
The engine is trained on the full ElevenLabs v3 vocabulary: 62 tags across four categories. It stacks tags for complex emotions, builds emotional arcs, and can invent custom descriptive tags.
Add tags to existing dialogue using Apple Intelligence on-device or OpenAI/Claude fallback. Context-aware — surrounding turns inform suggestions. Review each before applying.
Narrator mode can be Full Cast, Single Narrator, or Narrator + Cast, and it affects both AI script generation and voice mapping.
Prefer a different AI? Copy the pre-formatted prompt to clipboard, paste into any tool, paste the result back.
Reusable SFX and music across every project.
Import M4A, MP3, WAV, or AIFF sound effects and music files into a global library that persists across projects.
Comma-separated keywords per asset. Search by name or tag, filter by category.
Pull library assets into SFX and music turns in post-production. The original library asset stays available for reuse.
A real iOS app. Built with SwiftUI and SwiftData. Keychain-secured. No web views.
Script, Production, Export — always accessible. Readiness indicators show what's needed next without locking you out.
Cmd+K opens a searchable palette across actions, projects, and shows. Arrow-key navigation; Enter to act.
Tab navigation (Cmd+1/2/3), playback (Space, arrows), regenerate (Cmd+R), mute (M), open voice mapping (Cmd+P), export (Cmd+E).
Contextual responses throughout — light taps for selections, medium impacts for state changes, slider ticks, and notification haptics for milestones and errors.
Ephemeral feedback messages — info, success, warning, error — with VoiceOver accessibility.
Settings use NavigationSplitView on iPad. Seven categories in three groups: Account, Defaults, System. iOS-style colored icons with status subtitles.
If the local SwiftData store is found to be corrupted at launch, the app attempts recovery in three tiers (normal open, delete-and-retry, in-memory fallback) and notifies you.
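The three tiers read naturally as a fallback chain. The sketch below expresses that control flow in Python rather than Swift, purely as an illustration: `open_disk`, `delete_disk`, and `open_memory` are hypothetical callables standing in for the real store operations, and the returned label is what a user notification could be based on.

```python
def open_store(open_disk, delete_disk, open_memory):
    """Try each recovery tier in order and report which one succeeded."""
    # Tier 1: open the existing on-disk store normally.
    try:
        return open_disk(), "normal"
    except Exception:
        pass
    # Tier 2: delete the damaged store, then retry with a fresh one.
    try:
        delete_disk()
        return open_disk(), "recreated"
    except Exception:
        pass
    # Tier 3: fall back to a store that lives only in memory.
    return open_memory(), "in-memory"
```

Returning the tier alongside the store lets the caller tell the user whether data was reset or whether the current session will not persist.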
Free download. Free demo mode. Studio unlocks the rest when you're ready.