File transcription
Drop in audio or video files, transcribe them in batches, export TXT or SRT.
File transcription works on pre-recorded files (interviews, podcasts, meetings, voice memos) rather than live dictation. It uses the same transcription engine and AI Enhancement options, applied to a queue of files.
Requires a paid license. The window opens on the free tier but is read-only — action buttons are disabled and a banner explains why. See Settings → License.
Supported formats
| Type | Extensions |
|---|---|
| Audio | .mp3, .wav, .m4a, .flac, .aac, .aiff |
| Video | .mp4, .mov |
For video, audio tracks are extracted automatically.
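As a rough illustration of the table above, a format check could look like the sketch below. The function name `classify_media` and the case-insensitive matching are assumptions made for the example, not a description of how Whiskers validates files internally.

```python
from pathlib import Path

# Extensions taken from the "Supported formats" table.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".aac", ".aiff"}
VIDEO_EXTENSIONS = {".mp4", ".mov"}


def classify_media(path: str) -> str:
    """Return "audio", "video", or "unsupported" based on the file extension.

    Hypothetical helper: matching is case-insensitive here, which is an
    assumption. It only looks at the name, so a corrupted file with a
    valid extension would still pass this check.
    """
    suffix = Path(path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return "unsupported"
```

A file classified as "video" would need its audio track extracted before transcription; one classified as "unsupported" would be rejected.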
Adding files
Open File Transcription from the menu bar or Settings → File Transcription. Then either:
- Drag files from Finder onto the drop zone, or
- Click Browse Files.
Figure: File Transcription window with a drag-and-drop zone and a queue of three pending files.
Invalid files (unsupported format, corrupted) are rejected with an error.
Processing
Each file moves through:
- Pending — queued.
- Extracting audio — video only.
- Chunking — voice activity detection (VAD) finds natural speech segments in long files.
- Transcribing — converting each chunk.
- Enhancing — optional AI cleanup.
- Completed.
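The stage list above can be modeled as a simple ordered sequence. The sketch below is only an illustration: `Stage` and `planned_stages` are hypothetical names, and it assumes Chunking always runs, since the page does not say whether short files skip it.

```python
from enum import Enum


class Stage(Enum):
    PENDING = "Pending"
    EXTRACTING_AUDIO = "Extracting audio"
    CHUNKING = "Chunking"
    TRANSCRIBING = "Transcribing"
    ENHANCING = "Enhancing"
    COMPLETED = "Completed"


def planned_stages(is_video: bool, enhance: bool) -> list[Stage]:
    """Return the stages a file would pass through, in order.

    Assumption: Chunking is always included; only audio extraction
    (video only) and enhancement (optional) are conditional.
    """
    stages = [Stage.PENDING]
    if is_video:
        stages.append(Stage.EXTRACTING_AUDIO)
    stages += [Stage.CHUNKING, Stage.TRANSCRIBING]
    if enhance:
        stages.append(Stage.ENHANCING)
    stages.append(Stage.COMPLETED)
    return stages
```

For example, an audio file with AI Enhancement turned off would go Pending, Chunking, Transcribing, Completed.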
For recordings longer than about 30 seconds, Whiskers splits the audio at natural pauses rather than at fixed time intervals, so sentence boundaries stay intact.
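To show the idea of pause-based chunking, here is a minimal sketch that groups speech segments into chunks and only ever cuts in the gap between two segments. The function `chunk_at_pauses`, the `(start, end)` input shape, and the 30-second chunk limit are assumptions for illustration; the actual chunk size and VAD output used by Whiskers are not documented here.

```python
def chunk_at_pauses(segments, max_chunk_seconds=30.0):
    """Group (start, end) speech segments into chunks, cutting only at pauses.

    `segments` is assumed to be a time-ordered list of speech spans, in
    seconds, as a VAD might produce. A chunk grows until adding the next
    segment would exceed `max_chunk_seconds`; the cut then falls in the
    silence before that segment. A single segment longer than the limit
    is kept whole rather than split mid-speech.
    """
    chunks = []
    chunk_start = None
    chunk_end = None
    for start, end in segments:
        if chunk_start is None:
            chunk_start, chunk_end = start, end
        elif end - chunk_start <= max_chunk_seconds:
            chunk_end = end  # still fits: extend the current chunk
        else:
            chunks.append((chunk_start, chunk_end))
            chunk_start, chunk_end = start, end
    if chunk_start is not None:
        chunks.append((chunk_start, chunk_end))
    return chunks


speech = [(0.0, 12.0), (12.8, 25.0), (26.5, 41.0), (42.0, 50.0)]
print(chunk_at_pauses(speech))  # [(0.0, 25.0), (26.5, 50.0)]
```

A fixed-interval split at 30 seconds would have landed inside the third segment; the pause-based version cuts in the silence between 25.0 and 26.5 seconds instead.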
Configuration
Top of the window:
- Model — local (Parakeet/Whisper) or cloud (Groq/OpenAI/Deepgram/Google). Local is private and free; cloud is often faster for long files.
- Language — explicit for best accuracy, or auto-detect for mixed-language audio.
Exporting
| Format | Extension | Available when |
|---|---|---|
| Plain text | .txt | Always |
| Subtitles | .srt | Whisper local, Groq (timed), Deepgram (timed), OpenAI whisper-1 |
Parakeet, Google Gemini, and OpenAI's gpt-4o-transcribe variants don't return per-segment timestamps, so SRT export isn't available for them.
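The dependency on timestamps is easy to see in the SRT format itself: every cue needs a start and end time. The sketch below builds SRT text from timed segments. The helper names and the `(start, end, text)` tuple shape are assumptions for the example, not the app's internal data model.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples.

    Each cue is a 1-based index, a "start --> end" line, and the text;
    cues are separated by a blank line.
    """
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(cues)


print(to_srt([(0.0, 2.5, "Hello."), (2.5, 4.0, "Welcome back.")]))
```

Without per-segment start and end times there is nothing to put on the timing line, which is why plain text remains the only export for models that return text alone.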
Figure: Expanded transcript card showing the transcript text, audio waveform, and an Export dropdown with TXT and SRT options.
Export single files from their expanded card, or select multiple completed files and Export All in one go.
AI transformations
After a file finishes, you can apply prompts to its transcript without re-transcribing — meeting notes → bullet summary, draft → script, etc. Revert restores the original.
Same prompt library as live AI Enhancement. Custom prompts you've created are available here too.
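One way to picture apply-and-revert is a record that keeps the original transcript alongside the transformed version. This is a hypothetical sketch: `TranscriptRecord` is an invented name, and it assumes each prompt is applied to the original text rather than stacked on a previous transformation, which this page does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TranscriptRecord:
    original: str                      # text from transcription, never overwritten
    transformed: Optional[str] = None  # result of the latest prompt, if any

    @property
    def text(self) -> str:
        """The text currently shown: transformed if present, else original."""
        return self.transformed if self.transformed is not None else self.original

    def apply(self, transform: Callable[[str], str]) -> None:
        # Assumption: the transformation always starts from the original text.
        self.transformed = transform(self.original)

    def revert(self) -> None:
        """Discard the transformation and show the original again."""
        self.transformed = None
```

Because the original is stored separately, reverting needs no second transcription pass. In this sketch `transform` stands in for whatever runs the selected prompt.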
Persistence
The queue and its transcripts survive app restarts and Mac reboots. Files stay until you delete them.
Tips
- Trim long silence from start/end if you can — chunking is more reliable.
- Recordings with multiple speakers transcribe fine, but diarization isn't supported, so the transcript doesn't label who said what.
- If the queue mixes files in different languages, set the language explicitly for each file.
- Cloud models are usually faster on long files; local models keep audio private.