Dictation

Push-to-talk speech → text, offline and private. Wispr Flow, replaced.

LIVE

Typing is slow — and the fastest dictation tools are cloud-bound, sending your voice, your half-formed thoughts, and your jargon to someone else's server. Dictation makes speaking to your machine effortless and completely private: push to talk, get clean text in any app, fully on-device — a Wispr Flow you actually own.

A privacy-first voice-to-text tool. Push-to-talk only, with no always-on mic. An on-device speech-to-text engine transcribes your speech, a vocabulary layer corrects your jargon, and the text auto-pastes into any app. It can also read text back aloud, fully offline. Every dictation is saved to a personal repository.

🌱 Seed

Push-to-talk speech → text, fully on-device.

← shaped by typing being slow and cloud dictation leaking your words.

🛤 Path

Built an on-device speech-to-text engine, a push-to-talk hotkey, and auto-paste into any app.

← shaped by privacy-first — no always-on mic, nothing leaves the machine.

🔀 Pivot

From raw transcription to a vocabulary corrector that learns your jargon (Voice AI, Text AI, 4M SAI…) so the words come out right.

← shaped by raw transcripts mangling domain terms; fidelity matters more than raw speed.
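
To make the vocabulary idea concrete, here is a minimal sketch of a phrase-level corrector in Python. It is an illustration under stated assumptions, not the project's actual implementation: the canonical terms (Voice AI, Text AI, 4M SAI) come from the text above, while the mis-heard variants, the `VOCABULARY` table, and the `correct` function are hypothetical. It also uses a fixed table, so the "learning" part is out of scope here.

```python
import re

# Canonical terms are taken from the text; the mis-heard variants on the
# left are invented for illustration only.
VOCABULARY = {
    "voice a i": "Voice AI",
    "voice ay eye": "Voice AI",
    "text a i": "Text AI",
    "for m sai": "4M SAI",
}


def correct(transcript, vocabulary=VOCABULARY):
    """Replace known mis-heard phrases with their canonical spelling."""
    if not vocabulary:
        return transcript
    # Try longer variants first so they win over shorter overlapping ones.
    variants = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in variants) + r")\b",
        re.IGNORECASE,
    )
    lookup = {variant.lower(): term for variant, term in vocabulary.items()}
    # Matching is case-insensitive; the replacement is always the canonical form.
    return pattern.sub(lambda m: lookup[m.group(1).lower()], transcript)
```

A table like this trades coverage for predictability: it only fixes the variants it has seen, but it never rewrites words outside the vocabulary.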

💎 Crystal

STT → vocabulary correction → auto-clipboard → saved to a personal repository, with offline read-aloud. A working Wispr-Flow replacement.

← shaped by a complete daily-driver, not a demo.
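
The chain above (STT, then vocabulary correction, then clipboard, then save) can be sketched as a small pipeline object. This is a hedged outline, not the shipped code: `DictationPipeline`, its field names, and the file-naming scheme are assumptions, and the speech engine, corrector, and clipboard are passed in as plain callables so that no particular library is implied.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


@dataclass
class DictationPipeline:
    transcribe: Callable[[bytes], str]        # stand-in for the on-device STT engine
    correct: Callable[[str], str]             # stand-in for the vocabulary layer
    copy_to_clipboard: Callable[[str], None]  # stand-in for clipboard injection
    repository: Path                          # folder where each dictation is saved

    def run(self, audio: bytes) -> str:
        """Run one dictation through every stage and return the final text."""
        raw = self.transcribe(audio)
        text = self.correct(raw)
        self.copy_to_clipboard(text)
        self._save(text)
        return text

    def _save(self, text: str) -> Path:
        # One UTF-8 text file per dictation, named by UTC timestamp
        # (a hypothetical layout for the personal repository).
        self.repository.mkdir(parents=True, exist_ok=True)
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
        path = self.repository / f"dictation-{stamp}.txt"
        path.write_text(text, encoding="utf-8")
        return path
```

Keeping each stage behind a callable means any one of them can be swapped or stubbed without touching the rest of the flow.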

⭐ Principle

Speak naturally, get clean text anywhere, privately — routing correction depth by need.

← shaped by voice as the natural, private way in.

✓ STT engine live-verified end-to-end, on-device
✓ Push-to-talk hotkey + mic capture working
✓ Vocabulary corrector (Voice AI, Text AI, 4M SAI…) verified
✓ Auto-clipboard inject + session save working
✓ Read-aloud (offline) implemented and tested

→ Optional LLM cleanup pass (vocabulary-aware)
→ Menu-bar app packaging for daily use
→ Launch-at-login + standalone app distribution
→ Seed vocabulary + style from past dictation history

★ the moonshot

Ensembled multi-model transcription with on-demand arbitration — routing error-correction depth by need: noise to the ensemble, jargon to vocabulary, long-form to a relay pool.
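
One way to picture "routing by need" is a small decision function that looks at an utterance and picks which correction passes to apply. The sketch below is speculative, since the moonshot is not built yet: the `Utterance` fields, the thresholds, and the `route` function are all hypothetical, and only the three destinations (ensemble, vocabulary, relay pool) come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    ENSEMBLE = "ensemble"      # multi-model transcription for noisy audio
    VOCABULARY = "vocabulary"  # jargon correction
    RELAY_POOL = "relay_pool"  # long-form handling


@dataclass(frozen=True)
class Utterance:
    noise_level: float  # hypothetical noise estimate in the range 0..1
    jargon_hits: int    # hypothetical count of suspected domain terms
    duration_s: float   # length of the recording in seconds


# Hypothetical thresholds, chosen only for illustration.
NOISE_THRESHOLD = 0.5
LONG_FORM_SECONDS = 60.0


def route(utterance: Utterance) -> list[Route]:
    """Return the correction passes this utterance needs, possibly none."""
    routes = []
    if utterance.noise_level >= NOISE_THRESHOLD:
        routes.append(Route.ENSEMBLE)
    if utterance.jargon_hits > 0:
        routes.append(Route.VOCABULARY)
    if utterance.duration_s >= LONG_FORM_SECONDS:
        routes.append(Route.RELAY_POOL)
    return routes
```

An empty result means the plain transcript is good enough, so a clean, short utterance pays for no extra correction at all.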

Imagine this working on your everyday tasks. The deepest how reveals itself when we build it together.
