Start Dictating Yesterday

Those who have been around me in the past six months have noticed I spend a concerning amount of time with a tiny black clip on my collar, mumbling to my computer.
I need to convince you to join me. We've entered the absolute golden age of voice dictation, far past the inflection point of making sense for the everyman.
Three things have converged recently: the normalization of LLMs being a primary consumer of text input, the maturation of speech recognition models, and the commoditization of wireless lavalier mic hardware.
Here's how to stop being bottlenecked by our keyboards.
The Speed of Thought
LLMs have huge context windows and inhuman patience, so "give it as much context as you can" is standard advice.
Relatively fast speakers will hover around the 150 to 250 WPM range while relatively fast typists hover around 100 WPM. However, typing speed drops significantly under sustained production or cognitive load. The net effect is that the keyboard bottleneck causes most people to underproduce prompt input relative to the richness of their context. We get lazy because our hands can't keep up with our breath.
Dictation removes modality context switching between thinking and mechanical keystroke production. Speaking doesn't interrupt our thoughts because our cognition is biologically wired to enable production-based verbal reasoning.
Between faster raw output and lower interference, I find that I am able to provide context two to five times faster in dictated form.
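As a rough back-of-the-envelope check on the raw-output half of that claim, here is a small Python sketch using the WPM figures quoted above. The 600-word prompt size and the function names are illustrative assumptions, not measurements.

```python
def minutes_to_produce(words: int, wpm: float) -> float:
    """Minutes needed to produce `words` words at a steady `wpm` rate."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return words / wpm


def raw_speedup(speaking_wpm: float, typing_wpm: float) -> float:
    """Ratio of speaking rate to typing rate (raw output only)."""
    if typing_wpm <= 0:
        raise ValueError("typing_wpm must be positive")
    return speaking_wpm / typing_wpm


# Hypothetical 600-word context dump, using the rates quoted in the text.
PROMPT_WORDS = 600
typing_minutes = minutes_to_produce(PROMPT_WORDS, 100)         # fast typist
speaking_minutes_slow = minutes_to_produce(PROMPT_WORDS, 150)  # low end of speech
speaking_minutes_fast = minutes_to_produce(PROMPT_WORDS, 250)  # high end of speech

print(f"typing:   {typing_minutes:.1f} min")
print(f"speaking: {speaking_minutes_fast:.1f} to {speaking_minutes_slow:.1f} min")
print(f"raw speedup: {raw_speedup(150, 100):.1f}x to {raw_speedup(250, 100):.1f}x")
```

Raw rate alone only accounts for a 1.5x to 2.5x gain against a fast typist; the rest of the two-to-five-times figure would have to come from typing slowing down under sustained load and from the lower interference of speaking.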
Error Tolerance
Large language models are useful for more than just written language. Modern automatic speech recognition (ASR) models use bi-directional context for excellent accuracy on dictionary words. When we speak a homophone or an unclear phrase that could be interpreted ambiguously, the words before and after it can be used to disambiguate.
Moreover, our recipients are now primarily LLMs with their own layers of error tolerance. With a relatively simple system prompt adjustment, we can notify an LLM that its input was produced via ASR. That way, it can resolve homophone or near-dictionary errors using context introduced earlier in our chats or agentic coding sessions. For me, this means that the bar for a smooth-feeling dictation system on the ASR side has arguably dropped from 99%+ accuracy to somewhere in the low 90s.
The availability of cheap, fast, and accurate remote transcription means we can use dictation in short multi-second bursts to augment typing or in minutes-long stream-of-consciousness braindumps.
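One possible shape for that system prompt adjustment, sketched in Python. The wording of the clause and the helper name `with_asr_notice` are my own illustrative assumptions; adapt them to whatever chat or coding tool you use.

```python
# Hypothetical clause wording; tune it to your own tools and vocabulary.
ASR_NOTICE = (
    "The user's messages are produced by automatic speech recognition (ASR). "
    "Expect homophone and near-miss word errors. Infer the intended word from "
    "the conversation context, and ask only when the ambiguity changes the meaning."
)


def with_asr_notice(system_prompt: str) -> str:
    """Append the ASR clause to an existing system prompt, at most once."""
    base = system_prompt.strip()
    if ASR_NOTICE in base:
        return base  # already present, so do not duplicate it
    if not base:
        return ASR_NOTICE
    return f"{base}\n\n{ASR_NOTICE}"


if __name__ == "__main__":
    print(with_asr_notice("You are a careful coding assistant."))
```

The clause is appended rather than prepended so that any existing instructions keep their position, and the duplicate check makes the helper safe to call more than once.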
Lav Mic Revolution
Even ASR improvements and LLM robustness can't recover information from inadequate signal-to-noise!
Laptop mics are subject to a tremendous amount of environmental interference. Earbuds or headphones typically have a mic at the ear, which is far away from our mouths and blocked by facial bone structure. Stationary mics such as goosenecks lock us into using one particular desk, which is neither ergonomic nor portable.
A wireless mic worn on the body avoids all of these problems, but quality wireless audio input hardware has traditionally been limited to professional systems with a belt pack. Luckily, through the 2020s, miniaturization and commoditization have brought us the one true microphone.
Wireless lavalier mics were originally designed for vlogging or content creation. Take a closer look at your favorite influencer's shirt the next time you watch a video. These transmitters now weigh 10 grams, run all day on a single charge, and plug directly into your laptop over USB-C. You can buy one for less than a hundred dollars!
Clip it on at home and forget about it; hold it next to your mouth in loud environments. I've dictated at work, on transpacific flights, and on city streets with construction. That tiny black clip is here to stay.
The current setup I recommend is the Rode Wireless Micro, a subscription to Wispr Flow dictation software, and a brief system prompt clause.
What are you waiting for? Start dictating yesterday.
