latency · pillar 01
From last syllable
to first token,
in <150ms.
Multi-second lag is the #1 'AI tell' interviewers cite. Mirly runs the entire pipeline an order of magnitude faster than the incumbent — with documented budgets per stage, reproducible benchmarks, and raw data published.
Mirly delivers a first useful token in 127ms p50 and 189ms p95, measured end-to-end from the last syllable of the interviewer’s question to the first pixel rendered on the candidate’s screen. Final Round AI on the same machine measured 1,810ms p50: a 14× gap.
pipeline budget
Every millisecond, accounted for.
benchmark
5–14× faster than the rest.
Same machine, same audio source, same question. Warm p50 in milliseconds — last syllable to first visible token, frame-counted at 60fps.
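Frame counting makes the metric camera-verifiable: at 60fps each captured frame spans roughly 16.7ms, so a latency expressed in frames converts directly to milliseconds. A minimal sketch of that conversion (function name is illustrative, not from the benchmark repo):

```python
FPS = 60  # screen-recording capture rate used for frame counting

def frames_to_ms(frames: int, fps: int = FPS) -> float:
    """Latency implied by a frame count at the given capture rate."""
    return frames * 1000.0 / fps

# One frame at 60fps is ~16.7ms, so any frame-counted reading
# carries about half a frame (~8ms) of quantization noise.
```

This also bounds the measurement error: a 127ms reading is trustworthy to within a single frame.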
methodology
Reproducible. Auditable.
Published benchmarks should be re-runnable by anyone with the same hardware. Ours are. Raw WAV files, screen recordings, and the frame-count spreadsheet are at github.com/mirly/latency-benchmark-2026.
- Hardware: MacBook Air M2 · 16GB · macOS 14.5 · plugged in, single-app foreground
- Network: Gigabit ethernet, London — deliberately worst-case for US-East vendors
- Audio source: Pre-recorded 16kHz mono WAV, played into system audio via BlackHole
- Question: "Tell me about a time you led a contentious technical decision" — same across all tools
- Metric: Last syllable of question → first visible token, 60fps frame count
- Runs: 10 per tool — 1 cold + 9 warm; p50 = median(warm)
- Date: 2026-05-15 · vendor versions tabulated below
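The per-tool aggregation above reduces to a few lines: discard the single cold run, take the median of the nine warm runs for p50, and take the nearest-rank 95th percentile (which, with nine samples, is the slowest warm run) for p95. A minimal sketch, assuming latencies are already in milliseconds (function name is illustrative):

```python
import math
import statistics

def warm_percentiles(runs_ms: list[float]) -> tuple[float, float]:
    """p50/p95 for one tool; the first run is the cold run and is discarded."""
    warm = sorted(runs_ms[1:])
    p50 = statistics.median(warm)
    # Nearest-rank p95: with 9 warm runs this is simply the slowest of them.
    p95 = warm[math.ceil(0.95 * len(warm)) - 1]
    return p50, p95
```

Nearest-rank is chosen here because interpolated percentiles are not meaningful at n=9; the published spreadsheet is the authority on the exact method used.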
Full teardown: /blog/latency-teardown-6-copilots