— Tools
Speech-to-Text
AI that transcribes spoken audio into written text — Whisper, AssemblyAI, Deepgram.
Also known as: ASR · Speech recognition · Transcription
What is Speech-to-Text?
Speech-to-text (STT), also called automatic speech recognition (ASR), converts spoken audio into written text. OpenAI's open-weight Whisper model is the dominant baseline; AssemblyAI, Deepgram, and Sarvam Saaras lead on commercial multilingual quality. STT is critical for meeting transcription, customer-call analytics, voice agents, and accessibility.
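As a rough illustration of the audio-in, text-out flow described above, here is a minimal sketch using the open-source `openai-whisper` Python package. It assumes the package and `ffmpeg` are installed locally; the helper names `transcribe_file` and `format_segments` and the default model size are illustrative choices, not part of any official API.

```python
from typing import Iterable, Mapping


def transcribe_file(path: str, model_name: str = "base") -> dict:
    """Transcribe one audio file with the `openai-whisper` package.

    Assumption: `pip install openai-whisper` has been run and ffmpeg is on PATH.
    """
    import whisper  # imported lazily so format_segments works without it

    model = whisper.load_model(model_name)
    # The result is a dict that includes "text", "segments", and "language".
    return model.transcribe(path)


def format_segments(segments: Iterable[Mapping]) -> str:
    """Render Whisper-style segments as '[start-end] text' lines (hypothetical helper)."""
    lines = []
    for seg in segments:
        lines.append(f"[{seg['start']:.2f}-{seg['end']:.2f}] {seg['text'].strip()}")
    return "\n".join(lines)
```

A typical call would be `result = transcribe_file("meeting.mp3")` (the file name is a placeholder), followed by `print(format_segments(result["segments"]))` to get a timestamped transcript. Commercial services such as AssemblyAI and Deepgram expose their own SDKs with different interfaces.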
— Related
Terms connected to Speech-to-Text
Tools
Whisper
OpenAI's open-source speech-to-text model — the baseline for AI transcription in 2026.
Tools
Text-to-Speech
AI that turns text into natural-sounding spoken audio — ElevenLabs, OpenAI TTS, Sarvam.
Concepts
AI Agent
An AI system that decides its own next action and takes multi-step actions autonomously.
