There are currently 10 direct ISpeechToTextClient implementations in the tryAGI SDKs.
Capability Matrix
SDK
File transcription
URL transcription
GetStreamingTextAsync()
Translation
Timestamps
Notes
Reka
Y
Y
~
Y
Y
Stream or URL input; translation supported
ElevenLabs
Y
-
~
-
Y
Realtime WebSocket exists outside the current MEAI adapter
AssemblyAI
Y
Y*
~
-
Y
Upload + poll flow
Deepgram
-
Y
Y
-
Y
URL-based batch transcription plus true realtime WebSocket streaming
Gladia
Y
Y*
~
Y
Y
URL and translation via RawRepresentationFactory
Cartesia
Y
-
~
-
Y
Synchronous STT with word timestamps
FishAudio
Y
-
~
-
Y
Timestamped ASR plus speech-related tools
RevAI
Y
Y
~
-
Y
Upload or URL submission, then poll for completion
SarvamAI
Y
-
~
-
Y
22+ Indian languages with provider-specific metadata
Speechmatics
Y
-
~
-
Y
Batch transcription with polling
~ means the MEAI streaming method exists but returns the final transcript after processing completes, not true incremental recognition updates.
Y* means the URL path is exposed through RawRepresentationFactory, not a dedicated MEAI option.
Universal Usage
12345678
usingMicrosoft.Extensions.AI;ISpeechToTextClientclient=/* any provider */;awaitusingvaraudioStream=File.OpenRead("audio.mp3");varresponse=awaitclient.GetTextAsync(audioStream);Console.WriteLine(response.Text);
Deepgram is the only current adapter in this group with truly incremental MEAI streaming.
ElevenLabs, Reka, RevAI, SarvamAI, Speechmatics, FishAudio, AssemblyAI, Cartesia, and Gladia all expose GetStreamingTextAsync(), but the current adapters yield the final transcript after the provider finishes processing.
If you need provider-native realtime features such as interim hypotheses, diarization events, or richer WebSocket controls, use the provider SDK directly in addition to or instead of the MEAI abstraction.