The Deepgram SDK implements ITextToSpeechClient and ISpeechToTextClient from Microsoft.Extensions.AI.
Supported Interfaces
| Interface | Support Level |
| --- | --- |
| ITextToSpeechClient | Full (Aura-2 TTS, audio bytes, streamed response chunks) |
| ISpeechToTextClient | Full (URL-based pre-recorded transcription, language/model selection) |
ITextToSpeechClient
Generate speech through the standard MEAI interface:
    using Deepgram;
    using Microsoft.Extensions.AI;

    using var client = new DeepgramClient(apiKey);
    ITextToSpeechClient ttsClient = client;

    var response = await ttsClient.GetAudioAsync(
        "Deepgram Aura-2 is available through Microsoft.Extensions.AI.",
        new TextToSpeechOptions
        {
            ModelId = "aura-2-asteria-en",
            AudioFormat = "mp3",
            Speed = 1.05f,
        });

    var audio = response.Contents.OfType<DataContent>().Single();
    File.WriteAllBytes("deepgram.mp3", audio.Data.ToArray());
ModelId selects the Deepgram Aura voice model. VoiceId is also accepted as a fallback for MEAI callers that model TTS providers around voices instead of model IDs.
Provider-specific query parameters are available through DeepgramTextToSpeechPropertyNames:
    var response = await ttsClient.GetAudioAsync(
        "Generate a WAV file with a fixed sample rate.",
        new TextToSpeechOptions
        {
            ModelId = "aura-2-andromeda-en",
            AudioFormat = "wav",
            AdditionalProperties = new()
            {
                [DeepgramTextToSpeechPropertyNames.SampleRate] = 24000,
                [DeepgramTextToSpeechPropertyNames.MipOptOut] = true,
            },
        });
Use GetStreamingAudioAsync when callers want chunks as they are read from the HTTP response:
    await foreach (var update in ttsClient.GetStreamingAudioAsync(
        "Read Deepgram audio chunks through MEAI.",
        new TextToSpeechOptions
        {
            ModelId = "aura-2-asteria-en",
            AudioFormat = "mp3",
        }))
    {
        foreach (var chunk in update.Contents.OfType<DataContent>())
        {
            Console.WriteLine($"{update.Kind}: {chunk.Data.Length} bytes");
        }
    }
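Streamed chunks can also be written straight to disk instead of being logged. The following is a minimal sketch rather than a definitive implementation: it reuses only the calls shown above, and it assumes implicit usings are enabled, that the API key is stored in a `DEEPGRAM_API_KEY` environment variable, and that `deepgram-streamed.mp3` is an acceptable output path (both names are illustrative).

```csharp
using Deepgram;
using Microsoft.Extensions.AI;

// Assumption: the API key is supplied through an environment variable.
var apiKey = Environment.GetEnvironmentVariable("DEEPGRAM_API_KEY")
    ?? throw new InvalidOperationException("Set DEEPGRAM_API_KEY first.");

using var client = new DeepgramClient(apiKey);
ITextToSpeechClient ttsClient = client;

// Hypothetical output path; the file is flushed and closed when disposed.
await using var output = File.Create("deepgram-streamed.mp3");

await foreach (var update in ttsClient.GetStreamingAudioAsync(
    "Write Deepgram audio chunks to disk as they arrive.",
    new TextToSpeechOptions
    {
        ModelId = "aura-2-asteria-en",
        AudioFormat = "mp3",
    }))
{
    // Append every audio chunk carried by this update, in arrival order.
    foreach (var chunk in update.Contents.OfType<DataContent>())
    {
        await output.WriteAsync(chunk.Data);
    }
}
```

Writing each chunk as it arrives keeps memory use flat for long passages, since the full audio is never buffered in one array.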
ISpeechToTextClient
Installation
    dotnet add package tryAGI.Deepgram
URL-Based Transcription
Deepgram's pre-recorded API works with audio URLs. Use RawRepresentationFactory to provide the audio URL:
    using Deepgram;
    using Microsoft.Extensions.AI;

    using var client = new DeepgramClient(apiKey);
    ISpeechToTextClient sttClient = client;

    using var dummyStream = new MemoryStream(); // Stream is not used for URL-based transcription
    var response = await sttClient.GetTextAsync(dummyStream, new SpeechToTextOptions
    {
        ModelId = "nova-3",
        RawRepresentationFactory = _ => new ListenV1RequestUrl
        {
            Url = "https://dpgr.am/spacewalk.wav",
        },
    });

    Console.WriteLine(response.Text);
    Console.WriteLine($"Duration: {response.EndTime}");
Transcription with Language Hint
Specify a language code for more accurate transcription:
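A minimal sketch of a language hint, modeled on the URL-based example above. It assumes the Deepgram adapter maps the standard MEAI `SpeechToTextOptions.SpeechLanguage` property to Deepgram's language parameter, that `"en"` is an accepted code, and that the API key comes from a `DEEPGRAM_API_KEY` environment variable; treat all three as assumptions to verify against the SDK.

```csharp
using Deepgram;
using Microsoft.Extensions.AI;

// Assumption: the API key is supplied through an environment variable.
var apiKey = Environment.GetEnvironmentVariable("DEEPGRAM_API_KEY")
    ?? throw new InvalidOperationException("Set DEEPGRAM_API_KEY first.");

using var client = new DeepgramClient(apiKey);
ISpeechToTextClient sttClient = client;

using var dummyStream = new MemoryStream(); // Not used for URL-based transcription
var response = await sttClient.GetTextAsync(dummyStream, new SpeechToTextOptions
{
    ModelId = "nova-3",
    // Assumed to be forwarded as the transcription language hint.
    SpeechLanguage = "en",
    RawRepresentationFactory = _ => new ListenV1RequestUrl
    {
        Url = "https://dpgr.am/spacewalk.wav",
    },
});

Console.WriteLine(response.Text);
```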
Streaming Transcription
Unlike most other STT SDKs, Deepgram provides true real-time streaming through GetStreamingTextAsync. It uses the DeepgramListenV1RealtimeClient WebSocket API to send audio frames and receive incremental transcription updates as the audio is processed. Streaming updates come in the following kinds:
- TextUpdating updates contain interim (non-final) transcriptions that may change as more audio is processed.
- TextUpdated updates contain final transcriptions for completed speech segments.
- SessionOpen and SessionClose events bracket the streaming session.
Each streaming update also carries a RawRepresentation with the provider-specific event data:
    ISpeechToTextClient sttClient = client;

    using var audioStream = File.OpenRead("audio.wav");
    await foreach (var update in sttClient.GetStreamingTextAsync(audioStream))
    {
        Console.WriteLine($"[{update.Kind}] {update.Text}");

        // Access the provider-specific event on each update
        if (update.RawRepresentation is Deepgram.Realtime.ListenV1ResultsEvent results)
        {
            Console.WriteLine($"  Final: {results.IsFinal}, Start: {results.Start}s, " +
                $"Duration: {results.Duration}s");

            // Access word-level timestamps and confidence
            if (results.Channel?.Alternatives is { Count: > 0 } alts)
            {
                foreach (var word in alts[0].Words)
                {
                    Console.WriteLine($"    [{word.Start:F2}s - {word.End:F2}s] {word.Word} " +
                        $"(confidence: {word.Confidence:P0})");
                }
            }
        }
    }
This provides genuine incremental results, unlike other providers that delegate streaming to their batch APIs.
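Because interim and final updates arrive on the same stream, callers usually branch on the update kind. The helper below is a hedged sketch of that pattern using only Microsoft.Extensions.AI types; `TranscriptCollector` and `CollectFinalAsync` are hypothetical names introduced for illustration, not part of the Deepgram SDK. The MEAI speech-to-text types are marked experimental, so a build may require suppressing the corresponding diagnostic.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical helper: keeps only finalized segments from a streaming session.
public static class TranscriptCollector
{
    public static async Task<string> CollectFinalAsync(
        IAsyncEnumerable<SpeechToTextResponseUpdate> updates)
    {
        var transcript = new StringBuilder();

        await foreach (var update in updates)
        {
            if (update.Kind == SpeechToTextResponseUpdateKind.TextUpdated)
            {
                // Final text for a completed segment: safe to keep.
                transcript.Append(update.Text).Append(' ');
            }
            else if (update.Kind == SpeechToTextResponseUpdateKind.SessionClose)
            {
                // The session has ended; no further updates are expected.
                break;
            }
            // TextUpdating (interim) and SessionOpen updates are skipped here,
            // because interim text may still change.
        }

        return transcript.ToString().TrimEnd();
    }
}
```

A call such as `await TranscriptCollector.CollectFinalAsync(sttClient.GetStreamingTextAsync(audioStream))` would then return the concatenated final segments, while a live-captioning UI would instead render the TextUpdating text and overwrite it when the matching TextUpdated update arrives.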
Accessing the Underlying Client
Retrieve the DeepgramClient from the MEAI interface:
    ISpeechToTextClient sttClient = client;

    var metadata = sttClient.GetService<SpeechToTextClientMetadata>();
    Console.WriteLine($"Provider: {metadata?.ProviderName}"); // "deepgram"

    var deepgramClient = sttClient.GetService<DeepgramClient>();
    // Use deepgramClient for Deepgram-specific APIs (TTS, text analysis, etc.)