Microsoft.Extensions.AI Integration

Cross-SDK comparison

See the centralized MEAI documentation for feature matrices and comparisons across all tryAGI SDKs.

The Deepgram SDK implements ITextToSpeechClient and ISpeechToTextClient from Microsoft.Extensions.AI.

Supported Interfaces

Interface             Support Level
ITextToSpeechClient   Full (Aura-2 TTS, audio bytes, streamed response chunks)
ISpeechToTextClient   Full (URL-based pre-recorded transcription, language/model selection)

ITextToSpeechClient

Generate speech through the standard MEAI interface:

using Deepgram;
using Microsoft.Extensions.AI;

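// apiKey holds your Deepgram API key.
using var client = new DeepgramClient(apiKey);
// DeepgramClient implements ITextToSpeechClient, so no adapter is needed.
ITextToSpeechClient ttsClient = client;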

var response = await ttsClient.GetAudioAsync(
    "Deepgram Aura-2 is available through Microsoft.Extensions.AI.",
    new TextToSpeechOptions
    {
        ModelId = "aura-2-asteria-en",
        AudioFormat = "mp3",
        Speed = 1.05f,
    });

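// The non-streaming response is expected to carry exactly one DataContent item with the audio bytes.
var audio = response.Contents.OfType<DataContent>().Single();
await File.WriteAllBytesAsync("deepgram.mp3", audio.Data.ToArray());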

ModelId selects the Deepgram Aura voice model. VoiceId is also accepted as a fallback for MEAI callers that model TTS providers around voices instead of model IDs.
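As a minimal sketch of the VoiceId fallback described above, the call below sets VoiceId instead of ModelId. The source only states that VoiceId is accepted as a fallback, so passing the same Aura model identifier through VoiceId is an assumption. Reading the key from a DEEPGRAM_API_KEY environment variable and the output file name are likewise illustrative choices, not part of the SDK.

```csharp
using System;
using System.IO;
using System.Linq;
using Deepgram;
using Microsoft.Extensions.AI;

// Assumption: the API key is stored in the DEEPGRAM_API_KEY environment variable.
var apiKey = Environment.GetEnvironmentVariable("DEEPGRAM_API_KEY")
    ?? throw new InvalidOperationException("DEEPGRAM_API_KEY is not set.");

using var client = new DeepgramClient(apiKey);
ITextToSpeechClient ttsClient = client;

var response = await ttsClient.GetAudioAsync(
    "Selecting the voice through VoiceId.",
    new TextToSpeechOptions
    {
        // Assumption: VoiceId takes the same identifier that ModelId would.
        VoiceId = "aura-2-asteria-en",
        AudioFormat = "mp3",
    });

// Hypothetical output path; the audio is written the same way as in the ModelId example.
var audio = response.Contents.OfType<DataContent>().Single();
await File.WriteAllBytesAsync("deepgram-voice.mp3", audio.Data.ToArray());
```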

Provider-specific query parameters are available through DeepgramTextToSpeechPropertyNames:

var response = await ttsClient.GetAudioAsync(
    "Generate a WAV file with a fixed sample rate.",
    new TextToSpeechOptions
    {
        ModelId = "aura-2-andromeda-en",
        AudioFormat = "wav",
        AdditionalProperties = new()
        {
            [DeepgramTextToSpeechPropertyNames.SampleRate] = 24000,
            [DeepgramTextToSpeechPropertyNames.MipOptOut] = true,
        },
    });

Use GetStreamingAudioAsync to receive audio chunks as they are read from the HTTP response:

await foreach (var update in ttsClient.GetStreamingAudioAsync(
    "Read Deepgram audio chunks through MEAI.",
    new TextToSpeechOptions
    {
        ModelId = "aura-2-asteria-en",
        AudioFormat = "mp3",
    }))
{
    foreach (var chunk in update.Contents.OfType<DataContent>())
    {
        Console.WriteLine($"{update.Kind}: {chunk.Data.Length} bytes");
    }
}
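The loop above only prints chunk sizes. A possible extension, sketched here, appends each chunk to a file as it arrives. It assumes that chunks arrive in order and that concatenating them yields a playable mp3 file, which the source does not state explicitly. The DEEPGRAM_API_KEY environment variable and the output file name are hypothetical.

```csharp
using System;
using System.IO;
using System.Linq;
using Deepgram;
using Microsoft.Extensions.AI;

// Assumption: the API key is stored in the DEEPGRAM_API_KEY environment variable.
var apiKey = Environment.GetEnvironmentVariable("DEEPGRAM_API_KEY")
    ?? throw new InvalidOperationException("DEEPGRAM_API_KEY is not set.");

using var client = new DeepgramClient(apiKey);
ITextToSpeechClient ttsClient = client;

// Hypothetical output path; the stream is flushed and closed when it goes out of scope.
await using var output = File.Create("deepgram-streamed.mp3");

await foreach (var update in ttsClient.GetStreamingAudioAsync(
    "Read Deepgram audio chunks through MEAI.",
    new TextToSpeechOptions
    {
        ModelId = "aura-2-asteria-en",
        AudioFormat = "mp3",
    }))
{
    foreach (var chunk in update.Contents.OfType<DataContent>())
    {
        // DataContent.Data is a ReadOnlyMemory<byte>, so it can be written without copying.
        await output.WriteAsync(chunk.Data);
    }
}
```

Writing chunks as they arrive keeps memory use bounded by the chunk size rather than by the length of the generated audio.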

ISpeechToTextClient

Installation

dotnet add package tryAGI.Deepgram

URL-Based Transcription

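Deepgram's pre-recorded API works with audio URLs. Use RawRepresentationFactory to supply a ListenV1RequestUrl containing the audio URL. The stream argument is still required by the interface, but it is not used for URL-based transcription: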

using Deepgram;
using Microsoft.Extensions.AI;

using var client = new DeepgramClient(apiKey);
ISpeechToTextClient sttClient = client;

using var dummyStream = new MemoryStream(); // Stream is not used for URL-based transcription
var response = await sttClient.GetTextAsync(dummyStream, new SpeechToTextOptions
{
    ModelId = "nova-3",
    RawRepresentationFactory = _ => new ListenV1RequestUrl
    {
        Url = "https://dpgr.am/spacewalk.wav",
    },
});

Console.WriteLine(response.Text);
Console.WriteLine($"Duration: {response.EndTime}");

Transcription with Language Hint

Set SpeechLanguage to a language code (here "es") for more accurate transcription:

ISpeechToTextClient sttClient = client;

using var dummyStream = new MemoryStream();
var response = await sttClient.GetTextAsync(dummyStream, new SpeechToTextOptions
{
    SpeechLanguage = "es",
    ModelId = "nova-3",
    RawRepresentationFactory = _ => new ListenV1RequestUrl
    {
        Url = "https://example.com/spanish-audio.wav",
    },
});

Console.WriteLine(response.Text);

Accessing the Raw Response

The full Deepgram response is available via RawRepresentation:

var response = await sttClient.GetTextAsync(dummyStream, new SpeechToTextOptions
{
    RawRepresentationFactory = _ => new ListenV1RequestUrl
    {
        Url = "https://dpgr.am/spacewalk.wav",
    },
});

var raw = (ListenV1Response)response.RawRepresentation!;
Console.WriteLine($"Request ID: {raw.Metadata.RequestId}");
Console.WriteLine($"Duration: {raw.Metadata.Duration}s");
Console.WriteLine($"Channels: {raw.Metadata.Channels}");

Streaming with Raw Response Access

Unlike most other STT SDKs, the Deepgram SDK provides true real-time streaming through GetStreamingTextAsync. It uses the DeepgramListenV1RealtimeClient WebSocket API to send audio frames and receive incremental transcription updates as the audio is processed.

  • TextUpdating updates contain interim (non-final) transcriptions that may change as more audio is processed
  • TextUpdated updates contain final transcriptions for completed speech segments
  • SessionOpen and SessionClose events bracket the streaming session

Each streaming update also carries a RawRepresentation with the provider-specific event data:

ISpeechToTextClient sttClient = client;

using var audioStream = File.OpenRead("audio.wav");
await foreach (var update in sttClient.GetStreamingTextAsync(audioStream))
{
    Console.WriteLine($"[{update.Kind}] {update.Text}");

    // Access the provider-specific event on each update
    if (update.RawRepresentation is Deepgram.Realtime.ListenV1ResultsEvent results)
    {
        Console.WriteLine($"  Final: {results.IsFinal}, Start: {results.Start}s, " +
            $"Duration: {results.Duration}s");

        // Access word-level timestamps and confidence
        if (results.Channel?.Alternatives is { Count: > 0 } alts)
        {
            foreach (var word in alts[0].Words)
            {
                Console.WriteLine($"    [{word.Start:F2}s - {word.End:F2}s] {word.Word} " +
                    $"(confidence: {word.Confidence:P0})");
            }
        }
    }
}

The results are genuinely incremental, unlike providers that implement streaming by delegating to their batch APIs.
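Because TextUpdating text may later be revised while TextUpdated text is final, a caller that wants a clean transcript typically keeps only the final segments. The helper below is one way to do that. The name CollectFinalTranscriptAsync is hypothetical and not part of the SDK; it relies only on the Kind and Text members used in the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical helper: joins the final segments of a streaming transcription.
static async Task<string> CollectFinalTranscriptAsync(
    IAsyncEnumerable<SpeechToTextResponseUpdate> updates)
{
    var transcript = new StringBuilder();

    await foreach (var update in updates)
    {
        if (update.Kind == SpeechToTextResponseUpdateKind.TextUpdating)
        {
            // Interim text may still change, so show it but do not keep it.
            Console.WriteLine($"(interim) {update.Text}");
        }
        else if (update.Kind == SpeechToTextResponseUpdateKind.TextUpdated)
        {
            // Skip empty segments so they do not add stray separators.
            if (string.IsNullOrWhiteSpace(update.Text))
            {
                continue;
            }

            if (transcript.Length > 0)
            {
                transcript.Append(' ');
            }

            transcript.Append(update.Text);
        }
        // Other kinds (for example SessionOpen and SessionClose) carry no transcript text here.
    }

    return transcript.ToString();
}
```

With the sttClient and audioStream from the example above, it could be called as `var text = await CollectFinalTranscriptAsync(sttClient.GetStreamingTextAsync(audioStream));`.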

Accessing the Underlying Client

Use GetService to retrieve provider metadata and the underlying DeepgramClient from the MEAI interface:

ISpeechToTextClient sttClient = client;

var metadata = sttClient.GetService<SpeechToTextClientMetadata>();
Console.WriteLine($"Provider: {metadata?.ProviderName}"); // "deepgram"

var deepgramClient = sttClient.GetService<DeepgramClient>();
// Use deepgramClient for Deepgram-specific APIs (TTS, text analysis, etc.)