Microsoft.Extensions.AI Integration

Cross-SDK comparison

See the centralized MEAI documentation for feature matrices and comparisons across all tryAGI SDKs.

The Deepgram SDK implements ISpeechToTextClient from Microsoft.Extensions.AI.

Supported Interfaces

Interface | Support Level
ISpeechToTextClient | Full (URL-based pre-recorded transcription, language/model selection)

ISpeechToTextClient

Installation

dotnet add package tryAGI.Deepgram

URL-Based Transcription

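Deepgram's pre-recorded API works with audio URLs. Use RawRepresentationFactory to provide the audio URL; the Stream argument is still required by the interface but is not used for URL-based transcription: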

using Deepgram;
using Microsoft.Extensions.AI;

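// Assumption: the API key is read from a DEEPGRAM_API_KEY environment variable.
var apiKey = Environment.GetEnvironmentVariable("DEEPGRAM_API_KEY")
    ?? throw new InvalidOperationException("DEEPGRAM_API_KEY is not set.");

using var client = new DeepgramClient(apiKey);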
ISpeechToTextClient sttClient = client;

using var dummyStream = new MemoryStream(); // Stream is not used for URL-based transcription
var response = await sttClient.GetTextAsync(dummyStream, new SpeechToTextOptions
{
    ModelId = "nova-3",
    RawRepresentationFactory = _ => new ListenV1RequestUrl
    {
        Url = "https://dpgr.am/spacewalk.wav",
    },
});

Console.WriteLine(response.Text);
Console.WriteLine($"Duration: {response.EndTime}");

Transcription with Language Hint

Set SpeechLanguage to a language code (for example, "es") to hint the spoken language and improve transcription accuracy:

ISpeechToTextClient sttClient = client;

using var dummyStream = new MemoryStream();
var response = await sttClient.GetTextAsync(dummyStream, new SpeechToTextOptions
{
    SpeechLanguage = "es",
    ModelId = "nova-3",
    RawRepresentationFactory = _ => new ListenV1RequestUrl
    {
        Url = "https://example.com/spanish-audio.wav",
    },
});

Console.WriteLine(response.Text);
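
The two examples above build the same option shape: a model, an optional language, and a RawRepresentationFactory that returns a ListenV1RequestUrl. A minimal sketch of factoring that into one place follows. DeepgramUrlOptions and ForUrl are hypothetical names introduced here for illustration and are not part of the Deepgram SDK; the sketch also assumes ListenV1RequestUrl lives in the Deepgram namespace, as the using directive in the first example suggests.

```csharp
#pragma warning disable MEAI001 // speech-to-text abstractions may be marked experimental

using System;
using Deepgram;
using Microsoft.Extensions.AI;

// Hypothetical helper, not part of the Deepgram SDK.
public static class DeepgramUrlOptions
{
    // Builds SpeechToTextOptions for URL-based transcription.
    // modelId and language are optional and are passed through unchanged.
    public static SpeechToTextOptions ForUrl(
        string audioUrl,
        string? modelId = null,
        string? language = null)
    {
        if (string.IsNullOrWhiteSpace(audioUrl))
        {
            throw new ArgumentException("An audio URL is required.", nameof(audioUrl));
        }

        return new SpeechToTextOptions
        {
            ModelId = modelId,
            SpeechLanguage = language,
            // The factory supplies the provider-specific request carrying the URL.
            RawRepresentationFactory = _ => new ListenV1RequestUrl
            {
                Url = audioUrl,
            },
        };
    }
}
```

With such a helper, the Spanish example would reduce to passing DeepgramUrlOptions.ForUrl("https://example.com/spanish-audio.wav", "nova-3", "es") as the options argument of GetTextAsync.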

Accessing the Raw Response

The full Deepgram response is available via RawRepresentation:

// Reuses sttClient and dummyStream from the previous examples.
var response = await sttClient.GetTextAsync(dummyStream, new SpeechToTextOptions
{
    RawRepresentationFactory = _ => new ListenV1RequestUrl
    {
        Url = "https://dpgr.am/spacewalk.wav",
    },
});

var raw = (ListenV1Response)response.RawRepresentation!;
Console.WriteLine($"Request ID: {raw.Metadata.RequestId}");
Console.WriteLine($"Duration: {raw.Metadata.Duration}s");
Console.WriteLine($"Channels: {raw.Metadata.Channels}");

Streaming with Raw Response Access

Unlike most other STT SDKs, Deepgram provides true real-time streaming via GetStreamingTextAsync. It uses the DeepgramListenV1RealtimeClient WebSocket API to send audio frames and receive incremental transcription updates as the audio is processed.

  • TextUpdating updates contain interim (non-final) transcriptions that may change as more audio is processed
  • TextUpdated updates contain final transcriptions for completed speech segments
  • SessionOpen and SessionClose events bracket the streaming session

Each streaming update also carries a RawRepresentation with the provider-specific event data:

ISpeechToTextClient sttClient = client;

using var audioStream = File.OpenRead("audio.wav");
await foreach (var update in sttClient.GetStreamingTextAsync(audioStream))
{
    Console.WriteLine($"[{update.Kind}] {update.Text}");

    // Access the provider-specific event on each update
    if (update.RawRepresentation is Deepgram.Realtime.ListenV1ResultsEvent results)
    {
        Console.WriteLine($"  Final: {results.IsFinal}, Start: {results.Start}s, " +
            $"Duration: {results.Duration}s");

        // Access word-level timestamps and confidence
        if (results.Channel?.Alternatives is { Count: > 0 } alts)
        {
            foreach (var word in alts[0].Words)
            {
                Console.WriteLine($"    [{word.Start:F2}s - {word.End:F2}s] {word.Word} " +
                    $"(confidence: {word.Confidence:P0})");
            }
        }
    }
}

This provides genuine incremental results, unlike other providers that delegate streaming to their batch APIs.
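
Because interim TextUpdating results may later be revised, a caller that only wants the finished transcript can keep the TextUpdated text and skip everything else. The following is a minimal sketch that relies only on the Microsoft.Extensions.AI abstractions. TranscriptCollector and CollectFinalTextAsync are hypothetical names introduced for illustration, not part of the Deepgram SDK, and joining segments with a single space is an assumption about how the final segments should be combined.

```csharp
#pragma warning disable MEAI001 // speech-to-text abstractions may be marked experimental

using System.Collections.Generic;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical helper, not part of the Deepgram SDK.
public static class TranscriptCollector
{
    // Concatenates the text of final (TextUpdated) updates, separated by spaces.
    public static async Task<string> CollectFinalTextAsync(
        IAsyncEnumerable<SpeechToTextResponseUpdate> updates,
        CancellationToken cancellationToken = default)
    {
        var transcript = new StringBuilder();

        await foreach (var update in updates.WithCancellation(cancellationToken))
        {
            // Skip interim results and session open/close events.
            if (update.Kind != SpeechToTextResponseUpdateKind.TextUpdated)
            {
                continue;
            }

            // Ignore final updates that carry no text.
            if (string.IsNullOrWhiteSpace(update.Text))
            {
                continue;
            }

            if (transcript.Length > 0)
            {
                transcript.Append(' ');
            }

            transcript.Append(update.Text.Trim());
        }

        return transcript.ToString();
    }
}
```

A call such as await TranscriptCollector.CollectFinalTextAsync(sttClient.GetStreamingTextAsync(audioStream)) would then return the final transcript once the session ends, at the cost of discarding the interim updates that make live captions possible.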

Accessing the Underlying Client

Retrieve the DeepgramClient from the MEAI interface:

ISpeechToTextClient sttClient = client;

var metadata = sttClient.GetService<SpeechToTextClientMetadata>();
Console.WriteLine($"Provider: {metadata?.ProviderName}"); // "deepgram"

var deepgramClient = sttClient.GetService<DeepgramClient>();
// Use deepgramClient for Deepgram-specific APIs (TTS, text analysis, etc.)