AssemblyAI

AssemblyAI already has an official .NET SDK. We use this SDK primarily to improve the overall experience of our generated SDKs: we try to match or exceed the quality of the official one and extend those improvements to all our generated SDKs for other platforms.

Features 🔥

  • Fully generated C# SDK based on the official AssemblyAI OpenAPI specification, built with OpenApiGenerator
  • Same-day updates to support new features
  • Updated and supported automatically as long as there are no breaking changes
  • All modern .NET features: nullability, trimming, NativeAOT, etc.
  • Supports .NET Framework and .NET Standard 2.0
  • Microsoft.Extensions.AI ISpeechToTextClient support

Usage

using AssemblyAI;

using var client = new AssemblyAIClient(apiKey);

var fileUrl = "https://github.com/AssemblyAI-Community/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3";

// You can also transcribe a local file by passing in a file path
// var filePath = "./path/to/file.mp3";
// var uploadedFile = await client.Transcript.UploadFileAsync();
// fileUrl = uploadedFile.UploadUrl;

Transcript transcript = await client.Transcript.CreateTranscriptAsync(TranscriptParams.FromUrl(
    fileUrl,
    new TranscriptOptionalParams
    {
        LanguageDetection = true,
        SpeakerLabels = true, // Identify speakers in your audio
        AutoHighlights = true, // Identify highlights in your audio
    }));

transcript.EnsureStatusCompleted();

Console.WriteLine(transcript);
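
The snippet above assumes an `apiKey` variable is already in scope. Below is a minimal sketch of one way to supply it, assuming the key is stored in the `ASSEMBLYAI_API_KEY` environment variable (the name used by the Transcribe Live example in this README). The helper name `ReadApiKey` is hypothetical and not part of the SDK.

```csharp
using System;

// Hypothetical helper (not part of the SDK): returns the value of the given
// environment variable, or throws if it is missing or empty.
static string ReadApiKey(string variableName = "ASSEMBLYAI_API_KEY") =>
    Environment.GetEnvironmentVariable(variableName) is { Length: > 0 } value
        ? value
        : throw new InvalidOperationException(
            $"{variableName} environment variable is not set.");

try
{
    var apiKey = ReadApiKey();
    // Pass apiKey to the AssemblyAIClient constructor as in the usage example.
    Console.WriteLine("API key loaded.");
}
catch (InvalidOperationException ex)
{
    Console.Error.WriteLine(ex.Message);
}
```

Reading the key from the environment keeps it out of source control. Any other secret store would work equally well.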

Microsoft.Extensions.AI

The SDK implements ISpeechToTextClient:

using AssemblyAI;
using Microsoft.Extensions.AI;

ISpeechToTextClient speechClient = new AssemblyAIClient(apiKey);

await using var audioStream = File.OpenRead("recording.wav");
var response = await speechClient.GetTextAsync(audioStream);

Console.WriteLine(response.Text);

Transcribe

using var client = new AssemblyAIClient(apiKey);

var fileUrl = "https://github.com/AssemblyAI-Community/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3";

// You can also transcribe a local file by passing in a file path
// var filePath = "./path/to/file.mp3";
// var uploadedFile = await client.Files.UploadAsync();
// fileUrl = uploadedFile.UploadUrl;

Transcript transcript = await client.Transcripts.SubmitAsync(TranscriptParams.FromUrl(
    fileUrl,
    new TranscriptOptionalParams
    {
        SpeechModels = [],
        LanguageDetection = true,
        SpeakerLabels = true, // Identify speakers in your audio
        AutoHighlights = true, // Identify highlights in your audio
    }));

transcript.EnsureStatusCompleted();

Console.WriteLine(transcript);

Transcribe Live

Connect to the AssemblyAI v3 real-time streaming API for live speech-to-text transcription.

var apiKey =
    Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY") is { Length: > 0 } apiKeyValue
        ? apiKeyValue
        : throw new InvalidOperationException("ASSEMBLYAI_API_KEY environment variable is not found.");

// Create the realtime client and connect.
using var client = new AssemblyAIRealtimeClient();
await client.ConnectAsync(
    new Uri($"wss://streaming.assemblyai.com/v3/ws?api_key={apiKey}"));

// Receive the session started event.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
var receivedSessionBegins = false;

await foreach (var serverEvent in client.ReceiveUpdatesAsync(cts.Token))
{
    if (serverEvent.IsBegin)
    {
        receivedSessionBegins = true;
        Console.WriteLine($"Session started: {serverEvent.Begin?.Id}");
        Console.WriteLine($"Expires at: {serverEvent.Begin?.ExpiresAt}");

        // After session starts, send audio data.
        // In a real application, you would stream microphone PCM16 audio:
        // await client.SendAsync(audioBytes, WebSocketMessageType.Binary, true, cts.Token);

        // Update configuration for turn detection sensitivity.
        await client.SendUpdateConfigurationAsync(new UpdateConfigurationPayload
        {
            EndOfTurnConfidenceThreshold = 0.5,
            MaxTurnSilence = 2000,
        });

        // For this example, manually force an endpoint to get results.
        await client.SendForceEndpointAsync(new ForceEndpointPayload());
    }
    else if (serverEvent.IsTurn)
    {
        var turn = serverEvent.Turn!;
        Console.WriteLine($"Turn {turn.TurnOrder}: {turn.Transcript}");
        Console.WriteLine($"  End of turn: {turn.EndOfTurn}");
        Console.WriteLine($"  Confidence: {turn.EndOfTurnConfidence}");

        if (turn.EndOfTurn == true)
        {
            break;
        }
    }
    else if (serverEvent.IsTermination)
    {
        var termination = serverEvent.Termination!;
        Console.WriteLine($"Session terminated. Audio: {termination.AudioDurationSeconds}s, Session: {termination.SessionDurationSeconds}s");
        break;
    }
}

// Gracefully terminate the session.
await client.SendSessionTerminationAsync(new SessionTerminationPayload());
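
The example above leaves the audio-sending step commented out. Below is a minimal sketch of how raw PCM16 audio could be cut into fixed-duration frames before sending. It uses only the .NET base class library. The helper name `SplitPcm16`, the 16 kHz mono format, and the 50 ms frame length are illustrative assumptions, not SDK defaults or documented API requirements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the SDK): splits raw PCM16 mono audio
// into fixed-duration frames. The last frame may be shorter than the rest.
// sampleRate and frameMilliseconds are assumed values for illustration.
static IEnumerable<ArraySegment<byte>> SplitPcm16(
    byte[] pcm, int sampleRate = 16_000, int frameMilliseconds = 50)
{
    const int bytesPerSample = 2; // PCM16 stores each sample in 16 bits
    var frameBytes = sampleRate * frameMilliseconds / 1000 * bytesPerSample;
    if (frameBytes <= 0)
    {
        throw new ArgumentOutOfRangeException(
            nameof(frameMilliseconds), "Frame size must be positive.");
    }

    for (var offset = 0; offset < pcm.Length; offset += frameBytes)
    {
        var count = Math.Min(frameBytes, pcm.Length - offset);
        yield return new ArraySegment<byte>(pcm, offset, count);
    }
}

// One second of silence at 16 kHz mono: 16,000 samples * 2 bytes each.
var audio = new byte[32_000];
var frames = 0;
foreach (var frame in SplitPcm16(audio))
{
    // Each frame would be passed to the commented-out client.SendAsync call
    // from the streaming example, in place of audioBytes.
    frames++;
}
Console.WriteLine($"{frames} frames"); // prints "20 frames"
```

Whether `SendAsync` accepts an `ArraySegment<byte>` directly is an assumption. If it expects a `byte[]`, convert each frame first.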

Speech To Text Client Get Text Async

using var client = new AssemblyAIClient(apiKey);
ISpeechToTextClient speechClient = client;

// Transcribe audio using the MEAI ISpeechToTextClient interface.
// The client uploads the audio stream and polls until transcription is complete.
using var httpClient = new HttpClient();
await using var audioStream = await httpClient.GetStreamAsync(
    "https://github.com/AssemblyAI-Community/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3");

using var ms = new MemoryStream();
await audioStream.CopyToAsync(ms);
ms.Position = 0;

var response = await speechClient.GetTextAsync(ms);

Console.WriteLine($"Text: {response.Text}");

Speech To Text Client Get Service Metadata

using var client = new AssemblyAIClient("dummy-key");
ISpeechToTextClient speechClient = client;

// Retrieve metadata about the speech-to-text provider.
var metadata = speechClient.GetService<SpeechToTextClientMetadata>();

Speech To Text Client Get Service Self

using var client = new AssemblyAIClient("dummy-key");
ISpeechToTextClient speechClient = client;

// Access the underlying AssemblyAIClient from the MEAI interface.
var self = speechClient.GetService<AssemblyAIClient>();

Support

Priority place for bugs: https://github.com/tryAGI/AssemblyAI/issues
Priority place for ideas and general questions: https://github.com/tryAGI/AssemblyAI/discussions
Discord: https://discord.gg/Ca2xhfBf3v

Acknowledgments

This project is supported by JetBrains through the Open Source Support Program.