AssemblyAI

AssemblyAI has an official .NET SDK. We maintain this SDK primarily to improve the overall experience of our generated SDKs: the aim is to match or exceed the quality of the official one and to extend those improvements to all our generated SDKs for other platforms.

Features 🔥

Fully generated C# SDK based on the official AssemblyAI OpenAPI specification using OpenApiGenerator
Same-day updates to support new features
Updated and supported automatically if there are no breaking changes
All modern .NET features: nullability, trimming, NativeAOT, etc.
Supports .NET Framework/.NET Standard 2.0
Microsoft.Extensions.AI ISpeechToTextClient support

Usage

using AssemblyAI;

using var api = new AssemblyAIClient(apiKey);

var fileUrl = "https://github.com/AssemblyAI-Community/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3";

// You can also transcribe a local file by passing in a file path
// var filePath = "./path/to/file.mp3";
// var uploadedFile = await api.Files.UploadAsync(await File.ReadAllBytesAsync(filePath));
// fileUrl = uploadedFile.UploadUrl;

Transcript transcript = await api.Transcripts.SubmitAsync(new TranscriptParams
{
    AudioUrl = fileUrl,
    SpeechModels = [],
    LanguageDetection = true,
    SpeakerLabels = true, // Identify speakers in your audio
    AutoHighlights = true, // Identify highlights in your audio
});

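// SubmitAsync returns as soon as the job is queued, so poll Transcripts.GetAsync
// until the transcript reaches a terminal status before asserting completion.
transcript.EnsureStatusCompleted();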

Console.WriteLine(transcript);

Microsoft.Extensions.AI

The SDK implements ISpeechToTextClient:

using AssemblyAI;
using Microsoft.Extensions.AI;

ISpeechToTextClient speechClient = new AssemblyAIClient(apiKey);

await using var audioStream = File.OpenRead("recording.wav");
var response = await speechClient.GetTextAsync(audioStream);

Console.WriteLine(response.Text);

Transcribe

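// GetAuthenticatedApi() is not defined in this README; it presumably returns an
// AssemblyAIClient created with your API key, as in the Usage section.
using var client = GetAuthenticatedApi();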

var fileUrl = "https://github.com/AssemblyAI-Community/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3";

// You can also transcribe a local file by uploading bytes first:
// var apiKey = Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY")!;
// var uploaded = await client.Files.UploadAsync(apiKey, await File.ReadAllBytesAsync("./path/to/file.mp3"));
// fileUrl = uploaded.UploadUrl!;

var queued = await client.Transcripts.SubmitAsync(new TranscriptParams
{
    AudioUrl = fileUrl,
    SpeechModels = [],
    LanguageDetection = true,
    SpeakerLabels = true,
    AutoHighlights = true,
});

// Submit returns immediately; poll Transcripts.GetAsync until the status is Completed (or Error).
var transcript = await PollUntilTerminalAsync(client, queued.Id);

transcript.EnsureStatusCompleted();
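
The sample above calls PollUntilTerminalAsync without showing it. The following is a minimal sketch of the polling loop such a helper might use, written generically so it runs without the SDK. The names Polling and PollUntilAsync, and the fake status strings in the demo, are hypothetical and not part of this library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Demo with a fake status source that reports "completed" on the third call.
var calls = 0;
var result = await Polling.PollUntilAsync(
    fetch: _ => Task.FromResult(++calls < 3 ? "processing" : "completed"),
    isTerminal: status => status is "completed" or "error",
    interval: TimeSpan.FromMilliseconds(10));

Console.WriteLine($"{result} after {calls} calls"); // prints "completed after 3 calls"

// Hypothetical helper, not part of the SDK.
public static class Polling
{
    // Calls `fetch` repeatedly until `isTerminal` accepts the returned value,
    // waiting `interval` between attempts. Cancelling the token aborts the wait.
    public static async Task<T> PollUntilAsync<T>(
        Func<CancellationToken, Task<T>> fetch,
        Func<T, bool> isTerminal,
        TimeSpan interval,
        CancellationToken cancellationToken = default)
    {
        while (true)
        {
            var current = await fetch(cancellationToken).ConfigureAwait(false);
            if (isTerminal(current))
            {
                return current;
            }

            await Task.Delay(interval, cancellationToken).ConfigureAwait(false);
        }
    }
}
```

With the SDK, fetch would wrap the Transcripts.GetAsync call for queued.Id and isTerminal would check for the Completed or Error status. The exact signature and status type are not shown in this README, so check the generated types before wiring it up. Passing a token from a CancellationTokenSource with a timeout keeps the loop from running indefinitely.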

Transcribe Live

Connect to the AssemblyAI v3 real-time streaming API for live speech-to-text transcription.

var apiKey =
    Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY") is { Length: > 0 } apiKeyValue
        ? apiKeyValue
        : throw new InvalidOperationException("ASSEMBLYAI_API_KEY environment variable is not set.");

// Create the realtime client and connect with API-key auth (Authorization header).
using var client = new AssemblyAIRealtimeClient();
await client.ConnectAsync(apiKey, new StreamingConnectOptions
{
    FormatTurns = true,
});

// Receive the session started event.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
var receivedSessionBegins = false;

await foreach (var serverEvent in client.ReceiveUpdatesAsync(cts.Token))
{
    if (serverEvent.IsBegin)
    {
        receivedSessionBegins = true;
        Console.WriteLine($"Session started: {serverEvent.Begin?.Id}");
        Console.WriteLine($"Expires at: {serverEvent.Begin?.ExpiresAt}");

        // After the session starts, send audio data.
        // In a real application, you would stream microphone PCM16 audio:
        // await client.SendAsync(new ArraySegment<byte>(audioBytes), WebSocketMessageType.Binary, true, cts.Token);

        // Update configuration for turn detection sensitivity.
        await client.SendUpdateConfigurationAsync(new UpdateConfigurationPayload
        {
            EndOfTurnConfidenceThreshold = 0.5,
            MaxTurnSilence = 2000,
        });

        // For this example, manually force an endpoint to get results.
        await client.SendForceEndpointAsync(new ForceEndpointPayload());
        break;
    }
}

// Gracefully terminate the session.
await client.SendSessionTerminationAsync(new SessionTerminationPayload());
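
The commented-out SendAsync line in the sample sends a single buffer of PCM16 audio. A real stream is usually sent in small frames, so here is a minimal sketch of splitting a buffer into fixed-duration frames. It assumes mono 16-bit PCM; the 16 kHz sample rate and 50 ms frame length in the demo are illustrative assumptions, and Pcm16 and SplitIntoFrames are hypothetical names, not SDK members.

```csharp
using System;
using System.Collections.Generic;

// Demo: 4000 bytes of silence at 16 kHz split into 50 ms frames (1600 bytes each).
var audio = new byte[4000];
var frames = new List<ArraySegment<byte>>(
    Pcm16.SplitIntoFrames(audio, sampleRate: 16000, frameMilliseconds: 50));

Console.WriteLine($"{frames.Count} frames, last has {frames[frames.Count - 1].Count} bytes");
// prints "3 frames, last has 800 bytes"

// Hypothetical helper, not part of the SDK.
public static class Pcm16
{
    private const int BytesPerSample = 2; // 16-bit mono samples

    // Lazily yields views over `audio`, each covering `frameMilliseconds` of sound.
    // The final frame may be shorter. No bytes are copied.
    public static IEnumerable<ArraySegment<byte>> SplitIntoFrames(
        byte[] audio, int sampleRate, int frameMilliseconds)
    {
        if (audio is null) throw new ArgumentNullException(nameof(audio));

        var bytesPerFrame = sampleRate * frameMilliseconds / 1000 * BytesPerSample;
        if (bytesPerFrame <= 0)
        {
            throw new ArgumentOutOfRangeException(
                nameof(frameMilliseconds), "A frame must contain at least one sample.");
        }

        for (var offset = 0; offset < audio.Length; offset += bytesPerFrame)
        {
            var count = Math.Min(bytesPerFrame, audio.Length - offset);
            yield return new ArraySegment<byte>(audio, offset, count);
        }
    }
}
```

Each frame could then be passed to the client.SendAsync(frame, WebSocketMessageType.Binary, true, cts.Token) call from the commented-out line above, inside the IsBegin branch and before the session is terminated.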

Speech To Text Client Get Text Async

using var client = GetAuthenticatedApi();
ISpeechToTextClient speechClient = client;

// Transcribe audio using the MEAI ISpeechToTextClient interface.
// The client uploads the audio stream and polls until transcription is complete.
using var httpClient = new HttpClient();
await using var audioStream = await httpClient.GetStreamAsync(
    "https://github.com/AssemblyAI-Community/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3");

// Buffer the download in memory and rewind it before transcription.
using var ms = new MemoryStream();
await audioStream.CopyToAsync(ms);
ms.Position = 0;

var response = await speechClient.GetTextAsync(ms);

Console.WriteLine($"Text: {response.Text}");

Speech To Text Client Get Service Metadata

using var client = new AssemblyAIClient("dummy-key");
ISpeechToTextClient speechClient = client;

// Retrieve metadata about the speech-to-text provider.
var metadata = speechClient.GetService<SpeechToTextClientMetadata>();

Speech To Text Client Get Service Self

using var client = new AssemblyAIClient("dummy-key");
ISpeechToTextClient speechClient = client;

// Access the underlying AssemblyAIClient from the MEAI interface.
var self = speechClient.GetService<AssemblyAIClient>();

Support

Priority place for bugs: https://github.com/tryAGI/AssemblyAI/issues
Priority place for ideas and general questions: https://github.com/tryAGI/AssemblyAI/discussions
Discord: https://discord.gg/Ca2xhfBf3v

Acknowledgments

This project is supported by JetBrains through the Open Source Support Program.