The Speechmatics SDK implements ISpeechToTextClient and provides AIFunction tool wrappers, all compatible with Microsoft.Extensions.AI.
Installation
    dotnet add package tryAGI.Speechmatics
ISpeechToTextClient
The SpeechmaticsClient implements ISpeechToTextClient for batch speech-to-text transcription.
It uploads audio, submits a transcription job, and polls until completion. Supports 55+ languages.
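The submit-then-poll flow can be pictured with a small generic helper. This is only a sketch of the pattern, not the SDK's internal code: `PollUntilAsync`, `FakeGetStatus`, the status strings, and the 10 ms interval are hypothetical names and values chosen for illustration, and no Speechmatics API is called.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the Speechmatics SDK): calls `fetch`
// repeatedly until `isDone` accepts the result, waiting `interval` between tries.
static async Task<T> PollUntilAsync<T>(
    Func<CancellationToken, Task<T>> fetch,
    Func<T, bool> isDone,
    TimeSpan interval,
    CancellationToken cancellationToken = default)
{
    while (true)
    {
        var current = await fetch(cancellationToken);
        if (isDone(current))
        {
            return current;
        }

        await Task.Delay(interval, cancellationToken);
    }
}

// Simulated status source standing in for a real "get job status" call:
// it reports "running" twice and "done" on the third call.
var calls = 0;
Task<string> FakeGetStatus(CancellationToken _) =>
    Task.FromResult(++calls < 3 ? "running" : "done");

var finalStatus = await PollUntilAsync<string>(
    FakeGetStatus,
    status => status == "done",
    TimeSpan.FromMilliseconds(10));

Console.WriteLine($"{finalStatus} after {calls} calls"); // done after 3 calls
```

Passing a CancellationToken through both the fetch and the delay lets a caller abandon a long-running job without waiting for the next poll.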
File-Based Transcription
Transcribe an audio file to text. The client uploads the audio via multipart form data, creates a batch job, and polls until the transcript is ready:
The raw Speechmatics transcript is available via RawRepresentation for advanced access to recognition results, speaker diarization, timestamps, and other provider-specific features:
    ISpeechToTextClient sttClient = client;

    using var audioStream = File.OpenRead("audio.mp3");
    var response = await sttClient.GetTextAsync(audioStream);
    Console.WriteLine(response.Text);

    // Access full Speechmatics transcript for recognition-level detail
    var raw = (TranscriptResponse)response.RawRepresentation!;
    Console.WriteLine($"Format: {raw.Format}");
    Console.WriteLine($"Results: {raw.Results?.Count}");

    // Iterate individual recognition results for word-level timestamps and confidence
    if (raw.Results is { Count: > 0 })
    {
        foreach (var result in raw.Results)
        {
            if (result.Alternatives is { Count: > 0 })
            {
                var best = result.Alternatives[0];
                Console.WriteLine(
                    $" [{result.StartTime:F2}s - {result.EndTime:F2}s] " +
                    $"{best.Content} (confidence: {best.Confidence:P0}, speaker: {best.Speaker})");
            }
        }
    }
Advanced Configuration with RawRepresentationFactory
The Speechmatics batch API supports advanced features like diarization, summarization, translation, and custom vocabulary through the transcription config JSON. While the standard MEAI interface handles basic transcription, you can use RawRepresentationFactory to configure these features when submitting jobs directly via the client:
    // For advanced configuration, use the SpeechmaticsClient directly
    var speechmaticsClient = sttClient.GetService<SpeechmaticsClient>()!;

    // Submit a job with diarization, summarization, and custom vocabulary
    var configJson = """
        {
          "type": "transcription",
          "transcription_config": {
            "language": "en",
            "diarization": "speaker",
            "additional_vocab": [
              { "content": "Speechmatics", "sounds_like": ["speech matics"] }
            ]
          },
          "summarization_config": {},
          "sentiment_analysis_config": {}
        }
        """;

    using var audioStream = File.OpenRead("meeting.mp3");
    using var ms = new MemoryStream();
    await audioStream.CopyToAsync(ms);

    var job = await speechmaticsClient.CreateJobsAsync(
        config: configJson,
        dataFile: ms.ToArray(),
        dataFilename: "meeting.mp3");

    Console.WriteLine($"Job submitted: {job.Id}");
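If the configuration is assembled at run time (for example, a vocabulary list loaded from elsewhere), hand-writing the JSON string can become error-prone. One possible alternative, shown here as a sketch, is to build the same structure with anonymous objects and `System.Text.Json`. The property names mirror the JSON keys used above; whether this fits your setup is an assumption, and the snippet only produces the string, it does not submit a job.

```csharp
using System;
using System.Text.Json;

// Anonymous objects keep the snake_case keys exactly as declared,
// so the serialized output matches the hand-written config above.
var config = new
{
    type = "transcription",
    transcription_config = new
    {
        language = "en",
        diarization = "speaker",
        additional_vocab = new[]
        {
            new { content = "Speechmatics", sounds_like = new[] { "speech matics" } },
        },
    },
    summarization_config = new { },       // serializes as {}
    sentiment_analysis_config = new { },  // serializes as {}
};

var configJson = JsonSerializer.Serialize(config);
Console.WriteLine(configJson);
```

The resulting `configJson` string can then be passed wherever the raw string literal was used.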
Streaming Behavior
GetStreamingTextAsync delegates to the non-streaming GetTextAsync method internally. The batch transcription job (submit + poll + get transcript) runs to completion first, and then the full result is yielded as a single TextUpdated update bracketed by SessionOpen and SessionClose events.
This means you will not receive incremental transcription updates as audio is processed. The entire transcript is returned at once after the job finishes. For most use cases, calling GetTextAsync directly is equivalent and simpler.
Note
Speechmatics does offer a real-time streaming WebSocket API, but it is not exposed through the MEAI ISpeechToTextClient interface. Use the SpeechmaticsClient directly for real-time streaming needs.
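To make the single-update behavior concrete, here is a self-contained simulation of a batch call wrapped as a stream. It does not use the SDK or Microsoft.Extensions.AI types: `FakeGetTextAsync`, `FakeGetStreamingTextAsync`, the tuple shape, and the sample text are all hypothetical stand-ins, and the kind names are plain strings borrowed from the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Stand-in for the batch call (submit + poll + fetch transcript).
static async Task<string> FakeGetTextAsync()
{
    await Task.Delay(10); // pretend the batch job is running
    return "hello world";
}

// Hypothetical wrapper: waits for the whole batch result, then yields it
// as exactly one text update between an open and a close marker.
static async IAsyncEnumerable<(string Kind, string Text)> FakeGetStreamingTextAsync()
{
    var text = await FakeGetTextAsync(); // the job completes before anything is yielded
    yield return ("SessionOpen", "");
    yield return ("TextUpdated", text);
    yield return ("SessionClose", "");
}

var kinds = new List<string>();
await foreach (var update in FakeGetStreamingTextAsync())
{
    kinds.Add(update.Kind);
    Console.WriteLine(update.Text.Length == 0
        ? update.Kind
        : $"{update.Kind}: {update.Text}");
}
```

A consumer of such a stream sees three items in total, with the full transcript arriving in the middle one, which is why calling the non-streaming method is usually simpler.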
Accessing the Underlying Client
Retrieve the SpeechmaticsClient from the MEAI interface:
    ISpeechToTextClient sttClient = client;

    var metadata = sttClient.GetService<SpeechToTextClientMetadata>();
    Console.WriteLine($"Provider: {metadata?.ProviderName}"); // "speechmatics"

    var speechmaticsClient = sttClient.GetService<SpeechmaticsClient>();
    // Use speechmaticsClient for Speechmatics-specific APIs (usage stats, job management, etc.)
The AIFunction tool wrappers can be passed to any IChatClient through ChatOptions.Tools. The following example registers the URL transcription tool and loops until the model stops requesting tool calls:

    using Microsoft.Extensions.AI;
    using Speechmatics;

    var client = new SpeechmaticsClient(
        apiKey: Environment.GetEnvironmentVariable("SPEECHMATICS_API_KEY")!);

    var options = new ChatOptions
    {
        Tools = [client.AsTranscribeUrlTool()],
    };

    IChatClient chatClient = /* your chat client */;

    var messages = new List<ChatMessage>
    {
        new(ChatRole.User, "Transcribe the audio at https://example.com/podcast.mp3"),
    };

    while (true)
    {
        var response = await chatClient.GetResponseAsync(messages, options);
        messages.AddRange(response.ToChatMessages());

        // The model asked for a tool call: run it and feed the results back
        if (response.FinishReason == ChatFinishReason.ToolCalls)
        {
            var results = await response.CallToolsAsync(options);
            messages.AddRange(results);
            continue;
        }

        Console.WriteLine(response.Text);
        break;
    }