The full AssemblyAI transcript is available via RawRepresentation for speaker labels, sentiment analysis, entities, chapters, and word-level timestamps:
```csharp
ISpeechToTextClient sttClient = client;

var response = await sttClient.GetTextAsync(new SpeechToTextOptions
{
    AudioData = new DataContent(audioBytes, "audio/mpeg"),
});

// Access the full AssemblyAI Transcript object
var raw = (Transcript)response.RawRepresentation!;
Console.WriteLine($"Confidence: {raw.Confidence}");
Console.WriteLine($"Audio Duration: {raw.AudioDuration}s");

// Access word-level timestamps
if (raw.Words is { Count: > 0 })
{
    foreach (var word in raw.Words)
    {
        Console.WriteLine($"  [{word.Start}ms - {word.End}ms] {word.Text} " +
            $"(confidence: {word.Confidence:P0}, speaker: {word.Speaker})");
    }
}

// Access utterances (when speaker_labels is enabled)
if (raw.Utterances is { Count: > 0 })
{
    foreach (var utterance in raw.Utterances)
    {
        Console.WriteLine($"  Speaker {utterance.Speaker}: {utterance.Text}");
    }
}
```
## Streaming Behavior
GetStreamingTextAsync delegates to the non-streaming GetTextAsync method internally. The batch transcription job (upload + submit + poll) runs to completion first, and then the full result is converted to SpeechToTextResponseUpdate events using ToSpeechToTextResponseUpdates().
This means you will not receive incremental transcription updates as audio is processed. The entire transcript is returned at once after the job finishes. For most use cases, calling GetTextAsync directly is equivalent and simpler.
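To illustrate, here is a minimal sketch of consuming GetStreamingTextAsync with this provider. It assumes the same options-based call shape as the GetTextAsync example above; the key point is that the updates only begin arriving after the batch job has already finished.

```csharp
// Sketch: the streaming call wraps the completed batch job, so all
// SpeechToTextResponseUpdate values arrive together at the end rather
// than incrementally while audio is processed.
await foreach (var update in sttClient.GetStreamingTextAsync(new SpeechToTextOptions
{
    AudioData = new DataContent(audioBytes, "audio/mpeg"),
}))
{
    Console.Write(update.Text);
}
```

Because of this, prefer GetTextAsync unless you need to satisfy an abstraction that requires the streaming method.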
> **Note**
>
> AssemblyAI does offer a real-time streaming WebSocket API via the AssemblyAIRealtimeClient (in the AssemblyAI.Realtime namespace), but it is not exposed through the MEAI ISpeechToTextClient interface. Use AssemblyAIRealtimeClient directly for real-time streaming needs.
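As a rough illustration of the direct-usage path, the sketch below shows the general shape of a WebSocket-based real-time client. The member names used here (ConnectAsync, SendAudioAsync, PartialTranscriptReceived, CloseAsync) are hypothetical placeholders, not the actual AssemblyAIRealtimeClient surface; consult that type's own documentation for the real API.

```csharp
// Hypothetical sketch only — method and event names are illustrative,
// not the actual AssemblyAIRealtimeClient members.
await using var realtime = new AssemblyAIRealtimeClient(apiKey);

// Real-time clients typically surface partial results via events,
// rather than a single awaited response.
realtime.PartialTranscriptReceived += transcript =>
    Console.WriteLine($"partial: {transcript.Text}");

await realtime.ConnectAsync();

// Audio is pushed in small PCM chunks as it is captured.
await realtime.SendAudioAsync(audioChunk);

await realtime.CloseAsync();
```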