Deferred Completions
Deferred completions allow you to submit chat completion requests for asynchronous processing. This is useful for non-time-sensitive workloads, batch-like processing at lower priority, or when you want to queue many requests and retrieve results later.
Quick Start
How It Works
- Submit a chat completion request with deferred: true
- The API returns immediately with a request ID (no choices yet)
- Poll using the request ID until results are ready
- Retrieve the completed response with choices populated
Manual Polling
For full control over the polling process:
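The submit, poll, and retrieve steps can be sketched as a small loop. This is a minimal illustration, not the SDK's actual surface: `IDeferredChatApi`, `SubmitDeferredAsync`, `TryGetResultAsync`, `DeferredResponse`, and the in-memory `FakeDeferredChatApi` are all hypothetical names introduced here so the example runs without network access. Substitute the real client calls from your SDK version.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical response shape: a request ID plus the populated choices.
public sealed record DeferredResponse(string RequestId, IReadOnlyList<string> Choices);

// Hypothetical stand-in for the real client; not the SDK's actual interface.
public interface IDeferredChatApi
{
    // Submits a request with the deferred flag set and returns the request ID.
    Task<string> SubmitDeferredAsync(string prompt, CancellationToken ct = default);

    // Returns null while the request is still being processed.
    Task<DeferredResponse?> TryGetResultAsync(string requestId, CancellationToken ct = default);
}

// In-memory fake that reports "pending" for a fixed number of polls per request.
public sealed class FakeDeferredChatApi : IDeferredChatApi
{
    private readonly int _pendingPolls;
    private readonly Dictionary<string, (string Prompt, int Polls)> _jobs = new();

    public FakeDeferredChatApi(int pendingPolls) => _pendingPolls = pendingPolls;

    public Task<string> SubmitDeferredAsync(string prompt, CancellationToken ct = default)
    {
        var id = $"req-{_jobs.Count + 1}";
        _jobs[id] = (prompt, 0);
        return Task.FromResult(id);
    }

    public Task<DeferredResponse?> TryGetResultAsync(string requestId, CancellationToken ct = default)
    {
        var (prompt, polls) = _jobs[requestId];
        _jobs[requestId] = (prompt, polls + 1);
        DeferredResponse? result = polls < _pendingPolls
            ? null
            : new DeferredResponse(requestId, new[] { $"echo: {prompt}" });
        return Task.FromResult(result);
    }
}

public static class ManualPolling
{
    // Polls until a non-null response arrives; cancellation is the only exit besides success.
    public static async Task<DeferredResponse> PollUntilReadyAsync(
        IDeferredChatApi api, string requestId, TimeSpan interval, CancellationToken ct = default)
    {
        while (true)
        {
            var response = await api.TryGetResultAsync(requestId, ct);
            if (response is not null) return response;
            await Task.Delay(interval, ct);
        }
    }
}
```

Calling `SubmitDeferredAsync` and passing the returned ID to `PollUntilReadyAsync` walks through the full lifecycle. Because this loop has no built-in deadline, pass a `CancellationToken` backed by a `CancellationTokenSource` with a timeout when you need an upper bound on the wait.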
CreateDeferredAndWaitAsync Options
The CreateDeferredAndWaitAsync extension method handles submission and polling in one call:
| Parameter | Type | Default | Description |
|---|---|---|---|
| request | CreateChatCompletionRequest | required | The chat completion request (deferred flag is set automatically) |
| pollingInterval | TimeSpan? | 5 seconds | How often to check for results |
| timeout | TimeSpan? | 10 minutes | Maximum total wait time |
| cancellationToken | CancellationToken | default | Cancellation token |
The method throws:
- TimeoutException if the result is not ready within the timeout period
- InvalidOperationException if the initial submission does not return a request ID
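To make the defaults and failure modes concrete, here is a rough sketch of how a submit-and-wait helper with these semantics could be written. It is not the SDK's implementation of CreateDeferredAndWaitAsync: `DeferredWaitSketch`, `SubmitAndWaitAsync`, and the two delegates (`submit`, which returns the request ID, and `poll`, which returns null while the result is pending) are hypothetical. The check for the deadline only after each poll is also an assumption.

```csharp
#nullable enable
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class DeferredWaitSketch
{
    // Defaults mirror the parameter table: 5-second interval, 10-minute timeout.
    private static readonly TimeSpan DefaultInterval = TimeSpan.FromSeconds(5);
    private static readonly TimeSpan DefaultTimeout = TimeSpan.FromMinutes(10);

    public static async Task<T> SubmitAndWaitAsync<T>(
        Func<CancellationToken, Task<string?>> submit,    // hypothetical: returns the request ID
        Func<string, CancellationToken, Task<T?>> poll,   // hypothetical: null while pending
        TimeSpan? pollingInterval = null,
        TimeSpan? timeout = null,
        CancellationToken cancellationToken = default) where T : class
    {
        var interval = pollingInterval ?? DefaultInterval;
        var limit = timeout ?? DefaultTimeout;
        var elapsed = Stopwatch.StartNew(); // monotonic clock for the total wait

        var requestId = await submit(cancellationToken);
        if (string.IsNullOrEmpty(requestId))
            throw new InvalidOperationException("Submission did not return a request ID.");

        while (true)
        {
            var result = await poll(requestId, cancellationToken);
            if (result is not null) return result;

            // Assumption: the deadline is checked after each unsuccessful poll.
            if (elapsed.Elapsed >= limit)
                throw new TimeoutException(
                    $"Deferred request {requestId} was not ready within {limit}.");

            await Task.Delay(interval, cancellationToken);
        }
    }
}
```

In this sketch a cancelled token surfaces as an OperationCanceledException from `Task.Delay` or from the delegates, which keeps cancellation distinguishable from a timeout. Whether the real helper behaves the same way is not stated in this page.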
Deferred vs. Batch API
| Feature | Deferred Completions | Batch API |
|---|---|---|
| Scope | Single request | Multiple requests |
| Submission | deferred: true parameter | Create batch + add requests |
| Polling | Per-request ID | Per-batch ID |
| Use case | Individual async requests | Bulk processing |
| Helper | CreateDeferredAndWaitAsync | Manual polling |
For processing many requests together, consider using the Batch API instead.