# Audio

For audio workloads, SambaNova serves OpenAI’s Whisper large-v3 model, which supports real-time transcription and translation of speech.

## Whisper-Large-v3

* **Model**: Whisper-Large-v3
* **Description**: State-of-the-art automatic speech recognition (ASR) and translation model. Developed by OpenAI and trained on 5M+ hours of labeled audio. Excels in multilingual and zero-shot speech tasks across diverse domains.
* **Model ID**: `Whisper-Large-v3`
* **Supported languages**: Multilingual

### Core capabilities

* Transcribes and translates extended audio inputs (up to 25 MB).
* Demonstrates high accuracy in speech recognition and translation tasks.
* Provides OpenAI-compatible endpoints for transcriptions and translations.
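
As a rough illustration of the OpenAI-compatible endpoints mentioned above, the sketch below posts an audio file as multipart form data. The base URL `https://api.sambanova.ai/v1`, the `/audio/<task>` path layout, the `Bearer` authorization header, and the `SAMBANOVA_BASE_URL` environment variable are assumptions borrowed from the OpenAI convention, not details confirmed on this page. The helper names `audio_endpoint` and `run_audio_task` are hypothetical, and the third-party `requests` package is assumed to be installed.

```python
import os

# Assumption: base URL and path layout follow the OpenAI audio API convention.
BASE_URL = os.environ.get("SAMBANOVA_BASE_URL", "https://api.sambanova.ai/v1")


def audio_endpoint(task: str) -> str:
    """Return the URL for the `transcriptions` or `translations` endpoint."""
    if task not in ("transcriptions", "translations"):
        raise ValueError(f"unknown audio task: {task!r}")
    return f"{BASE_URL.rstrip('/')}/audio/{task}"


def run_audio_task(task: str, audio_path: str, api_key: str, **fields) -> str:
    """POST an audio file as multipart form data and return the response body."""
    # Third-party dependency, imported lazily so audio_endpoint() stays stdlib-only.
    import requests

    # `model` and `response_format` come from the request parameters table;
    # extra keyword arguments (e.g. language="en") are passed through as form fields.
    form = {"model": "Whisper-Large-v3", "response_format": "text", **fields}
    with open(audio_path, "rb") as audio:
        response = requests.post(
            audio_endpoint(task),
            headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
            data=form,
            files={"file": audio},
            timeout=120,
        )
    response.raise_for_status()
    return response.text
```

A call such as `run_audio_task("transcriptions", "meeting.mp3", api_key, language="en")` would then return the plain-text transcript, provided the assumptions above hold for your deployment.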

### Request parameters

| Parameter         | Type    | Description                                                                                                                      | Default  | Endpoints                        |
| :---------------- | :------ | :------------------------------------------------------------------------------------------------------------------------------- | :------- | :------------------------------- |
| `model`           | String  | The ID of the model to use.                                                                                                      | Required | `transcriptions`, `translations` |
| `file`            | File    | Audio file in FLAC, MP3, MP4, MPEG, MPGA, M4A, Ogg, WAV, or WebM format. File size limit: 25 MB.                                 | Required | `transcriptions`, `translations` |
| `prompt`          | String  | Prompt to influence transcription style or vocabulary. Example: "Please transcribe carefully, including pauses and hesitations." | Optional | `transcriptions`, `translations` |
| `response_format` | String  | Output format: either `json` or `text`.                                                                                          | `json`   | `transcriptions`, `translations` |
| `language`        | String  | Language of the input audio. Supplying an ISO 639-1 code (e.g., `en`) improves accuracy and latency.                             | Optional | `transcriptions`, `translations` |
| `stream`          | Boolean | Enables streaming responses.                                                                                                     | `false`  | `transcriptions`, `translations` |
| `stream_options`  | Object  | Additional streaming configuration (e.g., `{"include_usage": true}`).                                                            | Optional | `transcriptions`, `translations` |
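
The constraints in the table can be checked locally before uploading. The sketch below is one possible pre-flight check, not part of any SambaNova SDK: `check_audio_request` is a hypothetical helper, the 25 MB limit is read as 25 MiB (an assumption), and the `language` test only checks for the shape of a two-letter ISO 639-1 code rather than its validity. Streaming options are left out because the table does not describe the streamed response format.

```python
import os
from typing import List, Optional

MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" interpreted as 25 MiB
ALLOWED_EXTENSIONS = {
    ".flac", ".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".ogg", ".wav", ".webm",
}
ALLOWED_RESPONSE_FORMATS = {"json", "text"}


def check_audio_request(
    filename: str,
    size_bytes: int,
    response_format: str = "json",
    language: Optional[str] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means the request looks valid."""
    problems = []

    # File format is inferred from the extension only, not from the file contents.
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file extension: {extension or '(none)'}")

    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes; the limit is {MAX_FILE_BYTES}")

    if response_format not in ALLOWED_RESPONSE_FORMATS:
        problems.append(f"response_format must be 'json' or 'text', got {response_format!r}")

    # Advisory shape check: two lowercase letters, as in `en`.
    if language is not None and not (
        len(language) == 2 and language.isalpha() and language.islower()
    ):
        problems.append(f"language {language!r} does not look like an ISO 639-1 code")

    return problems
```

For a file on disk, `check_audio_request(path, os.path.getsize(path), language="en")` gives a quick answer before any bytes are sent.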