# SambaStack models

SambaStack supports a variety of models that can be deployed in both on-premises and hosted environments. To find out which models are available on your deployment, contact your system administrator. You can also use the [Model list API command](/api-reference/endpoints/model-list) to see which models are deployed and available to you.
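As a rough illustration of querying the model list programmatically, the sketch below uses only the Python standard library. It assumes an OpenAI-style `GET {base_url}/models` endpoint that returns a JSON object with a `data` array of entries carrying an `id` field. The path, the response shape, and the environment variable names `SAMBASTACK_BASE_URL` and `SAMBASTACK_API_KEY` are assumptions made for this example, so check the Model list API reference for the exact contract of your deployment.

```python
import json
import os
import urllib.request


def extract_model_ids(payload: dict) -> list[str]:
    """Return model IDs from an OpenAI-style model list payload (assumed shape)."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_models(base_url: str, api_key: str) -> list[str]:
    """Fetch the model list from a deployment.

    The "/models" path and Bearer authentication are assumptions; adjust them
    to match the Model list API reference for your deployment.
    """
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_model_ids(json.load(response))


if __name__ == "__main__":
    # Hypothetical environment variable names; substitute your own values.
    base_url = os.environ.get("SAMBASTACK_BASE_URL")
    api_key = os.environ.get("SAMBASTACK_API_KEY")
    if base_url and api_key:
        for model_id in list_models(base_url, api_key):
            print(model_id)
```

Keeping the parsing in `extract_model_ids` separate from the network call makes it easy to adapt if your deployment returns a different response shape.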

## Deployment options

When deploying a model in SambaStack, administrators can select from the context length and batch size combinations that the model supports. The batch size involves a trade-off:

* Smaller batch sizes provide higher token throughput (tokens/second) for each request.
* Larger batch sizes provide better concurrency when multiple users send requests at the same time.

## Supported models

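The table below lists the supported models with their context lengths, batch sizes, and features. In the context length column, each entry gives a context length followed by its supported batch sizes in parentheses. For example, `16K (1,2,4)` means that a 16K context length can be deployed with a batch size of 1, 2, or 4.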

| Developer/Model ID                    | Type            | Context length (batch size)                                                                                                                                                  | Features and optimizations                                                                                                                                                                                          | View on Hugging Face                                                                   |
| :------------------------------------ | :-------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | :------------------------------------------------------------------------------------- |
| **Meta**                              |                 |                                                                                                                                                                              |                                                                                                                                                                                                                     |                                                                                        |
| `Meta-Llama-3.3-70B-Instruct`         | Text            | <details><summary>View</summary><ul><li>4K (1,2,4,8,16,32)</li><li>8K (1,2,4,8)</li><li>16K (1,2,4)</li><li>32K (1,2,4)</li><li>64K (1)</li><li>128K (1)</li></ul></details> | <details><summary>View</summary><ul><li>Endpoint: Chat completions</li><li>Capabilities: Function calling, JSON mode</li><li>Import checkpoint: Yes</li><li>Optimizations: Speculative decoding</li></ul></details> | [Model card](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct)                 |
| `Meta-Llama-3.1-8B-Instruct`          | Text            | <details><summary>View</summary><ul><li>4K (1,2,4,8)</li><li>8K (1,2,4,8)</li><li>16K (1,2,4)</li></ul></details>                                                            | <details><summary>View</summary><ul><li>Endpoint: Chat completions</li><li>Capabilities: Function calling, JSON mode</li><li>Import checkpoint: Yes</li><li>Optimizations: None</li></ul></details>                 | [Model card](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)                  |
| `Llama-4-Maverick-17B-128E-Instruct`  | Image, Text     | <details><summary>View</summary><ul><li>4K (1,4)</li><li>8K (1)</li></ul></details>                                                                                          | <details><summary>View</summary><ul><li>Endpoint: Chat completions</li><li>Capabilities: Function calling, JSON mode</li><li>Import checkpoint: No</li><li>Optimizations: None</li></ul></details>                  | [Model card](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct)     |
| **DeepSeek**                          |                 |                                                                                                                                                                              |                                                                                                                                                                                                                     |                                                                                        |
| `DeepSeek-R1-0528`                    | Reasoning, Text | <details><summary>View</summary><ul><li>4K (4)</li><li>8K (1)</li><li>16K (1)</li><li>32K (1)</li></ul></details>                                                            | <details><summary>View</summary><ul><li>Endpoint: Chat completions</li><li>Capabilities: Function calling, JSON mode</li><li>Import checkpoint: No</li><li>Optimizations: None</li></ul></details>                  | [Model card](https://huggingface.co/deepseek-ai/DeepSeek-R1)                           |
| `DeepSeek-R1-Distill-Llama-70B`       | Reasoning, Text | <details><summary>View</summary><ul><li>4K (1,2,4,8,16,32)</li><li>8K (1,2,4,8)</li><li>16K (1,2,4)</li><li>32K (1,2,4)</li><li>64K (1)</li><li>128K (1)</li></ul></details> | <details><summary>View</summary><ul><li>Endpoint: Chat completions</li><li>Capabilities: None</li><li>Import checkpoint: Yes</li><li>Optimizations: Speculative decoding</li></ul></details>                        | [Model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B)         |
| `DeepSeek-V3-0324`                    | Text            | <details><summary>View</summary><ul><li>4K (4)</li><li>8K (1)</li><li>16K (1)</li><li>32K (1)</li></ul></details>                                                            | <details><summary>View</summary><ul><li>Endpoint: Chat completions</li><li>Capabilities: Function calling, JSON mode</li><li>Import checkpoint: No</li><li>Optimizations: None</li></ul></details>                  | [Model card](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324)                      |
| `DeepSeek-V3.1`                       | Reasoning, Text | <details><summary>View</summary><ul><li>4K (4)</li><li>8K (1)</li><li>16K (1)</li><li>32K (1)</li></ul></details>                                                            | <details><summary>View</summary><ul><li>Endpoint: Chat completions</li><li>Capabilities: Function calling, JSON mode</li><li>Import checkpoint: No</li><li>Optimizations: None</li></ul></details>                  | [Model card](https://huggingface.co/deepseek-ai/DeepSeek-V3.1)                         |
| **OpenAI**                            |                 |                                                                                                                                                                              |                                                                                                                                                                                                                     |                                                                                        |
| `Whisper-Large-v3`                    | Audio           | <details><summary>View</summary><ul><li>4K (1,16,32)</li></ul></details>                                                                                                     | <details><summary>View</summary><ul><li>Endpoint: Translation, Transcription</li><li>Capabilities: None</li><li>Import checkpoint: No</li><li>Optimizations: None</li></ul></details>                               | [Model card](https://huggingface.co/openai/whisper-large-v3)                           |
| **Qwen**                              |                 |                                                                                                                                                                              |                                                                                                                                                                                                                     |                                                                                        |
| `Qwen3-32B`                           | Reasoning, Text | <details><summary>View</summary><ul><li>8K (1)</li></ul></details>                                                                                                           | <details><summary>View</summary><ul><li>Endpoint: Chat completions</li><li>Capabilities: None</li><li>Import checkpoint: No</li><li>Optimizations: None</li></ul></details>                                         | [Model card](https://huggingface.co/Qwen/Qwen3-32B)                                    |
| **Tokyotech-llm**                     |                 |                                                                                                                                                                              |                                                                                                                                                                                                                     |                                                                                        |
| `Llama-3.3-Swallow-70B-Instruct-v0.4` | Text            | <details><summary>View</summary><ul><li>4K (1,2,4,8,16)</li><li>8K (1,2,4,8,16)</li><li>16K (1,2,4)</li><li>32K (1,2,4)</li><li>64K (1)</li><li>128K (1)</li></ul></details> | <details><summary>View</summary><ul><li>Endpoint: Chat completions</li><li>Capabilities: None</li><li>Import checkpoint: No</li><li>Optimizations: Speculative decoding</li></ul></details>                         | [Model card](https://huggingface.co/tokyotech-llm/Llama-3.3-Swallow-70B-Instruct-v0.4) |
| **Other**                             |                 |                                                                                                                                                                              |                                                                                                                                                                                                                     |                                                                                        |
| `E5-Mistral-7B-Instruct`              | Embedding       | <details><summary>View</summary><ul><li>4K (1,2,4,8,16,32)</li></ul></details>                                                                                               | <details><summary>View</summary><ul><li>Endpoint: Embeddings</li><li>Capabilities: None</li><li>Import checkpoint: No</li><li>Optimizations: None</li></ul></details>                                               | [Model card](https://huggingface.co/intfloat/e5-mistral-7b-instruct)                   |
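
To make the context length and batch size trade-off concrete, here is a minimal sketch that picks a configuration for `Meta-Llama-3.3-70B-Instruct` from the combinations listed in the table. The helper `pick_config` is hypothetical and not part of SambaStack. It assumes that "4K" means 4 × 1024 tokens and that the preferred batch size is the smallest one that still covers the expected number of concurrent requests, following the guidance under Deployment options.

```python
# Context length (in K tokens) -> supported batch sizes, copied from the
# Meta-Llama-3.3-70B-Instruct row of the table above.
LLAMA_3_3_70B_CONFIGS = {
    4: (1, 2, 4, 8, 16, 32),
    8: (1, 2, 4, 8),
    16: (1, 2, 4),
    32: (1, 2, 4),
    64: (1,),
    128: (1,),
}


def pick_config(configs, required_tokens, concurrent_requests):
    """Return (context_k, batch_size) for the smallest fitting configuration.

    Assumption: "4K" means 4 * 1024 tokens. Returns None when no listed
    combination covers both the token count and the concurrency.
    """
    for context_k in sorted(configs):
        if context_k * 1024 < required_tokens:
            continue  # context window too small for the request
        fitting = [b for b in configs[context_k] if b >= concurrent_requests]
        if fitting:
            # Smallest sufficient batch size keeps per-request throughput high.
            return context_k, min(fitting)
    return None


print(pick_config(LLAMA_3_3_70B_CONFIGS, 6000, 3))
print(pick_config(LLAMA_3_3_70B_CONFIGS, 50000, 2))
```

With the table data above, the first call selects the 8K context length at batch size 4, and the second returns `None` because the 64K and 128K context lengths are listed only with batch size 1.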
