AI Observatory / Model Radar Nvidia / nvidia/llama-3.3-nemotron-super-49b-v1.5

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

131,072 Context
16,384 Max output
$0.10 / 1M Prompt price
2025-10-10 Created
01 / Snapshot

Pricing, context, modalities, and parameters.

Model Radar detail pages stay neutral and operator-readable: core metadata first, then workflow fit.

Provider Nvidia Input modalities text
Output modalities text Prompt price $0.10 / 1M
Completion price $0.40 / 1M Request price N/A
Context length 131,072 Max completion tokens 16,384
Supported parameters frequency_penalty, include_reasoning, logit_bias, max_tokens, min_p, presence_penalty, reasoning, repetition_penalty, response_format, seed, stop, temperature, tool_choice, tools, top_k, top_p
Best for nvidia/llama-3.3-nemotron-super-49b-v1.5

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

Deep analysis Coding workflows High-volume usage tool-capable low-cost
03 / Colophon

Routes and exits.

Each model page stays simple: overview, compare, related models, then back to the public hub.