Llama 3.3 Nemotron Super 49B v1 Check detailed information and pricing for AI models

Context Length 131,072 tokens, nvidia from provided

131,072

Context Tokens

$0.00

Prompt Price

$0.00

Output Price

0/16

Feature Support

Model Overview

Llama-3.3-Nemotron-Super-49B-v1 is a large language model (LLM) optimized for advanced reasoning, conversational interactions, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta's Llama-3.3-70B-Instruct, it employs a Neural Architecture Search (NAS) approach, significantly enhancing efficiency and reducing memory requirements. This allows the model to support a context length of up to 128K tokens and fit efficiently on single high-performance GPUs, such as NVIDIA H200. Note: you must include `detailed thinking on` in the system prompt to enable reasoning. Please see [Usage Recommendations](https://huggingface.co/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1#quick-start-and-usage-recommendations) for more.

Basic Information

Developer

nvidia

Model Series

Other

Release Date

2025-04-08

Context Length

131,072 tokens

Variant

standard

Pricing Information

Prompt Tokens

$0.00 / 1M tokens

Completion Tokens

$0.00 / 1M tokens

Supported Features

Unsupported (16)

Image Input

Top K

Seed

Frequency Penalty

Presence Penalty

Repetition Penalty

Response Format

Min P

Logit Bias

Tool Usage

Logprobs

Top Logprobs

Structured Outputs

Reasoning

Web Search Options

Top A

Other Variants

Llama 3.3 Nemotron Super 49B v1 (free)

free

Free

Actual Usage Statistics

#210

Out of 353 total models

423.85M

Total Tokens Last 30 Days

14.13M

Daily Average Usage

Weekly Usage Change

Usage Trend for the Last 30 Days

Models by Same Author (nvidia)

Nemotron Nano 9B V2 (free)

128,000 tokens

Free

View Details

Nemotron Nano 9B V2

131,072 tokens

$0.04 / $0.16

View Details

Llama 3.1 Nemotron Nano 8B v1

131,072 tokens

$0.00 / $0.00

View Details

Llama 3.1 Nemotron Ultra 253B v1 (free)

131,072 tokens

Free

View Details