Llama 3.3 Nemotron Super 49B v1 (free) Check detailed information and pricing for AI models

Context Length 131,072 tokens, nvidia from provided

131,072

Context Tokens

Free

Prompt Price

Free

Output Price

9/16

Feature Support

Model Overview

Llama-3.3-Nemotron-Super-49B-v1 is a large language model (LLM) optimized for advanced reasoning, conversational interactions, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta's Llama-3.3-70B-Instruct, it employs a Neural Architecture Search (NAS) approach, significantly enhancing efficiency and reducing memory requirements. This allows the model to support a context length of up to 128K tokens and fit efficiently on single high-performance GPUs, such as NVIDIA H200. Note: you must include `detailed thinking on` in the system prompt to enable reasoning. Please see [Usage Recommendations](https://huggingface.co/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1#quick-start-and-usage-recommendations) for more.

Basic Information

Developer

nvidia

Model Series

Other

Release Date

2025-04-08

Context Length

131,072 tokens

Variant

free

Pricing Information

This model is free to use

Data Policy

학습 정책

Supported Features

Supported (9)

Top K

Seed

Frequency Penalty

Presence Penalty

Repetition Penalty

Min P

Logit Bias

Logprobs

Top Logprobs

Unsupported (7)

Image Input

Response Format

Tool Usage

Structured Outputs

Reasoning

Web Search Options

Top A

Other Variants

Llama 3.3 Nemotron Super 49B v1

standard

$0.00 / $0.00

Actual Usage Statistics

No recent usage data available.

Models by Same Author (nvidia)

Nemotron Nano 9B V2 (free)

128,000 tokens

Free

View Details

Nemotron Nano 9B V2

131,072 tokens

$0.04 / $0.16

View Details

Llama 3.1 Nemotron Nano 8B v1

131,072 tokens

$0.00 / $0.00

View Details

Llama 3.1 Nemotron Ultra 253B v1 (free)

131,072 tokens

Free

View Details

Llama 3.1 Nemotron Ultra 253B v1

131,072 tokens

$0.60 / $1.80

View Details