Llama 3.1 Nemotron 70B Instruct

Context length: 131,072 tokens. Provided by nvidia.

Context Tokens: 131,072
Prompt Price: $0.12
Output Price: $0.30
Feature Support: 11/16

Model Overview

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed to generate precise, useful responses. It builds on the [Llama 3.1 70B](/models/meta-llama/llama-3.1-70b-instruct) architecture and uses Reinforcement Learning from Human Feedback (RLHF), and it excels in automatic alignment benchmarks. The model is aimed at applications that need accurate, helpful answers to diverse user queries across multiple domains. Use of this model is subject to [Meta's Acceptable Use Policy](https://www.llama.com/llama3/use-policy/).

Basic Information

Developer: nvidia
Model Series: Llama3
Release Date: 2024-10-15
Context Length: 131,072 tokens
Max Completion Tokens: 131,072 tokens
Variant: standard

Pricing Information

Prompt Tokens: $0.12 / 1M tokens
Completion Tokens: $0.30 / 1M tokens
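
To make the per-million pricing concrete, here is a minimal sketch of a cost estimate for a single request. The function name `estimate_cost` and the example token counts are hypothetical; only the two prices come from this listing, and actual billing may differ (rounding, minimum charges, and so on).

```python
from decimal import Decimal

# Prices from this listing, in USD per 1M tokens.
PROMPT_PRICE_PER_M = Decimal("0.12")
COMPLETION_PRICE_PER_M = Decimal("0.30")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    prompt_cost = PROMPT_PRICE_PER_M * prompt_tokens / TOKENS_PER_M
    completion_cost = COMPLETION_PRICE_PER_M * completion_tokens / TOKENS_PER_M
    return prompt_cost + completion_cost


# Example: 100,000 prompt tokens and 20,000 completion tokens.
print(estimate_cost(100_000, 20_000))  # 0.012 + 0.006 = 0.018 USD
```

Decimal is used instead of float so that small per-token amounts add up without binary rounding noise.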

Supported Features

Supported (11)

Top K
Seed
Frequency Penalty
Presence Penalty
Repetition Penalty
Response Format
Min P
Logit Bias
Tool Usage
Logprobs
Top Logprobs

Unsupported (5)

Image Input
Structured Outputs
Reasoning
Web Search Options
Top A
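
The supported and unsupported lists above can be used to check a planned request before sending it. The sketch below is only illustrative: it derives keys by lowercasing the feature names from this page (for example "Top K" becomes `top_k`), which is an assumption and not necessarily how any particular API spells its parameters. The helper `check_features` is hypothetical.

```python
# Feature names copied from this listing.
SUPPORTED_NAMES = [
    "Top K", "Seed", "Frequency Penalty", "Presence Penalty",
    "Repetition Penalty", "Response Format", "Min P", "Logit Bias",
    "Tool Usage", "Logprobs", "Top Logprobs",
]
UNSUPPORTED_NAMES = [
    "Image Input", "Structured Outputs", "Reasoning",
    "Web Search Options", "Top A",
]


def to_key(name: str) -> str:
    """Normalize a display name to a snake_case key (assumed convention)."""
    return name.strip().lower().replace(" ", "_")


SUPPORTED = {to_key(n) for n in SUPPORTED_NAMES}
UNSUPPORTED = {to_key(n) for n in UNSUPPORTED_NAMES}


def check_features(requested: list[str]) -> dict[str, list[str]]:
    """Sort requested feature names into supported, unsupported, and unknown."""
    result: dict[str, list[str]] = {"supported": [], "unsupported": [], "unknown": []}
    for name in requested:
        key = to_key(name)
        if key in SUPPORTED:
            result["supported"].append(key)
        elif key in UNSUPPORTED:
            result["unsupported"].append(key)
        else:
            result["unknown"].append(key)
    return result


print(check_features(["Top K", "Seed", "Reasoning", "Temperature"]))
```

Names that appear in neither list, such as "Temperature" here, are reported as unknown rather than guessed at, since this page does not say how they are handled.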

Actual Usage Statistics

Rank: #153 out of 345 total models
Total Tokens Last 30 Days: 924.96M
Daily Average Usage: 30.83M
Weekly Usage Change: 2%
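
As a quick consistency check on these figures, the sketch below assumes the daily average is simply the 30-day total divided by 30. The page does not state how the average is computed, so treat that as an assumption; the variable names are illustrative.

```python
# Figures from this listing (token counts in millions).
TOTAL_TOKENS_30D_M = 924.96
DAYS = 30
RANK = 153
TOTAL_MODELS = 345

# Assumed definition: 30-day total spread evenly over 30 days.
daily_average_m = TOTAL_TOKENS_30D_M / DAYS

# Share of the ranked list at or above this model's position.
rank_fraction = RANK / TOTAL_MODELS

print(f"Daily average: {daily_average_m:.2f}M tokens")  # 30.83M
print(f"Rank fraction: {rank_fraction:.1%}")
```

Under that assumption the computed average matches the listed 30.83M.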

Usage Trend for the Last 30 Days

Models by Same Author (nvidia)

Llama 3.1 Nemotron Nano 8B v1: 131,072 tokens, $0.00 / $0.00
Llama 3.3 Nemotron Super 49B v1 (free): 131,072 tokens, Free
Llama 3.3 Nemotron Super 49B v1: 131,072 tokens, $0.13 / $0.40
Llama 3.1 Nemotron Ultra 253B v1 (free): 131,072 tokens, Free
Llama 3.1 Nemotron Ultra 253B v1: 131,072 tokens, $0.60 / $1.80

Similar Price Range Models

Hermes 3 70B Instruct (nousresearch): 131,072 tokens, $0.12 / $0.30
Qwen3 32B (qwen): 40,960 tokens, $0.10 / $0.30
Llama 3.1 70B Instruct (meta-llama): 131,072 tokens, $0.10 / $0.28