This page applies to both Verdent Desktop and Verdent for VS Code.

Overview

Verdent integrates state-of-the-art large language models from the world’s leading AI labs, including Anthropic (Claude), OpenAI (GPT), Google (Gemini), Moonshot (Kimi), Zhipu AI (GLM), and MiniMax. To help users understand the cost behind every AI interaction, we fully disclose the provider pricing for all available models in this document. All prices listed are the official published prices from each model provider, denominated in US dollars ($) per one million tokens (1M tokens).

Key Concepts

Tokens

A token is the fundamental unit of text processing for large language models. One token is approximately 4 English characters or 1-2 Chinese characters. Model pricing is based on the number of input and output tokens consumed, billed separately.
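The character-to-token ratio above can be turned into a quick estimate. The sketch below is only a rough heuristic, not a tokenizer: `estimate_tokens` and `CHARS_PER_TOKEN` are hypothetical names introduced for illustration, and real token counts depend on each provider's tokenizer.

```python
import math

# Rough ratio for English text quoted above; an approximation, not a tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate for a piece of text (hypothetical helper)."""
    if not text:
        return 0
    # Round up so that a short, non-empty text never estimates to zero tokens.
    return math.ceil(len(text) / chars_per_token)


prompt = "Refactor this function to use async IO."
print(estimate_tokens(prompt))

# For Chinese text, pass a ratio in the 1-2 characters-per-token range instead.
print(estimate_tokens("你好世界", chars_per_token=1.5))
```

An estimate like this is useful for budgeting only; the token counts that are actually billed come from the model provider.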

Billing Model

All models on Verdent currently use a per-token billing model, meaning you are charged based on the actual number of input and output tokens consumed during each interaction.

Price Components

Each model’s pricing consists of the following dimensions:
  • Input Price: The per-token cost for the prompt (user message and context) sent to the model.
  • Output Price: The per-token cost for the response generated by the model. Typically higher than input price.
  • Cache Write Price: The per-token cost of creating a cache entry for the first time. Applies only to models that support prompt caching.
  • Cache Read Price: The per-token cost when hitting an existing cache entry. Typically much lower than the standard input price, effectively reducing costs for repeated contexts.
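As a minimal sketch of how these four components combine, the example below computes the cost of one interaction from prices quoted per 1M tokens. It assumes the four token counts are disjoint (a token is billed as uncached input, cache write, or cache read, never more than one), which is an assumption for illustration and not a statement of how Verdent meters usage. `Pricing` and `interaction_cost` are hypothetical names; the prices are the Claude Sonnet 4.6 figures listed on this page.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Pricing:
    """Provider prices in USD per 1M tokens (hypothetical container)."""
    input: float
    output: float
    cache_write: float
    cache_read: float


def interaction_cost(
    pricing: Pricing,
    input_tokens: int,
    output_tokens: int,
    cache_write_tokens: int = 0,
    cache_read_tokens: int = 0,
) -> float:
    """Return the USD cost of one interaction.

    Assumes the token counts are disjoint: each token is counted in
    exactly one of the four categories.
    """
    total = (
        input_tokens * pricing.input
        + output_tokens * pricing.output
        + cache_write_tokens * pricing.cache_write
        + cache_read_tokens * pricing.cache_read
    )
    return total / TOKENS_PER_MILLION


# Claude Sonnet 4.6 prices as listed on this page.
sonnet = Pricing(input=3.00, output=15.00, cache_write=3.75, cache_read=0.30)

# 20,000 uncached input tokens, 2,000 output tokens, 50,000 tokens read from cache.
with_cache = interaction_cost(sonnet, 20_000, 2_000, cache_read_tokens=50_000)

# The same 70,000 context tokens billed entirely at the standard input price.
without_cache = interaction_cost(sonnet, 70_000, 2_000)

print(f"with cache:    ${with_cache:.4f}")
print(f"without cache: ${without_cache:.4f}")
```

Under these assumptions the cached request comes to about $0.105 and the uncached one to about $0.24, which illustrates why cache reads reduce the cost of repeated contexts.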

Model Pricing Details

Below are the provider prices for all models currently available on Verdent, organized by provider. All prices are in USD per 1M tokens.

Anthropic (Claude Series)

| Model | Input ($/1M) | Output ($/1M) | Cache Write ($/1M) | Cache Read ($/1M) |
|---|---|---|---|---|
| Claude Opus 4.7 | $5.00 | $25.00 | $6.25 | $0.50 |
| Claude Opus 4.6 | $5.00 | $25.00 | $6.25 | $0.50 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $3.75 | $0.30 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $1.25 | $0.10 |

  • Claude Opus 4.7 (claude-opus-4-7): The latest Opus model available on Verdent. It is designed for the most demanding reasoning, architecture, and deep analysis work, with the same provider pricing structure as Opus 4.6.
  • Claude Opus 4.6 (claude-opus-4-6): Flagship-tier Opus model for complex code architecture, deep analysis, and difficult problem-solving tasks.
  • Claude Sonnet 4.6 (claude-sonnet-4-6): Balanced model with strong performance and competitive pricing; recommended for everyday development.
  • Claude Haiku 4.5 (claude-haiku-4-5@20251001): Fast and lightweight model with the quickest response times. Best for simple conversations, quick lookups, and low-latency scenarios.

OpenAI (GPT Series)

| Model | Input ($/1M) | Output ($/1M) | Cache Write ($/1M) | Cache Read ($/1M) |
|---|---|---|---|---|
| GPT-5.5 | $5.00 | $30.00 | Free | $0.50 |
| GPT-5.4 | $2.50 | $15.00 | Free | $0.25 |
| GPT-5.3 Codex | $1.75 | $14.00 | Free | $0.17 |

  • GPT-5.5 (gpt-5.5): The latest GPT model available on Verdent, suited to frontier reasoning, code generation, and multi-step analysis workloads.
  • GPT-5.4 (gpt-5.4): Flagship GPT model with strong reasoning and code generation quality.
  • GPT-5.3 Codex (gpt-5.3-codex): Code-specialized model optimized for programming tasks, including large-scale code generation and refactoring.

Google (Gemini Series)

| Model | Input ($/1M) | Output ($/1M) | Cache Write ($/1M) | Cache Read ($/1M) |
|---|---|---|---|---|
| Gemini 3.1 Pro | $2.00 | $12.00 | - | $0.20 |
| Gemini 3 Flash | $0.50 | $3.00 | - | $0.050 |

  • Gemini 3.1 Pro (gemini-3.1-pro-preview): Professional-grade model with strong reasoning, suitable for complex analysis and deep thinking tasks.
  • Gemini 3 Flash (gemini-3-flash-preview): Ultra-fast model with excellent cost efficiency. Ideal for high-volume batch processing.

Moonshot (Kimi Series)

| Model | Input ($/1M) | Output ($/1M) | Cache Write ($/1M) | Cache Read ($/1M) |
|---|---|---|---|---|
| Kimi K2.6 | $0.95 | $4.00 | - | $0.16 |
| Kimi K2.5 | $0.60 | $3.00 | - | $0.10 |

  • Kimi K2.6 (kimi-k2.6): The latest Kimi model available on Verdent, with stronger reasoning and bilingual coding performance than the previous K2.5 generation.
  • Kimi K2.5 (kimi-k2.5): Efficient Kimi model with strong bilingual capabilities and competitive pricing for everyday development tasks.

Zhipu AI (GLM Series)

| Model | Input ($/1M) | Output ($/1M) | Cache Write ($/1M) | Cache Read ($/1M) |
|---|---|---|---|---|
| GLM-5.1 | $1.40 | $4.40 | - | $0.26 |

  • GLM-5.1 (glm-5.1): The latest GLM model available on Verdent, with upgraded general-purpose reasoning and strong Chinese-language performance.

MiniMax

| Model | Input ($/1M) | Output ($/1M) | Cache Write ($/1M) | Cache Read ($/1M) |
|---|---|---|---|---|
| MiniMax M2.7 | $0.30 | $1.20 | $0.38 | $0.060 |
| MiniMax M2.5 | $0.30 | $1.20 | $0.38 | $0.030 |

  • MiniMax M2.7 (MiniMax-M2.7): Latest version with comprehensive performance improvements, maintaining highly competitive pricing.
  • MiniMax M2.5 (MiniMax-M2.5): High cost-efficiency model, ideal for cost-sensitive, high-volume processing scenarios.

Pricing Overview

The table below summarizes the core pricing for all models, sorted by output price descending for easy comparison:

| Model | Provider | Input ($/1M) | Output ($/1M) | Cache Write ($/1M) | Cache Read ($/1M) |
|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | $5.00 | $30.00 | - | $0.50 |
| Opus 4.7 | Anthropic | $5.00 | $25.00 | $6.25 | $0.50 |
| Opus 4.6 | Anthropic | $5.00 | $25.00 | $6.25 | $0.50 |
| GPT-5.4 | OpenAI | $2.50 | $15.00 | - | $0.25 |
| Sonnet 4.6 | Anthropic | $3.00 | $15.00 | $3.75 | $0.30 |
| GPT-5.3-Codex | OpenAI | $1.75 | $14.00 | - | $0.17 |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | - | $0.20 |
| Haiku 4.5 | Anthropic | $1.00 | $5.00 | $1.25 | $0.10 |
| GLM-5.1 | Zhipu AI | $1.40 | $4.40 | - | $0.26 |
| Kimi K2.6 | Moonshot | $0.95 | $4.00 | - | $0.16 |
| Gemini 3 Flash | Google | $0.50 | $3.00 | - | $0.050 |
| Kimi K2.5 | Moonshot | $0.60 | $3.00 | - | $0.10 |
| MiniMax M2.5 | MiniMax | $0.30 | $1.20 | $0.38 | $0.030 |
| MiniMax M2.7 | MiniMax | $0.30 | $1.20 | $0.38 | $0.060 |
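
Output price alone does not determine what a workload costs, since the mix of input and output tokens matters too. The sketch below ranks a subset of the models above by estimated cost for a fixed workload, using only the input and output prices from the overview table and ignoring caching. `PRICES` and `rank_by_cost` are hypothetical names introduced for this example, and the workload size is an arbitrary assumption.

```python
# (input, output) provider prices in USD per 1M tokens, copied from the overview table.
PRICES = {
    "GPT-5.5": (5.00, 30.00),
    "Opus 4.7": (5.00, 25.00),
    "Sonnet 4.6": (3.00, 15.00),
    "Gemini 3.1 Pro": (2.00, 12.00),
    "Haiku 4.5": (1.00, 5.00),
    "Kimi K2.6": (0.95, 4.00),
    "MiniMax M2.7": (0.30, 1.20),
}


def rank_by_cost(prices, input_tokens, output_tokens):
    """Return (model, cost in USD) pairs sorted from cheapest to most expensive.

    Caching is ignored: every input token is billed at the standard input price.
    """
    costs = {
        name: (input_tokens * inp + output_tokens * out) / 1_000_000
        for name, (inp, out) in prices.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Assumed workload: 1M input tokens and 200K output tokens.
ranked = rank_by_cost(PRICES, 1_000_000, 200_000)
for name, cost in ranked:
    print(f"{name:<16} ${cost:.2f}")
```

Changing the input-to-output ratio can change the ordering between closely priced models, so it is worth running the comparison with token counts that resemble the actual workload.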