Supported Engines

Foundation Model Directory

A reference for model parameters, cutoffs, routing identifiers, and approximate flat costs (demand-dependent).

Claude Haiku 4.5 (Normal)

claude-haiku-4-5

Anthropic

Claude Haiku 4.5 delivers high-speed responses and efficient execution for everyday language tasks.

Text Chat

Pricing / 1M

0.35 / 1.75

200K
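One way to read a "Pricing / 1M" entry is sketched below. It assumes the two figures are the input and output prices per one million tokens; the directory does not label them or name a currency. The helper name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: str, output_price: str) -> Decimal:
    """Estimate the cost of one request from per-1M-token prices.

    Assumption: a listed pair such as "0.35 / 1.75" means
    input price / output price. The directory does not say so.
    """
    cost_in = Decimal(input_tokens) * Decimal(input_price) / TOKENS_PER_UNIT
    cost_out = Decimal(output_tokens) * Decimal(output_price) / TOKENS_PER_UNIT
    return cost_in + cost_out


# claude-haiku-4-5 at 0.35 / 1.75, with 20,000 input and 2,000 output tokens
print(estimate_cost(20_000, 2_000, "0.35", "1.75"))
```

Prices are passed as strings so that `Decimal` keeps them exact rather than inheriting binary floating-point rounding.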

Claude Haiku 4.5 Thinking

claude-haiku-4-5-thinking

Anthropic

Appending the "-thinking" suffix to claude-haiku-4-5 forces deep thinking on.

Text Chat, Deep Thinking

Pricing / 1M

0.35 / 1.75

200K

GPT-5.4 (Normal)

gpt-5.4

OpenAI

GPT-5.4 delivers higher-quality outputs with fewer iterations through APIs and Codex. It helps individuals and teams analyze complex information, build production software, and automate multi-step workflows.

Text Chat

Pricing / 1M

1.25 / 7.50

GPT-5.4 Low

gpt-5.4-low

OpenAI

GPT-5.4 is offered in deep-thinking variants at four levels (low, medium, high, xhigh); the model with no suffix has deep thinking set to none.

Text Chat, Deep Thinking

Pricing / 1M

1.25 / 7.50

400K

GPT-5.4 Medium

gpt-5.4-medium

OpenAI

GPT-5.4 is offered in deep-thinking variants at four levels (low, medium, high, xhigh); the model with no suffix has deep thinking set to none.

Text Chat, Deep Thinking

Pricing / 1M

1.25 / 7.50

400K

GPT-5.4 High

gpt-5.4-high

OpenAI

GPT-5.4 is offered in deep-thinking variants at four levels (low, medium, high, xhigh); the model with no suffix has deep thinking set to none.

Text Chat, Deep Thinking

Pricing / 1M

1.25 / 7.50

400K

GPT-5.4 XHigh

gpt-5.4-xhigh

OpenAI

GPT-5.4 is offered in deep-thinking variants at four levels (low, medium, high, xhigh); the model with no suffix has deep thinking set to none.

Text Chat, Deep Thinking

Pricing / 1M

1.25 / 7.50

400K
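The suffix convention described in the GPT-5.4 cards can be sketched as a small parser. This is an illustration only: `split_reasoning_level` is a hypothetical helper, and it assumes the level is always the last hyphen-separated part of the model ID.

```python
REASONING_LEVELS = {"low", "medium", "high", "xhigh"}


def split_reasoning_level(model_id: str) -> tuple[str, str]:
    """Split a model ID into (base model, reasoning level).

    IDs without a recognized suffix are reported with level "none",
    matching the directory's note that the unsuffixed model has deep
    thinking set to none.
    """
    base, sep, last = model_id.rpartition("-")
    if sep and last in REASONING_LEVELS:
        return base, last
    return model_id, "none"


for model_id in ("gpt-5.4", "gpt-5.4-low", "gpt-5.4-xhigh"):
    print(model_id, split_reasoning_level(model_id))
```

Splitting on the last hyphen only, rather than on every hyphen, keeps multi-part base names intact.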

GPT-5.4 Mini

gpt-5.4-mini

OpenAI

GPT-5.4 mini brings the strengths of GPT-5.4 to a faster model.

Text Chat

Pricing / 1M

0.375 / 2.25

400K

GPT-5.4 Mini Low

gpt-5.4-mini-low

OpenAI

GPT-5.4 mini brings the strengths of GPT-5.4 to a faster model. The -low suffix sets reasoning effort to a low level.

Text Chat, Deep Thinking

Pricing / 1M

0.375 / 2.25

400K

GPT-5.4 Mini Medium

gpt-5.4-mini-medium

OpenAI

GPT-5.4 mini brings the strengths of GPT-5.4 to a faster model. The -medium suffix sets reasoning effort to a medium level.

Text Chat, Deep Thinking

Pricing / 1M

0.375 / 2.25

400K

GPT-5.4 Mini High

gpt-5.4-mini-high

OpenAI

GPT-5.4 mini brings the strengths of GPT-5.4 to a faster model. The -high suffix sets reasoning effort to a high level.

Text Chat, Deep Thinking

Pricing / 1M

0.375 / 2.25

400K

GPT-5.4 Mini XHigh

gpt-5.4-mini-xhigh

OpenAI

GPT-5.4 mini brings the strengths of GPT-5.4 to a faster model. The -xhigh suffix sets reasoning effort to an extremely high level.

Text Chat, Deep Thinking

Pricing / 1M

0.375 / 2.25

400K

Claude Sonnet 4.6 (Normal)

claude-sonnet-4-6

Anthropic

Claude Sonnet 4.6 balances advanced speed and reasoning capabilities for high-performance production workloads.

Text Chat

Pricing / 1M

1.05 / 5.25

200K
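A quick pre-flight check against the size figure on a card might look like the sketch below. It assumes the bare "200K" or "400K" figure is the model's context window in tokens and that the prompt and the completion share that window; the directory states neither. Both helper names are hypothetical.

```python
def parse_window(label: str) -> int:
    """Convert a label such as "200K" or "1M" into a token count."""
    multipliers = {"K": 1_000, "M": 1_000_000}
    label = label.strip().upper()
    if label and label[-1] in multipliers:
        return int(float(label[:-1]) * multipliers[label[-1]])
    return int(label)


def fits_window(prompt_tokens: int, max_output_tokens: int,
                window_label: str) -> bool:
    """Return True if the prompt plus the requested output fits the window."""
    return prompt_tokens + max_output_tokens <= parse_window(window_label)


print(fits_window(180_000, 16_000, "200K"))  # 196,000 against 200,000
print(fits_window(190_000, 16_000, "200K"))  # 206,000 against 200,000
```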

Claude Sonnet 4.6 Thinking

claude-sonnet-4-6-thinking

Anthropic

Claude Sonnet 4.6's dynamic thinking model decides automatically whether to engage in deep thinking. The -thinking suffix forces standard reasoning.

Text Chat, Deep Thinking

Pricing / 1M

1.05 / 5.25

200K

Claude Sonnet 4.6 Low

claude-sonnet-4-6-low

Anthropic

Claude Sonnet 4.6's dynamic thinking model decides automatically whether to engage in deep thinking. The -low suffix sets reasoning effort to a low level.

Text Chat, Deep Thinking

Pricing / 1M

1.05 / 5.25

200K

Claude Sonnet 4.6 Medium

claude-sonnet-4-6-medium

Anthropic

Claude Sonnet 4.6's dynamic thinking model decides automatically whether to engage in deep thinking. The -medium suffix sets reasoning effort to a medium level.

Text Chat, Deep Thinking

Pricing / 1M

1.05 / 5.25

200K

Claude Sonnet 4.6 Max

claude-sonnet-4-6-max

Anthropic

Claude Sonnet 4.6's new dynamic thinking model decides automatically, based on the prompt, whether to engage in deep thinking; suffixes such as -max, -high, -medium, and -low set the thinking level.

Text Chat, Deep Thinking

Pricing / 1M

1.05 / 5.25

200K

Claude Opus 4.6 (Normal)

claude-opus-4-6

Anthropic

Claude Opus 4.6 is Anthropic's powerful frontier model, delivering excellent analysis, reasoning, and detailed generation outputs.

Text Chat

Pricing / 1M

1.75 / 8.75

200K

Claude Opus 4.6 Thinking

claude-opus-4-6-thinking

Anthropic

The new Claude Opus 4.6 dynamic thinking model decides automatically whether to engage in deep thinking. The -thinking suffix forces standard reasoning.

Text Chat, Deep Thinking

Pricing / 1M

1.75 / 8.75

200K

Claude Opus 4.6 Low

claude-opus-4-6-low

Anthropic

The new Claude Opus 4.6 dynamic thinking model decides automatically whether to engage in deep thinking. The -low suffix sets reasoning effort to a low level.

Text Chat, Deep Thinking

Pricing / 1M

1.75 / 8.75

200K

Claude Opus 4.6 Medium

claude-opus-4-6-medium

Anthropic

The new Claude Opus 4.6 dynamic thinking model decides automatically whether to engage in deep thinking. The -medium suffix sets reasoning effort to a medium level.

Text Chat, Deep Thinking

Pricing / 1M

1.75 / 8.75

200K

Claude Opus 4.6 Max

claude-opus-4-6-max

Anthropic

The new Claude Opus 4.6 dynamic thinking model decides automatically, based on the prompt, whether to engage in deep thinking; suffixes such as -max, -high, -medium, and -low set the thinking level.

Text Chat, Deep Thinking

Pricing / 1M

1.75 / 8.75

200K

Claude Opus 4.8 (Normal)

claude-opus-4-8

Anthropic

Claude Opus 4.8's new dynamic thinking model determines automatically, based on the prompt, whether deep thinking is required.

Text Chat

Pricing / 1M

1.75 / 8.75

Claude Opus 4.8 Thinking

claude-opus-4-8-thinking

Anthropic

Claude Opus 4.8's new dynamic thinking model determines automatically, based on the prompt, whether deep thinking is required. The -thinking suffix forces standard reasoning.

Text Chat, Deep Thinking

Pricing / 1M

1.75 / 8.75

Claude Opus 4.8 Low

claude-opus-4-8-low

Anthropic

Claude Opus 4.8's new dynamic thinking model determines automatically, based on the prompt, whether deep thinking is required. The -low suffix sets reasoning effort to a low level.

Text Chat, Deep Thinking

Pricing / 1M

1.75 / 8.75

Claude Opus 4.8 Medium

claude-opus-4-8-medium

Anthropic

Claude Opus 4.8's new dynamic thinking model determines automatically, based on the prompt, whether deep thinking is required. The -medium suffix sets reasoning effort to a medium level.

Text Chat, Deep Thinking

Pricing / 1M

1.75 / 8.75

Claude Opus 4.8 Max

claude-opus-4-8-max

Anthropic

Claude Opus 4.8's new dynamic thinking model determines automatically, based on the prompt, whether deep thinking is required. Suffixes such as -max, -high, -medium, and -low set the level of reasoning.

Text Chat, Deep Thinking

Pricing / 1M

1.75 / 8.75

Gemini 3.5 Flash (Normal)

gemini-3.5-flash

Gemini

Gemini 3.5 Flash delivers cutting-edge intelligence optimized for fast and cost-effective task handling.

Text Chat

Pricing / 1M

1.50 / 9.00

Gemini 3.5 Flash Search

gemini-3.5-flash-search

Gemini

Gemini 3.5 Flash configured with real-time web search capabilities for ground-truth information lookups.

Text Chat

Pricing / 1M

1.50 / 9.00

Gemini 3.5 Flash Thinking

gemini-3.5-flash-thinking

Gemini

Gemini 3.5 Flash with reasoning capabilities enabled for complex multi-step logical tasks.

Text Chat, Deep Thinking

Pricing / 1M

1.50 / 9.00

Gemini 3.5 Flash NoThinking

gemini-3.5-flash-nothinking

Gemini

Gemini 3.5 Flash configured to run without reasoning mode for maximum speed and raw generation rates.

Text Chat

Pricing / 1M

1.50 / 9.00
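Choosing among the four Gemini 3.5 Flash IDs listed above can be sketched as a small selector. The function `pick_gemini_variant` is hypothetical, and rejecting the search-plus-thinking combination is an assumption based only on the fact that no such ID appears in the directory.

```python
from typing import Optional

BASE_ID = "gemini-3.5-flash"


def pick_gemini_variant(search: bool = False,
                        thinking: Optional[bool] = None) -> str:
    """Choose one of the four listed Gemini 3.5 Flash model IDs.

    thinking=None returns the unsuffixed ID; True or False selects the
    -thinking or -nothinking variant. No combined search-plus-thinking
    ID is listed, so that combination raises ValueError.
    """
    if search and thinking is not None:
        raise ValueError("no listed variant combines search with a thinking setting")
    if search:
        return f"{BASE_ID}-search"
    if thinking is True:
        return f"{BASE_ID}-thinking"
    if thinking is False:
        return f"{BASE_ID}-nothinking"
    return BASE_ID


print(pick_gemini_variant())
print(pick_gemini_variant(search=True))
print(pick_gemini_variant(thinking=False))
```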

GPT-5.5

gpt-5.5

OpenAI

GPT-5.5 is OpenAI's frontier model for the most complex professional work. The reasoning.effort setting supports none, low, medium (default), high, and xhigh.

Text Chat, Deep Thinking

Pricing / 1M

2.50 / 15.00
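The effort levels listed for GPT-5.5 can be validated before a request is sent. The sketch below only builds a dictionary. Its shape (a `reasoning` object with an `effort` field, plus an `input` field for the prompt) is an assumption inferred from the "reasoning.effort" wording; the directory does not document the actual request format, and `build_request` is a hypothetical name.

```python
ALLOWED_EFFORTS = ("none", "low", "medium", "high", "xhigh")
DEFAULT_EFFORT = "medium"  # listed as the default on the GPT-5.5 card


def build_request(prompt: str, effort: str = DEFAULT_EFFORT) -> dict:
    """Build a request body for gpt-5.5 with a validated effort level.

    The field names "input" and "reasoning" are assumptions; check them
    against the gateway's own API reference before use.
    """
    if effort not in ALLOWED_EFFORTS:
        allowed = ", ".join(ALLOWED_EFFORTS)
        raise ValueError(f"unsupported effort {effort!r}; expected one of: {allowed}")
    return {
        "model": "gpt-5.5",
        "input": prompt,
        "reasoning": {"effort": effort},
    }


print(build_request("Summarize this contract.", effort="high"))
```

Rejecting unknown levels locally gives a clearer error than a failed remote call, at the cost of keeping the allowed list in sync with the directory.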

DeepSeek V4 Flash

DeepSeek-V4-Flash

DeepSeek

DeepSeek V4 Flash is a hyper-fast, lightweight model optimized for latency-critical tasks and rapid query responses at a flat rate.

Text Chat

Flat Cost

0.02 / req

DeepSeek V4 Pro

deepseek-v4-pro

DeepSeek

DeepSeek V4 Pro is DeepSeek's advanced reasoning engine, delivering superior coding, mathematics, and complex reasoning capabilities at a flat rate.

Text Chat, Deep Thinking

Flat Cost

0.08 / req
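Flat per-request pricing can be compared with per-token pricing by finding the request size at which the two cost the same. The sketch below assumes both kinds of price use the same currency and that a "Pricing / 1M" pair means input / output price per one million tokens; the directory states neither. The function name is hypothetical.

```python
def breakeven_output_tokens(flat_cost: float, input_tokens: int,
                            input_price: float, output_price: float) -> float:
    """Output tokens at which a token-priced request matches a flat cost.

    Below this count the token-priced model is cheaper for the request;
    above it the flat-rate model is cheaper. Returns 0.0 when the input
    tokens alone already cost at least the flat rate.
    """
    input_cost = input_tokens * input_price / 1_000_000
    remaining = max(flat_cost - input_cost, 0.0)
    return remaining * 1_000_000 / output_price


# deepseek-v4-pro at 0.08 per request versus gpt-5.4 at 1.25 / 7.50,
# for a request carrying 8,000 input tokens
print(round(breakeven_output_tokens(0.08, 8_000, 1.25, 7.50)))
```

The comparison covers cost only; it says nothing about output quality or latency.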

Grok 4.2

grok-4.2

X.AI

Compared to previous versions, Grok 4.2 has made significant improvements in rapid learning, multi-agent collaboration, multimodal processing, and real-time information acquisition.

Text Chat

Flat Cost

0.025 / req

256K

Grok 4.1

grok-4.1

X.AI

No description available!

Text Chat

Flat Cost

0.025 / req

256K

qwen3.7-max

qwen3.7-max

Alibaba

Qwen3.7 is a next-generation flagship model designed for the era of intelligent agents, with its core strengths lying in the breadth and depth of its agent capabilities: it excels at a wide range of tasks in programming, office work and productivity, as well as long-term autonomous execution.

Text Chat

Pricing / 1M

2.40 / 7.20

qwen3.7-plus

qwen3.7-plus

Alibaba

The cost-effective Plus model in the Qwen3.7 series adds comprehensively upgraded vision-language capabilities on top of its strong text abilities, while retaining full agent capabilities in coding, tool use, and productivity workflows.

Text Chat, Deep Thinking

Pricing / 1M

0.80 / 3.20

glm-5.2

glm-5.2

Zhipu

GLM-5.2 is a flagship model built for long-running tasks. It supports a practically usable 1M context, handles project-level engineering context, executes long-horizon tasks more stably, follows engineering standards more reliably, and can complete the entire development process, from requirements to multi-platform deployment, in one pass.

Text Chat, Deep Thinking

Pricing / 1M

4.00 / 14.00