---
title: LLM Configuration
subtitle: Connect your preferred language model provider
description: Configure LLM providers for self-hosted Skyvern including OpenAI, Anthropic, Azure OpenAI, Google Gemini, AWS Bedrock, Ollama, and OpenAI-compatible endpoints via LiteLLM.
slug: developers/self-hosted/llm-configuration
keywords:
  - OpenAI
  - Anthropic
  - Azure OpenAI
  - Google Gemini
  - AWS Bedrock
  - Ollama
  - LiteLLM
  - LLM_KEY
  - SECONDARY_LLM_KEY
---

Skyvern uses LLMs to analyze screenshots and decide what actions to take. You'll need to configure at least one LLM provider before running tasks.

## How Skyvern uses LLMs

Skyvern makes multiple LLM calls per task step:

1. **Screenshot analysis**: Identify interactive elements on the page
2. **Action planning**: Decide what to click, type, or extract
3. **Result extraction**: Parse data from the page into structured output

A task that runs for 10 steps makes 30 or more LLM calls (at least three per step). Choose your provider and model tier with this in mind.

For most deployments, configure a single provider using `LLM_KEY`. Skyvern also supports a `SECONDARY_LLM_KEY` for lighter tasks to reduce costs.
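
A minimal sketch of that pairing (the model keys here are examples; any provider's keys from the tables below work the same way):

```bash .env
# Example: one provider, primary + secondary models
ENABLE_OPENAI=true
OPENAI_API_KEY=sk-...
LLM_KEY=OPENAI_GPT5                 # primary: planning and screenshot analysis
SECONDARY_LLM_KEY=OPENAI_GPT5_MINI  # optional: cheaper model for lighter calls
```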

---

## Quick Start Recommendations

**Best models for production (2025):**

| Provider | Primary Model | Secondary Model | Notes |
|----------|--------------|-----------------|-------|
| **Anthropic** | `ANTHROPIC_CLAUDE4.5_OPUS` | `ANTHROPIC_CLAUDE4.5_SONNET` | Most capable |
| **OpenAI** | `OPENAI_GPT5` | `OPENAI_GPT5_MINI` | Latest |
| **Google** | `GEMINI_3_PRO` | `GEMINI_3.0_FLASH` | Latest |
| **AWS Bedrock** | `BEDROCK_ANTHROPIC_CLAUDE4.5_OPUS_INFERENCE_PROFILE` | `BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE` | Latest Claude |

<Tip>
**New in 2025:** GPT-5 series, Claude 4.6 Opus, Gemini 3, Amazon Nova, and many new open-source models via Novita and VolcEngine.
</Tip>

---

## OpenAI

The most common choice. Requires an API key from [platform.openai.com](https://platform.openai.com/).

```bash .env
ENABLE_OPENAI=true
OPENAI_API_KEY=sk-...
LLM_KEY=OPENAI_GPT4O
```
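
To sanity-check the key before wiring it into Skyvern, you can list the models it has access to (a quick curl sketch; not required for setup):

```bash
# Returns a JSON list of models if the key is valid
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"
```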

### Available models

| LLM_KEY | Notes |
|---------|-------|
| **GPT-5 Series** | |
| `OPENAI_GPT5` | Recommended for most complex tasks |
| `OPENAI_GPT5_MINI` | |
| `OPENAI_GPT5_MINI_FLEX` | Flex service tier, 15-minute timeout |
| `OPENAI_GPT5_NANO` | |
| `OPENAI_GPT5_1` | |
| `OPENAI_GPT5_2` | |
| `OPENAI_GPT5_4` | |
| **GPT-4 Series** | |
| `OPENAI_GPT4O` | |
| `OPENAI_GPT4O_MINI` | |
| `OPENAI_GPT4_1` | |
| `OPENAI_GPT4_1_MINI` | |
| `OPENAI_GPT4_1_NANO` | |
| `OPENAI_GPT4_5` | |
| `OPENAI_GPT4_TURBO` | Legacy |
| `OPENAI_GPT4V` | Legacy alias |
| **O-Series (Reasoning)** | |
| `OPENAI_O4_MINI` | Vision support |
| `OPENAI_O3` | Vision support |
| `OPENAI_O3_MINI` | No vision |

### Optional settings

```bash .env
# Use a custom API endpoint (for proxies or compatible services)
OPENAI_API_BASE=https://your-proxy.com/v1

# Specify organization ID
OPENAI_ORGANIZATION=org-...
```

---

## Anthropic

Claude models from [anthropic.com](https://www.anthropic.com/).

```bash .env
ENABLE_ANTHROPIC=true
ANTHROPIC_API_KEY=sk-ant-...
LLM_KEY=ANTHROPIC_CLAUDE3.5_SONNET
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| **Claude 4.6** | |
| `ANTHROPIC_CLAUDE4.6_OPUS` | Newest |
| **Claude 4.5** | |
| `ANTHROPIC_CLAUDE4.5_OPUS` | Recommended for primary use |
| `ANTHROPIC_CLAUDE4.5_SONNET` | Recommended for secondary use |
| `ANTHROPIC_CLAUDE4.5_HAIKU` | Fastest |
| **Claude 4** | |
| `ANTHROPIC_CLAUDE4_OPUS` | |
| `ANTHROPIC_CLAUDE4_SONNET` | |
| **Claude 3.7** | |
| `ANTHROPIC_CLAUDE3.7_SONNET` | |
| **Claude 3.5** | |
| `ANTHROPIC_CLAUDE3.5_SONNET` | |
| `ANTHROPIC_CLAUDE3.5_HAIKU` | |
| **Claude 3 (Legacy)** | |
| `ANTHROPIC_CLAUDE3_OPUS` | |
| `ANTHROPIC_CLAUDE3_SONNET` | |
| `ANTHROPIC_CLAUDE3_HAIKU` | |

---

## Azure OpenAI

Microsoft-hosted OpenAI models. Requires an Azure subscription with the OpenAI service provisioned.

```bash .env
ENABLE_AZURE=true
LLM_KEY=AZURE_OPENAI
AZURE_DEPLOYMENT=your-deployment-name
AZURE_API_KEY=your-azure-api-key
AZURE_API_BASE=https://your-resource.openai.azure.com/
AZURE_API_VERSION=2024-08-01-preview
```

### Setup steps

1. Create an Azure OpenAI resource in the [Azure Portal](https://portal.azure.com)
2. Open the Azure AI Foundry portal from your resource's overview page
3. Go to **Shared Resources** → **Deployments**
4. Click **Deploy Model** → **Deploy Base Model** → select GPT-4o or GPT-4
5. Note the **Deployment Name**. Use this for `AZURE_DEPLOYMENT`
6. Copy your API key and endpoint from the Azure Portal

<Note>
The `AZURE_DEPLOYMENT` is the name you chose when deploying the model, not the model name itself.
</Note>
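
You can verify the deployment responds with a direct call to the Azure OpenAI REST endpoint (a sketch using the variables above; assumes they are exported in your shell):

```bash
# AZURE_API_BASE already ends with a trailing slash, per the example above
curl -s "${AZURE_API_BASE}openai/deployments/${AZURE_DEPLOYMENT}/chat/completions?api-version=${AZURE_API_VERSION}" \
  -H "api-key: $AZURE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "ping"}]}'
```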

---

## Google Gemini

Skyvern supports Gemini through two paths: the **Gemini API** (simpler, uses an API key) and **Vertex AI** (enterprise, uses a GCP service account).

### Gemini API

The quickest way to use Gemini. Get an API key from [Google AI Studio](https://aistudio.google.com/).

```bash .env
ENABLE_GEMINI=true
GEMINI_API_KEY=your-gemini-api-key
LLM_KEY=GEMINI_2.5_PRO
```

#### Available Gemini API models

| LLM_KEY | Notes |
|---------|-------|
| **Gemini 3** | |
| `GEMINI_3_PRO` | Recommended for primary use |
| `GEMINI_3.0_FLASH` | Recommended for secondary use |
| **Gemini 2.5** | |
| `GEMINI_2.5_PRO` | |
| `GEMINI_2.5_PRO_PREVIEW` | |
| `GEMINI_2.5_PRO_EXP_03_25` | Experimental |
| `GEMINI_2.5_FLASH` | |
| `GEMINI_2.5_FLASH_PREVIEW` | |
| **Gemini 2.0** | |
| `GEMINI_FLASH_2_0` | |
| `GEMINI_FLASH_2_0_LITE` | |
| **Gemini 1.5 (Legacy)** | |
| `GEMINI_PRO` | |
| `GEMINI_FLASH` | |

### Vertex AI

For enterprise deployments through [Vertex AI](https://cloud.google.com/vertex-ai). Requires a GCP project with Vertex AI enabled.

```bash .env
ENABLE_VERTEX_AI=true
LLM_KEY=VERTEX_GEMINI_3_PRO
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
GCP_PROJECT_ID=your-gcp-project-id
VERTEX_LOCATION=us-central1
```

<Note>
If you're migrating from an older Skyvern version, `VERTEX_LOCATION` replaces the previous `GCP_REGION` variable. Update your `.env` accordingly.
</Note>

**Vertex AI setup steps** (a CLI sketch follows the list):

1. Create a [GCP project](https://console.cloud.google.com/) with billing enabled
2. Enable the **Vertex AI API** in your project
3. Create a service account with the **Vertex AI User** role
4. Download the service account JSON key file
5. Set `GOOGLE_APPLICATION_CREDENTIALS` to the path of that file
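
If you prefer the CLI, steps 2–4 roughly translate to the following `gcloud` commands (a sketch; the service-account name is illustrative):

```bash
# Enable the Vertex AI API
gcloud services enable aiplatform.googleapis.com --project your-gcp-project-id

# Create a service account and grant it the Vertex AI User role
gcloud iam service-accounts create skyvern-vertex --project your-gcp-project-id
gcloud projects add-iam-policy-binding your-gcp-project-id \
  --member "serviceAccount:skyvern-vertex@your-gcp-project-id.iam.gserviceaccount.com" \
  --role "roles/aiplatform.user"

# Download a JSON key to use as GOOGLE_APPLICATION_CREDENTIALS
gcloud iam service-accounts keys create service-account.json \
  --iam-account skyvern-vertex@your-gcp-project-id.iam.gserviceaccount.com
```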

<Note>
For global endpoint access, set `VERTEX_LOCATION=global` and ensure `VERTEX_PROJECT_ID` is set. Not all models support the global endpoint.
</Note>

#### Available Vertex AI models

| LLM_KEY | Notes |
|---------|-------|
| **Gemini 3** | |
| `VERTEX_GEMINI_3_PRO` | Recommended for primary use |
| `VERTEX_GEMINI_3.0_FLASH` | Recommended for secondary use |
| **Gemini 2.5** | |
| `VERTEX_GEMINI_2.5_PRO` | |
| `VERTEX_GEMINI_2.5_PRO_PREVIEW` | |
| `VERTEX_GEMINI_2.5_FLASH` | |
| `VERTEX_GEMINI_2.5_FLASH_LITE` | |
| `VERTEX_GEMINI_2.5_FLASH_PREVIEW` | |
| **Gemini 2.0** | |
| `VERTEX_GEMINI_FLASH_2_0` | |
| **Gemini 1.5 (Legacy)** | |
| `VERTEX_GEMINI_PRO` | |
| `VERTEX_GEMINI_FLASH` | |

---

## Amazon Bedrock

Run Anthropic Claude and Amazon Nova models through your AWS account.

```bash .env
ENABLE_BEDROCK=true
LLM_KEY=BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET
AWS_REGION=us-west-2
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
```

### Setup steps

1. Create an IAM user with the `AmazonBedrockFullAccess` policy
2. Generate access keys for the IAM user
3. In the [Bedrock console](https://console.aws.amazon.com/bedrock/), go to **Model Access**
4. Enable access to the models you plan to use (for example, Claude 3.5 Sonnet)
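
You can confirm model access with the AWS CLI (a sketch; assumes the CLI is configured with the IAM user's keys):

```bash
# List the Claude model IDs available in your region
aws bedrock list-foundation-models --region us-west-2 \
  --query "modelSummaries[?contains(modelId, 'claude')].modelId"
```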

### Available models

| LLM_KEY | Notes |
|---------|-------|
| **Amazon Nova (AWS Native)** | |
| `BEDROCK_AMAZON_NOVA_PRO` | |
| `BEDROCK_AMAZON_NOVA_LITE` | |
| **Claude 4.6** | |
| `BEDROCK_ANTHROPIC_CLAUDE4.6_OPUS_INFERENCE_PROFILE` | Cross-region |
| **Claude 4.5** | |
| `BEDROCK_ANTHROPIC_CLAUDE4.5_OPUS_INFERENCE_PROFILE` | Cross-region |
| `BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE` | Cross-region |
| **Claude 4** | |
| `BEDROCK_ANTHROPIC_CLAUDE4_OPUS_INFERENCE_PROFILE` | Cross-region |
| `BEDROCK_ANTHROPIC_CLAUDE4_SONNET_INFERENCE_PROFILE` | Cross-region |
| **Claude 3.7** | |
| `BEDROCK_ANTHROPIC_CLAUDE3.7_SONNET_INFERENCE_PROFILE` | Cross-region |
| **Claude 3.5** | |
| `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET` | v2 |
| `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_V1` | |
| `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_INFERENCE_PROFILE` | Cross-region |
| `BEDROCK_ANTHROPIC_CLAUDE3.5_HAIKU` | |
| **Claude 3 (Legacy)** | |
| `BEDROCK_ANTHROPIC_CLAUDE3_OPUS` | |
| `BEDROCK_ANTHROPIC_CLAUDE3_SONNET` | |
| `BEDROCK_ANTHROPIC_CLAUDE3_HAIKU` | |

<Note>
Bedrock inference profile keys (`*_INFERENCE_PROFILE`) use cross-region inference and require only `AWS_REGION`. Access keys aren't needed when running on an instance with an IAM role attached.
</Note>

---

## MiniMax

[MiniMax](https://www.minimax.io/) models with vision support.

```bash .env
ENABLE_MINIMAX=true
MINIMAX_API_KEY=your-minimax-api-key
LLM_KEY=MINIMAX_M2_5
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| `MINIMAX_M2_5` | |
| `MINIMAX_M2_5_HIGHSPEED` | Faster variant |

### Optional settings

```bash .env
# Use a custom API endpoint
MINIMAX_API_BASE=https://api.minimax.io/v1
```

---

## VolcEngine (ByteDance Doubao)

[VolcEngine](https://www.volcengine.com/) provides access to ByteDance's Doubao models with vision support.

```bash .env
ENABLE_VOLCENGINE=true
VOLCENGINE_API_KEY=your-volcengine-api-key
LLM_KEY=VOLCENGINE_DOUBAO_SEED_1_6
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| `VOLCENGINE_DOUBAO_SEED_1_6` | Recommended for general use |
| `VOLCENGINE_DOUBAO_SEED_1_6_FLASH` | Faster variant |
| `VOLCENGINE_DOUBAO_1_5_THINKING_VISION_PRO` | Reasoning model |

### Optional settings

```bash .env
# Use a custom API endpoint
VOLCENGINE_API_BASE=https://ark.cn-beijing.volces.com/api/v3
```

---

## Novita

[Novita AI](https://novita.ai/) provides access to DeepSeek, Llama, and other open-source models.

```bash .env
ENABLE_NOVITA=true
NOVITA_API_KEY=your-novita-api-key
LLM_KEY=NOVITA_LLAMA_3_2_11B_VISION
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| **DeepSeek** | |
| `NOVITA_DEEPSEEK_R1` | Reasoning model |
| `NOVITA_DEEPSEEK_V3` | |
| **Llama 3.3** | |
| `NOVITA_LLAMA_3_3_70B` | |
| **Llama 3.2** | |
| `NOVITA_LLAMA_3_2_11B_VISION` | Vision support |
| `NOVITA_LLAMA_3_2_3B` | |
| `NOVITA_LLAMA_3_2_1B` | |
| **Llama 3.1** | |
| `NOVITA_LLAMA_3_1_405B` | |
| `NOVITA_LLAMA_3_1_70B` | |
| `NOVITA_LLAMA_3_1_8B` | |
| **Llama 3** | |
| `NOVITA_LLAMA_3_70B` | |
| `NOVITA_LLAMA_3_8B` | |

---

## Moonshot

[Moonshot AI](https://www.moonshot.cn/) provides the Kimi series models with long context support.

```bash .env
ENABLE_MOONSHOT=true
MOONSHOT_API_KEY=your-moonshot-api-key
LLM_KEY=MOONSHOT_KIMI_K2
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| `MOONSHOT_KIMI_K2` | |

### Optional settings

```bash .env
# Use a custom API endpoint
MOONSHOT_API_BASE=https://api.moonshot.cn/v1
```

---

## Inception

[Inception AI](https://inception.ai/) provides the Mercury series models.

```bash .env
ENABLE_INCEPTION=true
INCEPTION_API_KEY=your-inception-api-key
LLM_KEY=INCEPTION_MERCURY_2
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| `INCEPTION_MERCURY_2` | |

### Optional settings

```bash .env
# Use a custom API endpoint
INCEPTION_API_BASE=https://api.inception.ai/v1
```

---

## Ollama (Local Models)

Run open-source models locally with [Ollama](https://ollama.ai/). No API costs, but requires sufficient local compute.

```bash .env
ENABLE_OLLAMA=true
LLM_KEY=OLLAMA
OLLAMA_MODEL=llama3.1
OLLAMA_SERVER_URL=http://host.docker.internal:11434
OLLAMA_SUPPORTS_VISION=false
```

### Setup steps

1. [Install Ollama](https://ollama.ai/download)
2. Pull a model: `ollama pull llama3.1`
3. Start Ollama: `ollama serve`
4. Configure Skyvern to connect

<Warning>
Most Ollama models don't support vision. Set `OLLAMA_SUPPORTS_VISION=false`. Without vision, Skyvern relies on DOM analysis instead of screenshot analysis, which may reduce accuracy on complex pages.
</Warning>

### Docker networking

When running Skyvern in Docker and Ollama on the host:

| Host OS | OLLAMA_SERVER_URL |
|---------|-------------------|
| macOS/Windows | `http://host.docker.internal:11434` |
| Linux | `http://172.17.0.1:11434` (Docker bridge IP) |
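
To confirm the container can actually reach Ollama, you can query its tags endpoint from inside the container (a sketch; the container name is illustrative):

```bash
# Lists the models Ollama has pulled; adjust the URL per the table above
docker exec skyvern curl -s http://host.docker.internal:11434/api/tags
```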

---

## OpenAI-Compatible Endpoints

Connect to any service that implements the OpenAI API format, including LiteLLM, LocalAI, vLLM, and text-generation-inference.

```bash .env
ENABLE_OPENAI_COMPATIBLE=true
OPENAI_COMPATIBLE_MODEL_NAME=llama3.1
OPENAI_COMPATIBLE_API_KEY=sk-test
OPENAI_COMPATIBLE_API_BASE=http://localhost:4000/v1
LLM_KEY=OPENAI_COMPATIBLE
```

This is useful for:

- Running local models with a unified API
- Using LiteLLM as a proxy to switch between providers (see the sketch below)
- Connecting to self-hosted inference servers
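
As one example, a minimal LiteLLM proxy setup might look like this (a sketch; the model name is illustrative, and it assumes `pip install 'litellm[proxy]'`):

```bash
# Write a minimal LiteLLM config that routes "llama3.1" to a local Ollama server
cat > litellm_config.yaml <<'EOF'
model_list:
  - model_name: llama3.1
    litellm_params:
      model: ollama/llama3.1
      api_base: http://localhost:11434
EOF

# Start the proxy on the port that OPENAI_COMPATIBLE_API_BASE points at
litellm --config litellm_config.yaml --port 4000
```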

---

## OpenRouter

Access multiple models through a single API at [openrouter.ai](https://openrouter.ai/).

```bash .env
ENABLE_OPENROUTER=true
LLM_KEY=OPENROUTER
OPENROUTER_API_KEY=sk-or-...
OPENROUTER_MODEL=mistralai/mistral-small-3.1-24b-instruct
```

---

## Groq

Inference on open-source models at [groq.com](https://groq.com/).

```bash .env
ENABLE_GROQ=true
LLM_KEY=GROQ
GROQ_API_KEY=gsk_...
GROQ_MODEL=llama-3.1-8b-instant
```

<Note>
Groq specializes in fast inference for open-source models. Response times are typically much faster than other providers, but model selection is limited.
</Note>

---

## Using multiple models

### Primary and secondary models

Configure a cheaper model for lightweight operations:

```bash .env
# Main model for complex decisions
LLM_KEY=ANTHROPIC_CLAUDE4.5_OPUS
# or: OPENAI_GPT5
# or: GEMINI_3_PRO

# Faster model for simple tasks like dropdown selection
SECONDARY_LLM_KEY=ANTHROPIC_CLAUDE4.5_SONNET
# or: OPENAI_GPT5_MINI
# or: GEMINI_3.0_FLASH
```

<Tip>
**Recommended primary models (latest):**
- **Anthropic Claude 4.5 Opus** (`ANTHROPIC_CLAUDE4.5_OPUS`) - Most capable
- **OpenAI GPT-5** (`OPENAI_GPT5`) - Latest
- **Google Gemini 3 Pro** (`GEMINI_3_PRO`) - Latest

**Recommended secondary models (latest):**
- **Claude 4.5 Sonnet** (`ANTHROPIC_CLAUDE4.5_SONNET`) - Balanced
- **GPT-5 Mini** (`OPENAI_GPT5_MINI`) - Faster GPT-5
- **Gemini 3.0 Flash** (`GEMINI_3.0_FLASH`) - Faster Gemini 3
</Tip>

### Task-specific models

For fine-grained control, you can override models for specific operations:

```bash .env
# Model for data extraction from pages (defaults to LLM_KEY if not set)
EXTRACTION_LLM_KEY=ANTHROPIC_CLAUDE4.5_SONNET

# Model for generating code/scripts in code blocks (defaults to LLM_KEY if not set)
SCRIPT_GENERATION_LLM_KEY=OPENAI_GPT5
```

Most deployments don't need task-specific models. Start with `LLM_KEY` and `SECONDARY_LLM_KEY`.

---

## Troubleshooting

### "To enable svg shape conversion, please set the Secondary LLM key"

Some operations require a secondary model. Set `SECONDARY_LLM_KEY` in your environment:

```bash .env
SECONDARY_LLM_KEY=OPENAI_GPT4O_MINI
```

### "Context window exceeded"

The page content is too large for the model's context window. Options:

- Use a model with a larger context window (GPT-5, Gemini 2.5 Pro, or Claude 4.5 Sonnet)
- Simplify your prompt so less page analysis is required
- Start from a more specific URL with less content

### "LLM caller not found"

The configured `LLM_KEY` doesn't match any enabled provider. Verify:

1. The provider is enabled (`ENABLE_OPENAI=true`, etc.)
2. The `LLM_KEY` value matches a supported model name exactly
3. Model names are case-sensitive: `OPENAI_GPT4O`, not `openai_gpt4o`
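
A quick way to spot mismatches is to print the relevant settings straight from your env file (a sketch; adjust the path to where your `.env` lives):

```bash
grep -E '^(ENABLE_|LLM_KEY|SECONDARY_LLM_KEY)' .env
```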

### Container logs show authentication errors

Check your API key configuration:

- Ensure the key is set correctly without extra whitespace
- Verify the key hasn't expired or been revoked
- For Azure, ensure `AZURE_API_BASE` includes the full URL with `https://`

### Slow response times

LLM calls typically take 2-10 seconds. Longer times may indicate:

- Network latency to the provider
- Rate limiting (the provider may be throttling requests)
- For Ollama, insufficient local compute resources

---

## Next steps

<CardGroup cols={2}>
<Card title="Browser Configuration" icon="window" href="/developers/self-hosted/browser">
Configure browser modes, locales, and display settings
</Card>
<Card title="Docker Setup" icon="docker" href="/developers/self-hosted/docker">
Return to the main Docker setup guide
</Card>
</CardGroup>