---
title: LLM Configuration
subtitle: Connect your preferred language model provider
slug: self-hosted/llm-configuration
---

Skyvern uses LLMs to analyze screenshots and decide what actions to take. You'll need to configure at least one LLM provider before running tasks.

## How Skyvern uses LLMs

Skyvern makes multiple LLM calls per task step:

1. **Screenshot analysis**: Identify interactive elements on the page
2. **Action planning**: Decide what to click, type, or extract
3. **Result extraction**: Parse data from the page into structured output

At roughly three calls per step, a task that runs for 10 steps makes 30+ LLM calls. Choose your provider and model tier with this in mind.

For most deployments, configure a single provider using `LLM_KEY`. Skyvern also supports a `SECONDARY_LLM_KEY` for lighter tasks to reduce costs.

---

## Quick Start Recommendations

**Best models for production (2025):**

| Provider | Primary Model | Secondary Model | Notes |
|----------|--------------|-----------------|-------|
| **Anthropic** | `ANTHROPIC_CLAUDE4.5_OPUS` | `ANTHROPIC_CLAUDE4.5_SONNET` | Most capable |
| **OpenAI** | `OPENAI_GPT5` | `OPENAI_GPT5_MINI` | Latest |
| **Google** | `GEMINI_3_PRO` | `GEMINI_3.0_FLASH` | Latest |
| **AWS Bedrock** | `BEDROCK_ANTHROPIC_CLAUDE4.5_OPUS_INFERENCE_PROFILE` | `BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE` | Latest Claude |

**New in 2025:** GPT-5 series, Claude 4.6 Opus, Gemini 3, Amazon Nova, and many new open-source models via Novita and VolcEngine.

---

## OpenAI

The most common choice. Requires an API key from [platform.openai.com](https://platform.openai.com/).

```bash .env
ENABLE_OPENAI=true
OPENAI_API_KEY=sk-...
LLM_KEY=OPENAI_GPT4O
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| **GPT-5 Series** | |
| `OPENAI_GPT5` | Recommended for most complex tasks |
| `OPENAI_GPT5_MINI` | |
| `OPENAI_GPT5_MINI_FLEX` | Flex service tier, 15min timeout |
| `OPENAI_GPT5_NANO` | |
| `OPENAI_GPT5_1` | |
| `OPENAI_GPT5_2` | |
| `OPENAI_GPT5_4` | |
| **GPT-4 Series** | |
| `OPENAI_GPT4O` | |
| `OPENAI_GPT4O_MINI` | |
| `OPENAI_GPT4_1` | |
| `OPENAI_GPT4_1_MINI` | |
| `OPENAI_GPT4_1_NANO` | |
| `OPENAI_GPT4_5` | |
| `OPENAI_GPT4_TURBO` | Legacy |
| `OPENAI_GPT4V` | Legacy alias |
| **O-Series (Reasoning)** | |
| `OPENAI_O4_MINI` | Vision support |
| `OPENAI_O3` | Vision support |
| `OPENAI_O3_MINI` | No vision |

### Optional settings

```bash .env
# Use a custom API endpoint (for proxies or compatible services)
OPENAI_API_BASE=https://your-proxy.com/v1

# Specify organization ID
OPENAI_ORGANIZATION=org-...
```
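If Skyvern later reports authentication errors, it helps to confirm the key works outside Skyvern first. A minimal check against OpenAI's standard model-listing endpoint (not Skyvern-specific):

```bash
# Sanity-check the key before wiring it into Skyvern.
# A valid key returns a JSON "data" array of model IDs; a bad key returns a 401 error.
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"
```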
---

## Anthropic

Claude models from [anthropic.com](https://www.anthropic.com/).

```bash .env
ENABLE_ANTHROPIC=true
ANTHROPIC_API_KEY=sk-ant-...
LLM_KEY=ANTHROPIC_CLAUDE3.5_SONNET
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| **Claude 4.6** | |
| `ANTHROPIC_CLAUDE4.6_OPUS` | Newest |
| **Claude 4.5** | |
| `ANTHROPIC_CLAUDE4.5_OPUS` | Recommended for primary use |
| `ANTHROPIC_CLAUDE4.5_SONNET` | Recommended for secondary use |
| `ANTHROPIC_CLAUDE4.5_HAIKU` | Fastest |
| **Claude 4** | |
| `ANTHROPIC_CLAUDE4_OPUS` | |
| `ANTHROPIC_CLAUDE4_SONNET` | |
| **Claude 3.7** | |
| `ANTHROPIC_CLAUDE3.7_SONNET` | |
| **Claude 3.5** | |
| `ANTHROPIC_CLAUDE3.5_SONNET` | |
| `ANTHROPIC_CLAUDE3.5_HAIKU` | |
| **Claude 3 (Legacy)** | |
| `ANTHROPIC_CLAUDE3_OPUS` | |
| `ANTHROPIC_CLAUDE3_SONNET` | |
| `ANTHROPIC_CLAUDE3_HAIKU` | |

---

## Azure OpenAI

Microsoft-hosted OpenAI models. Requires an Azure subscription with the OpenAI service provisioned.

```bash .env
ENABLE_AZURE=true
LLM_KEY=AZURE_OPENAI
AZURE_DEPLOYMENT=your-deployment-name
AZURE_API_KEY=your-azure-api-key
AZURE_API_BASE=https://your-resource.openai.azure.com/
AZURE_API_VERSION=2024-08-01-preview
```

### Setup steps

1. Create an Azure OpenAI resource in the [Azure Portal](https://portal.azure.com)
2. Open the Azure AI Foundry portal from your resource's overview page
3. Go to **Shared Resources** → **Deployments**
4. Click **Deploy Model** → **Deploy Base Model** → select GPT-4o or GPT-4
5. Note the **Deployment Name**. Use this for `AZURE_DEPLOYMENT`
6. Copy your API key and endpoint from the Azure Portal

The `AZURE_DEPLOYMENT` is the name you chose when deploying the model, not the model name itself.

---

## Google Gemini

Skyvern supports Gemini through two paths: the **Gemini API** (simpler, uses an API key) and **Vertex AI** (enterprise, uses a GCP service account).

### Gemini API

The quickest way to use Gemini. Get an API key from [Google AI Studio](https://aistudio.google.com/).

```bash .env
ENABLE_GEMINI=true
GEMINI_API_KEY=your-gemini-api-key
LLM_KEY=GEMINI_2.5_PRO
```

#### Available Gemini API models

| LLM_KEY | Notes |
|---------|-------|
| **Gemini 3** | |
| `GEMINI_3_PRO` | Recommended for primary use |
| `GEMINI_3.0_FLASH` | Recommended for secondary use |
| **Gemini 2.5** | |
| `GEMINI_2.5_PRO` | |
| `GEMINI_2.5_PRO_PREVIEW` | |
| `GEMINI_2.5_PRO_EXP_03_25` | Experimental |
| `GEMINI_2.5_FLASH` | |
| `GEMINI_2.5_FLASH_PREVIEW` | |
| **Gemini 2.0** | |
| `GEMINI_FLASH_2_0` | |
| `GEMINI_FLASH_2_0_LITE` | |
| **Gemini 1.5 Legacy** | |
| `GEMINI_PRO` | |
| `GEMINI_FLASH` | |

### Vertex AI

For enterprise deployments through [Vertex AI](https://cloud.google.com/vertex-ai). Requires a GCP project with Vertex AI enabled.

```bash .env
ENABLE_VERTEX_AI=true
LLM_KEY=VERTEX_GEMINI_3_PRO
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
GCP_PROJECT_ID=your-gcp-project-id
VERTEX_LOCATION=us-central1
```

If you're migrating from an older Skyvern version, `VERTEX_LOCATION` replaces the previous `GCP_REGION` variable. Update your `.env` accordingly.

**Vertex AI setup steps:**

1. Create a [GCP project](https://console.cloud.google.com/) with billing enabled
2. Enable the **Vertex AI API** in your project
3. Create a service account with the **Vertex AI User** role
4. Download the service account JSON key file
5. Set `GOOGLE_APPLICATION_CREDENTIALS` to the path of that file
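If you prefer the command line, steps 2–5 map onto `gcloud` commands. A sketch, assuming `gcloud` is already authenticated and using `skyvern-vertex` as an example service account name:

```bash
# Step 2: enable the Vertex AI API on the project
gcloud services enable aiplatform.googleapis.com --project your-gcp-project-id

# Step 3: create a service account ("skyvern-vertex" is an example name)...
gcloud iam service-accounts create skyvern-vertex --project your-gcp-project-id

# ...and grant it the Vertex AI User role
gcloud projects add-iam-policy-binding your-gcp-project-id \
  --member "serviceAccount:skyvern-vertex@your-gcp-project-id.iam.gserviceaccount.com" \
  --role "roles/aiplatform.user"

# Steps 4-5: download a JSON key to the path GOOGLE_APPLICATION_CREDENTIALS points at
gcloud iam service-accounts keys create /path/to/service-account.json \
  --iam-account skyvern-vertex@your-gcp-project-id.iam.gserviceaccount.com
```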
For global endpoint access, set `VERTEX_LOCATION=global` and ensure `VERTEX_PROJECT_ID` is set. Not all models support the global endpoint.

#### Available Vertex AI models

| LLM_KEY | Notes |
|---------|-------|
| **Gemini 3** | |
| `VERTEX_GEMINI_3_PRO` | Recommended for primary use |
| `VERTEX_GEMINI_3.0_FLASH` | Recommended for secondary use |
| **Gemini 2.5** | |
| `VERTEX_GEMINI_2.5_PRO` | |
| `VERTEX_GEMINI_2.5_PRO_PREVIEW` | |
| `VERTEX_GEMINI_2.5_FLASH` | |
| `VERTEX_GEMINI_2.5_FLASH_LITE` | |
| `VERTEX_GEMINI_2.5_FLASH_PREVIEW` | |
| **Gemini 2.0** | |
| `VERTEX_GEMINI_FLASH_2_0` | |
| **Gemini 1.5 Legacy** | |
| `VERTEX_GEMINI_PRO` | |
| `VERTEX_GEMINI_FLASH` | |

---

## Amazon Bedrock

Run Anthropic Claude through your AWS account.

```bash .env
ENABLE_BEDROCK=true
LLM_KEY=BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET
AWS_REGION=us-west-2
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
```

### Setup steps

1. Create an IAM user with the `AmazonBedrockFullAccess` policy
2. Generate access keys for the IAM user
3. In the [Bedrock console](https://console.aws.amazon.com/bedrock/), go to **Model Access**
4. Enable access to Claude 3.5 Sonnet

### Available models

| LLM_KEY | Notes |
|---------|-------|
| **Amazon Nova (AWS Native)** | |
| `BEDROCK_AMAZON_NOVA_PRO` | |
| `BEDROCK_AMAZON_NOVA_LITE` | |
| **Claude 4.6** | |
| `BEDROCK_ANTHROPIC_CLAUDE4.6_OPUS_INFERENCE_PROFILE` | Cross-region |
| **Claude 4.5** | |
| `BEDROCK_ANTHROPIC_CLAUDE4.5_OPUS_INFERENCE_PROFILE` | Cross-region |
| `BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE` | Cross-region |
| **Claude 4** | |
| `BEDROCK_ANTHROPIC_CLAUDE4_OPUS_INFERENCE_PROFILE` | Cross-region |
| `BEDROCK_ANTHROPIC_CLAUDE4_SONNET_INFERENCE_PROFILE` | Cross-region |
| **Claude 3.7** | |
| `BEDROCK_ANTHROPIC_CLAUDE3.7_SONNET_INFERENCE_PROFILE` | Cross-region |
| **Claude 3.5** | |
| `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET` | v2 |
| `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_V1` | |
| `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_INFERENCE_PROFILE` | Cross-region |
| `BEDROCK_ANTHROPIC_CLAUDE3.5_HAIKU` | |
| **Claude 3 (Legacy)** | |
| `BEDROCK_ANTHROPIC_CLAUDE3_OPUS` | |
| `BEDROCK_ANTHROPIC_CLAUDE3_SONNET` | |
| `BEDROCK_ANTHROPIC_CLAUDE3_HAIKU` | |

Bedrock inference profile keys (`*_INFERENCE_PROFILE`) use cross-region inference and require `AWS_REGION` only. No access keys are needed if you're running on an IAM-authenticated instance.
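To confirm your credentials and region can reach Bedrock at all, one option is the AWS CLI (the `--query` filter just trims the output to model IDs):

```bash
# List the Anthropic model IDs visible to this account in us-west-2.
# An authentication or permission error here points at steps 1-2 above.
aws bedrock list-foundation-models --region us-west-2 \
  --by-provider anthropic \
  --query 'modelSummaries[].modelId'
```

Note that this lists models available in the region; model *access* (step 4) is still granted separately in the console.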
---

## MiniMax

[MiniMax](https://www.minimax.io/) models with vision support.

```bash .env
ENABLE_MINIMAX=true
MINIMAX_API_KEY=your-minimax-api-key
LLM_KEY=MINIMAX_M2_5
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| `MINIMAX_M2_5` | |
| `MINIMAX_M2_5_HIGHSPEED` | Faster variant |

### Optional settings

```bash .env
# Use a custom API endpoint
MINIMAX_API_BASE=https://api.minimax.io/v1
```

---

## VolcEngine (ByteDance Doubao)

[VolcEngine](https://www.volcengine.com/) provides access to ByteDance's Doubao models with vision support.

```bash .env
ENABLE_VOLCENGINE=true
VOLCENGINE_API_KEY=your-volcengine-api-key
LLM_KEY=VOLCENGINE_DOUBAO_SEED_1_6
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| `VOLCENGINE_DOUBAO_SEED_1_6` | Recommended for general use |
| `VOLCENGINE_DOUBAO_SEED_1_6_FLASH` | Faster variant |
| `VOLCENGINE_DOUBAO_1_5_THINKING_VISION_PRO` | Reasoning model |

### Optional settings

```bash .env
# Use a custom API endpoint
VOLCENGINE_API_BASE=https://ark.cn-beijing.volces.com/api/v3
```

---

## Novita

[Novita AI](https://novita.ai/) provides access to DeepSeek, Llama, and other open-source models.

```bash .env
ENABLE_NOVITA=true
NOVITA_API_KEY=your-novita-api-key
LLM_KEY=NOVITA_LLAMA_3_2_11B_VISION
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| **DeepSeek** | |
| `NOVITA_DEEPSEEK_R1` | Reasoning model |
| `NOVITA_DEEPSEEK_V3` | |
| **Llama 3.3** | |
| `NOVITA_LLAMA_3_3_70B` | |
| **Llama 3.2** | |
| `NOVITA_LLAMA_3_2_11B_VISION` | Vision support |
| `NOVITA_LLAMA_3_2_3B` | |
| `NOVITA_LLAMA_3_2_1B` | |
| **Llama 3.1** | |
| `NOVITA_LLAMA_3_1_405B` | |
| `NOVITA_LLAMA_3_1_70B` | |
| `NOVITA_LLAMA_3_1_8B` | |
| **Llama 3** | |
| `NOVITA_LLAMA_3_70B` | |
| `NOVITA_LLAMA_3_8B` | |

---

## Moonshot

[Moonshot AI](https://www.moonshot.cn/) provides the Kimi series models with long context support.

```bash .env
ENABLE_MOONSHOT=true
MOONSHOT_API_KEY=your-moonshot-api-key
LLM_KEY=MOONSHOT_KIMI_K2
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| `MOONSHOT_KIMI_K2` | |

### Optional settings

```bash .env
# Use a custom API endpoint
MOONSHOT_API_BASE=https://api.moonshot.cn/v1
```

---

## Inception

[Inception AI](https://inception.ai/) provides the Mercury series models.

```bash .env
ENABLE_INCEPTION=true
INCEPTION_API_KEY=your-inception-api-key
LLM_KEY=INCEPTION_MERCURY_2
```

### Available models

| LLM_KEY | Notes |
|---------|-------|
| `INCEPTION_MERCURY_2` | |

### Optional settings

```bash .env
# Use a custom API endpoint
INCEPTION_API_BASE=https://api.inception.ai/v1
```

---

## Ollama (Local Models)

Run open-source models locally with [Ollama](https://ollama.ai/). No API costs, but requires sufficient local compute.

```bash .env
ENABLE_OLLAMA=true
LLM_KEY=OLLAMA
OLLAMA_MODEL=llama3.1
OLLAMA_SERVER_URL=http://host.docker.internal:11434
OLLAMA_SUPPORTS_VISION=false
```

### Setup steps

1. [Install Ollama](https://ollama.ai/download)
2. Pull a model: `ollama pull llama3.1`
3. Start Ollama: `ollama serve`
4. Configure Skyvern to connect

Most Ollama models don't support vision. Set `OLLAMA_SUPPORTS_VISION=false`. Without vision, Skyvern relies on DOM analysis instead of screenshot analysis, which may reduce accuracy on complex pages.

### Docker networking

When running Skyvern in Docker and Ollama on the host:

| Host OS | OLLAMA_SERVER_URL |
|---------|-------------------|
| macOS/Windows | `http://host.docker.internal:11434` |
| Linux | `http://172.17.0.1:11434` (Docker bridge IP) |

---

## OpenAI-Compatible Endpoints

Connect to any service that implements the OpenAI API format, including LiteLLM, LocalAI, vLLM, and text-generation-inference.

```bash .env
ENABLE_OPENAI_COMPATIBLE=true
OPENAI_COMPATIBLE_MODEL_NAME=llama3.1
OPENAI_COMPATIBLE_API_KEY=sk-test
OPENAI_COMPATIBLE_API_BASE=http://localhost:4000/v1
LLM_KEY=OPENAI_COMPATIBLE
```

This is useful for:

- Running local models with a unified API
- Using LiteLLM as a proxy to switch between providers (see the sketch below)
- Connecting to self-hosted inference servers
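As a concrete example of the proxy pattern, here is a minimal LiteLLM sketch that puts the Ollama model from the previous section behind an OpenAI-format endpoint. The config file name is arbitrary, and `ollama/llama3.1` assumes you pulled that model:

```bash
# Write a minimal LiteLLM proxy config (file name is just an example)
cat > litellm-config.yaml <<'EOF'
model_list:
  - model_name: llama3.1        # what Skyvern sends as OPENAI_COMPATIBLE_MODEL_NAME
    litellm_params:
      model: ollama/llama3.1    # route requests to a local Ollama model
      api_base: http://localhost:11434
EOF

# Start the proxy on the port that OPENAI_COMPATIBLE_API_BASE points at
litellm --config litellm-config.yaml --port 4000
```

With the proxy running, the `OPENAI_COMPATIBLE_*` settings above work unchanged; switching providers later means editing the LiteLLM config, not Skyvern's `.env`.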
---

## OpenRouter

Access multiple models through a single API at [openrouter.ai](https://openrouter.ai/).

```bash .env
ENABLE_OPENROUTER=true
LLM_KEY=OPENROUTER
OPENROUTER_API_KEY=sk-or-...
OPENROUTER_MODEL=mistralai/mistral-small-3.1-24b-instruct
```

---

## Groq

Inference on open-source models at [groq.com](https://groq.com/).

```bash .env
ENABLE_GROQ=true
LLM_KEY=GROQ
GROQ_API_KEY=gsk_...
GROQ_MODEL=llama-3.1-8b-instant
```

Groq specializes in fast inference for open-source models. Response times are typically much faster than other providers, but model selection is limited.

---

## Using multiple models

### Primary and secondary models

Configure a cheaper model for lightweight operations:

```bash .env
# Main model for complex decisions
LLM_KEY=ANTHROPIC_CLAUDE4.5_OPUS
# or: OPENAI_GPT5
# or: GEMINI_3_PRO

# Faster model for simple tasks like dropdown selection
SECONDARY_LLM_KEY=ANTHROPIC_CLAUDE4.5_SONNET
# or: OPENAI_GPT5_MINI
# or: GEMINI_3.0_FLASH
```

**Recommended primary models (latest):**

- **Anthropic Claude 4.5 Opus** (`ANTHROPIC_CLAUDE4.5_OPUS`) - Most capable
- **OpenAI GPT-5** (`OPENAI_GPT5`) - Latest
- **Google Gemini 3 Pro** (`GEMINI_3_PRO`) - Latest

**Recommended secondary models (latest):**

- **Claude 4.5 Sonnet** (`ANTHROPIC_CLAUDE4.5_SONNET`) - Balanced
- **GPT-5 Mini** (`OPENAI_GPT5_MINI`) - Faster GPT-5
- **Gemini 3.0 Flash** (`GEMINI_3.0_FLASH`) - Faster Gemini 3

### Task-specific models

For fine-grained control, you can override models for specific operations:

```bash .env
# Model for data extraction from pages (defaults to LLM_KEY if not set)
EXTRACTION_LLM_KEY=ANTHROPIC_CLAUDE4.5_SONNET

# Model for generating code/scripts in code blocks (defaults to LLM_KEY if not set)
SCRIPT_GENERATION_LLM_KEY=OPENAI_GPT5
```

Most deployments don't need task-specific models. Start with `LLM_KEY` and `SECONDARY_LLM_KEY`.

---

## Troubleshooting

### "To enable svg shape conversion, please set the Secondary LLM key"

Some operations require a secondary model. Set `SECONDARY_LLM_KEY` in your environment:

```bash .env
SECONDARY_LLM_KEY=OPENAI_GPT4O_MINI
```

### "Context window exceeded"

The page content is too large for the model's context window. Options:

- Use a model with larger context support (GPT-5, Gemini 2.5 Pro, or Claude 4.5 Sonnet)
- Simplify your prompt to require less page analysis
- Start from a more specific URL with less content

### "LLM caller not found"

The configured `LLM_KEY` doesn't match any enabled provider. Verify:

1. The provider is enabled (`ENABLE_OPENAI=true`, etc.)
2. The `LLM_KEY` value matches a supported model name exactly
3. Model names are case-sensitive: `OPENAI_GPT4O`, not `openai_gpt4o`

### Container logs show authentication errors

Check your API key configuration:

- Ensure the key is set correctly, without extra whitespace
- Verify the key hasn't expired or been revoked
- For Azure, ensure `AZURE_API_BASE` includes the full URL with `https://`

### Slow response times

LLM calls typically take 2-10 seconds. Longer times may indicate:

- Network latency to the provider
- Rate limiting (the provider may be throttling requests)
- For Ollama, insufficient local compute resources

---

## Next steps

- Configure browser modes, locales, and display settings
- Return to the main Docker setup guide