
---
sidebar_position: 2
title: Configure LLM Provider
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import { PanelLeft } from 'lucide-react';
import { ModelSelectionTip } from '@site/src/components/ModelSelectionTip';

# Supported LLM Providers

Goose is compatible with a wide range of LLM providers, allowing you to choose and integrate your preferred model.

:::tip Model Selection
[Berkeley Function-Calling Leaderboard][function-calling-leaderboard] can be a good guide for selecting models.
:::

## Available Providers

| Provider | Description | Parameters |
|----------|-------------|------------|
| Amazon Bedrock | Offers a variety of foundation models, including Claude, Jurassic-2, and others. AWS environment variables must be set in advance, not configured through `goose configure`. | `AWS_PROFILE`, or `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION` |
| Amazon SageMaker TGI | Run Text Generation Inference models through Amazon SageMaker endpoints. AWS credentials must be configured in advance. | `SAGEMAKER_ENDPOINT_NAME`, `AWS_REGION` (optional), `AWS_PROFILE` (optional) |
| Anthropic | Offers Claude, an advanced AI model for natural language tasks. | `ANTHROPIC_API_KEY`, `ANTHROPIC_HOST` (optional) |
| Azure OpenAI | Access Azure-hosted OpenAI models, including GPT-4 and GPT-3.5. Supports both API key and Azure credential chain authentication. | `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT_NAME`, `AZURE_OPENAI_API_KEY` (optional) |
| Databricks | Unified data analytics and AI platform for building and deploying models. | `DATABRICKS_HOST`, `DATABRICKS_TOKEN` |
| Docker Model Runner | Local models running in Docker Desktop or Docker CE with OpenAI-compatible API endpoints. Because this provider runs locally, you must first download a model. | `OPENAI_HOST`, `OPENAI_BASE_PATH` |
| Gemini | Advanced LLMs by Google with multimodal capabilities (text, images). | `GOOGLE_API_KEY` |
| GCP Vertex AI | Google Cloud's Vertex AI platform, supporting Gemini and Claude models. Credentials must be configured in advance. | `GCP_PROJECT_ID`, `GCP_LOCATION`, and optionally `GCP_MAX_RATE_LIMIT_RETRIES` (5), `GCP_MAX_OVERLOADED_RETRIES` (5), `GCP_INITIAL_RETRY_INTERVAL_MS` (5000), `GCP_BACKOFF_MULTIPLIER` (2.0), `GCP_MAX_RETRY_INTERVAL_MS` (320_000) |
| GitHub Copilot | Access to AI models from OpenAI, Anthropic, Google, and other providers through GitHub's Copilot infrastructure. Requires a GitHub account with Copilot access. | No manual key. Must be configured through the CLI using the GitHub authentication flow to enable both CLI and Desktop access. |
| Groq | High-performance inference hardware and tools for LLMs. | `GROQ_API_KEY` |
| LiteLLM | LiteLLM proxy supporting multiple models with automatic prompt caching and unified API access. | `LITELLM_HOST`, `LITELLM_BASE_PATH` (optional), `LITELLM_API_KEY` (optional), `LITELLM_CUSTOM_HEADERS` (optional), `LITELLM_TIMEOUT` (optional) |
| Ollama | Local model runner supporting Qwen, Llama, DeepSeek, and other open-source models. Because this provider runs locally, you must first download and run a model. | `OLLAMA_HOST` |
| Ramalama | Local models using native OCI container runtimes and CNCF tools, supporting models as OCI artifacts. Ramalama's API is compatible with Ollama, so it can be used with the Goose Ollama provider. Supports Qwen, Llama, DeepSeek, and other open-source models. Because this provider runs locally, you must first download and run a model. | `OLLAMA_HOST` |
| OpenAI | Provides gpt-4o, o1, and other advanced language models. Also supports OpenAI-compatible endpoints (e.g., self-hosted LLaMA, vLLM, KServe). o1-mini and o1-preview are not supported because Goose uses tool calling. | `OPENAI_API_KEY`, `OPENAI_HOST` (optional), `OPENAI_ORGANIZATION` (optional), `OPENAI_PROJECT` (optional), `OPENAI_CUSTOM_HEADERS` (optional) |
| OpenRouter | API gateway for unified access to various models with features like rate-limiting management. | `OPENROUTER_API_KEY` |
| Snowflake | Access the latest models using Snowflake Cortex services, including Claude models. Requires a Snowflake account and programmatic access token (PAT). | `SNOWFLAKE_HOST`, `SNOWFLAKE_TOKEN` |
| Tetrate Agent Router Service | Unified API gateway for AI models including Claude, Gemini, GPT, open-weight models, and others. Supports the PKCE authentication flow for secure API key generation. | `TETRATE_API_KEY`, `TETRATE_HOST` (optional) |
| Venice AI | Provides access to open source models like Llama, Mistral, and Qwen while prioritizing user privacy. Requires an account and an API key. | `VENICE_API_KEY`, `VENICE_HOST` (optional), `VENICE_BASE_PATH` (optional), `VENICE_MODELS_PATH` (optional) |
| xAI | Access to xAI's Grok models including grok-3, grok-3-mini, and grok-3-fast with a 131,072-token context window. | `XAI_API_KEY`, `XAI_HOST` (optional) |

## CLI Providers

Goose also supports special "pass-through" providers that work with existing CLI tools, allowing you to use your subscriptions instead of paying per token:

| Provider | Description | Requirements |
|----------|-------------|--------------|
| Claude Code (`claude-code`) | Uses Anthropic's Claude CLI tool with your Claude Code subscription. Provides access to Claude with a 200K context limit. | Claude CLI installed and authenticated, active Claude Code subscription |
| Cursor Agent (`cursor-agent`) | Uses Cursor's AI CLI tool with your Cursor subscription. Provides access to GPT-5, Claude 4, and other models through the `cursor-agent` command-line interface. | `cursor-agent` CLI installed and authenticated |
| Gemini CLI (`gemini-cli`) | Uses Google's Gemini CLI tool with your Google AI subscription. Provides access to Gemini with a 1M context limit. | Gemini CLI installed and authenticated |

:::tip CLI Providers
CLI providers are cost-effective alternatives that use your existing subscriptions. They work differently from API providers as they execute CLI commands and integrate with the tools' native capabilities. See the CLI Providers guide for detailed setup instructions.
:::

## Configure Provider

To configure your chosen provider or see available options, visit the `Models` tab in Goose Desktop or run `goose configure` in the CLI.

<Tabs groupId="interface">
  <TabItem value="ui" label="Goose Desktop" default>
    **To update your LLM provider and API key:**

    1. Click the <PanelLeft className="inline" size={16} /> button in the top-left to open the sidebar
    2. Click the `Settings` button on the sidebar
    3. Click the `Models` tab
    4. Click `Configure providers`
    5. Click your provider in the list
    6. Add your API key and other required configurations, then click `Submit`

To change your current model:

  1. Click the <PanelLeft className="inline" size={16} /> button in the top-left to open the sidebar
  2. Click the `Settings` button on the sidebar
  3. Click the `Models` tab
  4. Click `Switch models`
  5. Choose from your configured providers in the dropdown, or select `Use other provider` to configure a new one
  6. Select a model from the available options, or choose `Use custom model` to enter a specific model name
  7. Click `Select model` to confirm your choice

:::tip Shortcut
For faster access, click your current model name at the bottom of the app and choose `Change Model`.
:::

To start over with provider and model configuration:

  1. Click the <PanelLeft className="inline" size={16} /> button in the top-left to open the sidebar
  2. Click the `Settings` button on the sidebar
  3. Click the `Models` tab
  4. Click `Reset Provider and Model` to clear your current settings and return to the welcome screen
  </TabItem>
  <TabItem value="cli" label="Goose CLI">
    1. Run the following command:
```sh
goose configure
```

2. Select `Configure Providers` from the menu and press Enter.

```
┌   goose-configure
│
◆  What would you like to configure?
│  ● Configure Providers (Change provider or update credentials)
│  ○ Add Extension
│  ○ Toggle Extensions
│  ○ Remove Extension
│  ○ Goose Settings
└
```

3. Choose a model provider and press Enter.

```
┌   goose-configure
│
◇  What would you like to configure?
│  Configure Providers
│
◆  Which model provider should we use?
│  ● Anthropic (Claude and other models from Anthropic)
│  ○ Azure OpenAI
│  ○ Amazon Bedrock
│  ○ Claude Code
│  ○ Databricks
│  ○ ...
└
```

4. Enter your API key (and any other configuration details) when prompted.

```
┌   goose-configure
│
◇  What would you like to configure?
│  Configure Providers
│
◇  Which model provider should we use?
│  Anthropic
│
◆  Provider Anthropic requires ANTHROPIC_API_KEY, please enter a value
│  ▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪
└
```

5. Enter your desired `ANTHROPIC_HOST`, or press `Enter` to accept the default.

```
◇  Enter new value for ANTHROPIC_HOST
│  https://api.anthropic.com (default)
```
6. Enter the model you want to use, or press `Enter` to accept the default.
```
│
◇  Model fetch complete
│
◇  Enter a model from that provider:
│  claude-sonnet-4-0 (default)
│
◓  Checking your configuration...
└  Configuration saved successfully
```
  </TabItem>
</Tabs>

## Using Custom OpenAI Endpoints

Goose supports using custom OpenAI-compatible endpoints, which is particularly useful for:
- Self-hosted LLMs (e.g., LLaMA, Mistral) using vLLM or KServe
- Private OpenAI-compatible API servers
- Enterprise deployments requiring data governance and security compliance
- OpenAI API proxies or gateways

### Configuration Parameters

| Parameter | Required | Description |
|-----------|----------|-------------|
| `OPENAI_API_KEY` | Yes | Authentication key for the API |
| `OPENAI_HOST` | No | Custom endpoint URL (defaults to api.openai.com) |
| `OPENAI_ORGANIZATION` | No | Organization ID for usage tracking and governance |
| `OPENAI_PROJECT` | No | Project identifier for resource management |
| `OPENAI_CUSTOM_HEADERS` | No | Additional headers to include in the request. Can be set via environment variable, configuration file, or CLI, in the format `HEADER_A=VALUE_A,HEADER_B=VALUE_B`. |
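
As an illustration of the `OPENAI_CUSTOM_HEADERS` format, the comma-separated pairs split like this (a portable shell sketch of the format only, not how Goose itself parses the value):

```sh
# Illustration of the HEADER_A=VALUE_A,HEADER_B=VALUE_B format
OPENAI_CUSTOM_HEADERS="X-Header-A=abc,X-Header-B=def"
echo "$OPENAI_CUSTOM_HEADERS" | tr ',' '\n' | while IFS='=' read -r name value; do
  echo "$name: $value"   # header name, then its value
done
```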

### Example Configurations

<Tabs groupId="deployment">
  <TabItem value="vllm" label="vLLM Self-Hosted" default>
    If you're running LLaMA or other models using vLLM with OpenAI compatibility:
    ```sh
    OPENAI_HOST=https://your-vllm-endpoint.internal
    OPENAI_API_KEY=your-internal-api-key
    ```
  </TabItem>
  <TabItem value="kserve" label="KServe Deployment">
    For models deployed on Kubernetes using KServe:
    ```sh
    OPENAI_HOST=https://kserve-gateway.your-cluster
    OPENAI_API_KEY=your-kserve-api-key
    OPENAI_ORGANIZATION=your-org-id
    OPENAI_PROJECT=ml-serving
    ```
  </TabItem>
  <TabItem value="enterprise" label="Enterprise OpenAI">
    For enterprise OpenAI deployments with governance:
    ```sh
    OPENAI_API_KEY=your-api-key
    OPENAI_ORGANIZATION=org-id123
    OPENAI_PROJECT=compliance-approved
    ```
  </TabItem>
  <TabItem value="custom-headers" label="Custom Headers">
    For OpenAI-compatible endpoints that require custom headers:
    ```sh
    OPENAI_API_KEY=your-api-key
    OPENAI_ORGANIZATION=org-id123
    OPENAI_PROJECT=compliance-approved
    OPENAI_CUSTOM_HEADERS="X-Header-A=abc,X-Header-B=def"
    ```
  </TabItem>
</Tabs>
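
Before pointing Goose at a custom endpoint, it can help to smoke-test it directly. This assumes the server implements the standard OpenAI `/v1/chat/completions` route; the model name below is a placeholder for one your server actually serves:

```sh
curl -s "$OPENAI_HOST/v1/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "your-model-name", "messages": [{"role": "user", "content": "ping"}]}'
```

A JSON response with a `choices` array indicates the endpoint is OpenAI-compatible and reachable.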

### Setup Instructions

<Tabs groupId="interface">
  <TabItem value="ui" label="Goose Desktop" default>
    1. Click the <PanelLeft className="inline" size={16} /> button in the top-left to open the sidebar
    2. Click the `Settings` button on the sidebar
    3. Click the `Models` tab
    4. Click `Configure providers`
    5. Click `OpenAI` in the provider list
    6. Fill in your configuration details:
       - API Key (required)
       - Host URL (for custom endpoints)
       - Organization ID (for usage tracking)
       - Project (for resource management)
    7. Click `Submit`
  </TabItem>
  <TabItem value="cli" label="Goose CLI">
    1. Run `goose configure`
    2. Select `Configure Providers`
    3. Choose `OpenAI` as the provider
    4. Enter your configuration when prompted:
       - API key
       - Host URL (if using custom endpoint)
       - Organization ID (if using organization tracking)
       - Project identifier (if using project management)
  </TabItem>
</Tabs>

:::tip Enterprise Deployment
For enterprise deployments, you can pre-configure these values using environment variables or configuration files to ensure consistent governance across your organization.
:::
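
For example, a wrapper script could export the governance values before launching Goose. The endpoint and IDs below are placeholders for your organization's values:

```sh
# Hypothetical enterprise defaults; replace with your organization's values
export OPENAI_HOST="https://openai-gateway.example.com"
export OPENAI_ORGANIZATION="org-id123"
export OPENAI_PROJECT="compliance-approved"
goose configure
```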

## Using Goose for Free

Goose is a free and open source AI agent that you can start using right away, but not all supported [LLM Providers][providers] offer a free tier.

Below, we outline a couple of free options and how to get started with them.

:::warning Limitations
These free options are a great way to get started with Goose and explore its capabilities. However, you may need to upgrade your LLM for better performance.
:::


### Groq
Groq provides free access to open source models with high-speed inference. To use Groq with Goose, you need an API key from [Groq Console](https://console.groq.com/keys).

Groq offers several open source models that support tool calling:
- **moonshotai/kimi-k2-instruct** - Mixture-of-Experts model with 1 trillion parameters, optimized for agentic intelligence and tool use
- **qwen/qwen3-32b** - 32.8 billion parameter model with advanced reasoning and multilingual capabilities  
- **gemma2-9b-it** - Google's Gemma 2 model with instruction tuning
- **llama-3.3-70b-versatile** - Meta's Llama 3.3 model for versatile applications
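
Since Groq exposes an OpenAI-compatible API, you can list the models currently available to your key before picking one (requires a valid `GROQ_API_KEY`):

```sh
curl -s https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer $GROQ_API_KEY"
```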

To set up Groq with Goose, follow these steps:

<Tabs groupId="interface">
  <TabItem value="ui" label="Goose Desktop" default>
  **To update your LLM provider and API key:** 

    1. Click the <PanelLeft className="inline" size={16} /> button in the top-left to open the sidebar.
    2. Click the `Settings` button on the sidebar.
    3. Click the `Models` tab.
    4. Click `Configure Providers`.
    5. Choose `Groq` as the provider from the list.
    6. Click `Configure`, enter your API key, and click `Submit`.

  </TabItem>
  <TabItem value="cli" label="Goose CLI">
    1. Run: 
    ```sh
    goose configure
    ```
    2. Select `Configure Providers` from the menu.
    3. Follow the prompts to choose `Groq` as the provider.
    4. Enter your API key when prompted.
    5. Enter the Groq model of your choice (e.g., `moonshotai/kimi-k2-instruct`).
  </TabItem>
</Tabs>

### Google Gemini
Google Gemini provides a free tier. To start using the Gemini API with Goose, you need an API Key from [Google AI studio](https://aistudio.google.com/app/apikey).

To set up Google Gemini with Goose, follow these steps:

<Tabs groupId="interface">
  <TabItem value="ui" label="Goose Desktop" default>
  **To update your LLM provider and API key:** 

    1. Click the <PanelLeft className="inline" size={16} /> button in the top-left to open the sidebar.
    2. Click the `Settings` button on the sidebar.
    3. Click the `Models` tab.
    4. Click `Configure Providers`.
    5. Choose `Google Gemini` as the provider from the list.
    6. Click `Configure`, enter your API key, and click `Submit`.

  </TabItem>
  <TabItem value="cli" label="Goose CLI">
    1. Run: 
    ```sh
    goose configure
    ```
    2. Select `Configure Providers` from the menu.
    3. Follow the prompts to choose `Google Gemini` as the provider.
    4. Enter your API key when prompted.
    5. Enter the Gemini model of your choice.

    ```
    ┌   goose-configure
    │
    ◇ What would you like to configure?
    │ Configure Providers
    │
    ◇ Which model provider should we use?
    │ Google Gemini
    │
    ◇ Provider Google Gemini requires GOOGLE_API_KEY, please enter a value
    │  ▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪
    │    
    ◇ Enter a model from that provider:
    │ gemini-2.0-flash-exp
    │
    ◇ Hello! You're all set and ready to go, feel free to ask me anything!
    │
    └ Configuration saved successfully
    ```
  </TabItem>
</Tabs>


### Local LLMs

Goose is a local AI agent, and by using a local LLM, you keep your data private, maintain full control over your environment, and can work entirely offline without relying on cloud access. However, local LLMs require a bit more setup before you can use them with Goose.

:::warning Limited Support for models without tool calling
Goose extensively uses tool calling, so models without it can only do chat completion. If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions).
:::

Here are some local providers we support:

<Tabs groupId="local-llms">
  <TabItem value="ollama" label="Ollama" default>
    <Tabs groupId="ollama-models">
      <TabItem value="ramalama" label="Ramalama">
        1. [Download Ramalama](https://github.com/containers/ramalama?tab=readme-ov-file#install).
        2. In a terminal, run any Ollama [model that supports tool calling](https://ollama.com/search?c=tools) or [GGUF-format Hugging Face model](https://huggingface.co/search/full-text?q=%22tools+support%22+%2B+%22gguf%22&type=model):

          The `--runtime-args="--jinja"` flag is required for Ramalama to work with the Goose Ollama provider.

          Example:

          ```sh
          ramalama serve --runtime-args="--jinja" ollama://qwen2.5
          ```

          3. In a separate terminal window, configure with Goose:

          ```sh
          goose configure
          ```

          4. Choose to `Configure Providers`

          ```
          ┌   goose-configure
          │
          ◆  What would you like to configure?
          │  ● Configure Providers (Change provider or update credentials)
          │  ○ Toggle Extensions
          │  ○ Add Extension
          └
          ```

          5. Choose `Ollama` as the model provider since Ramalama is API compatible and can use the Goose Ollama provider

          ```
          ┌   goose-configure
          │
          ◇  What would you like to configure?
          │  Configure Providers
          │
          ◆  Which model provider should we use?
          │  ○ Anthropic
          │  ○ Databricks
          │  ○ Google Gemini
          │  ○ Groq
          │  ● Ollama (Local open source models)
          │  ○ OpenAI
          │  ○ OpenRouter
          └
          ```

          6. Enter the host where your model is running

          :::info Endpoint
          For the Ollama provider, if you don't provide a host, we set it to `localhost:11434`. When constructing the URL, we prepend `http://` if the scheme is not `http` or `https`. Since Ramalama serves on port 8080 by default, set `OLLAMA_HOST=http://0.0.0.0:8080`.
          :::

          ```
          ┌   goose-configure
          │
          ◇  What would you like to configure?
          │  Configure Providers
          │
          ◇  Which model provider should we use?
          │  Ollama
          │
          ◆  Provider Ollama requires OLLAMA_HOST, please enter a value
          │  http://0.0.0.0:8080
          └
          ```


          7. Enter the model you have running

          ```
          ┌   goose-configure
          │
          ◇  What would you like to configure?
          │  Configure Providers
          │
          ◇  Which model provider should we use?
          │  Ollama
          │
          ◇  Provider Ollama requires OLLAMA_HOST, please enter a value
          │  http://0.0.0.0:8080
          │
          ◇  Enter a model from that provider:
          │  qwen2.5
          │
          ◇  Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together!
          │
          └  Configuration saved successfully
          ```

          :::tip Context Length
          If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 2048 tokens is too low. Pass the [`--ctx-size` (`-c`) option](https://github.com/containers/ramalama/blob/main/docs/ramalama-serve.1.md#--ctx-size--c) to `ramalama serve` to set a higher value.
          :::
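
          For example, to serve the same model with a larger context window (8192 here is an arbitrary choice; size it to your hardware):

          ```sh
          ramalama serve --runtime-args="--jinja" --ctx-size 8192 ollama://qwen2.5
          ```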

      </TabItem>
      <TabItem value="deepseek" label="DeepSeek-R1">
        The native `DeepSeek-r1` model doesn't support tool calling; however, there is a [custom model](https://ollama.com/michaelneale/deepseek-r1-goose) you can use with Goose.

        :::warning
        Note that this is a 70B model size and requires a powerful device to run smoothly.
        :::


        1. [Download Ollama](https://ollama.com/download). 
        2. In a terminal window, run the following command to install the custom DeepSeek-r1 model:

        ```sh
        ollama run michaelneale/deepseek-r1-goose
        ```

        3. In a separate terminal window, configure with Goose:

        ```sh
        goose configure
        ```

        4. Choose to `Configure Providers`

        ```
        ┌   goose-configure 
        │
        ◆  What would you like to configure?
        │  ● Configure Providers (Change provider or update credentials)
        │  ○ Toggle Extensions 
        │  ○ Add Extension 
        └  
        ```

        5. Choose `Ollama` as the model provider

        ```
        ┌   goose-configure 
        │
        ◇  What would you like to configure?
        │  Configure Providers 
        │
        ◆  Which model provider should we use?
        │  ○ Anthropic 
        │  ○ Databricks 
        │  ○ Google Gemini 
        │  ○ Groq 
        │  ● Ollama (Local open source models)
        │  ○ OpenAI 
        │  ○ OpenRouter 
        └  
        ```

        6. Enter the host where your model is running

        ```
        ┌   goose-configure 
        │
        ◇  What would you like to configure?
        │  Configure Providers 
        │
        ◇  Which model provider should we use?
        │  Ollama 
        │
        ◆  Provider Ollama requires OLLAMA_HOST, please enter a value
        │  http://localhost:11434
        └
        ```

        7. Enter the installed model from above

        ```
        ┌   goose-configure 
        │
        ◇  What would you like to configure?
        │  Configure Providers 
        │
        ◇  Which model provider should we use?
        │  Ollama 
        │
        ◇   Provider Ollama requires OLLAMA_HOST, please enter a value
        │  http://localhost:11434  
        │    
        ◇  Enter a model from that provider:
        │  michaelneale/deepseek-r1-goose
        │
        ◇  Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together!
        │
        └  Configuration saved successfully
        ```
      </TabItem>
      <TabItem value="others" label="Other Models" default>
        1. [Download Ollama](https://ollama.com/download). 
        2. In a terminal, run any [model supporting tool-calling](https://ollama.com/search?c=tools)

          Example:

          ```sh
          ollama run qwen2.5
          ```

        3. In a separate terminal window, configure with Goose:

          ```sh
          goose configure
          ```

        4. Choose to `Configure Providers`

        ```
        ┌   goose-configure 
        │
        ◆  What would you like to configure?
        │  ● Configure Providers (Change provider or update credentials)
        │  ○ Toggle Extensions 
        │  ○ Add Extension 
        └  
        ```

        5. Choose `Ollama` as the model provider

        ```
        ┌   goose-configure 
        │
        ◇  What would you like to configure?
        │  Configure Providers 
        │
        ◆  Which model provider should we use?
        │  ○ Anthropic 
        │  ○ Databricks 
        │  ○ Google Gemini 
        │  ○ Groq 
        │  ● Ollama (Local open source models)
        │  ○ OpenAI 
        │  ○ OpenRouter 
        └  
        ```

        6. Enter the host where your model is running

        :::info Endpoint
        For Ollama, if you don't provide a host, we set it to `localhost:11434`. 
        When constructing the URL, we prepend `http://` if the scheme is not `http` or `https`. 
        If you're running Ollama on a different server, you'll have to set `OLLAMA_HOST=http://{host}:{port}`.
        :::
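
        For example, if Ollama runs on another machine on your network (the address below is a placeholder):

        ```sh
        export OLLAMA_HOST=http://192.168.1.50:11434
        goose configure
        ```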

        ```
        ┌   goose-configure 
        │
        ◇  What would you like to configure?
        │  Configure Providers 
        │
        ◇  Which model provider should we use?
        │  Ollama 
        │
        ◆  Provider Ollama requires OLLAMA_HOST, please enter a value
        │  http://localhost:11434
        └
        ```


        7. Enter the model you have running

        ```
        ┌   goose-configure 
        │
        ◇  What would you like to configure?
        │  Configure Providers 
        │
        ◇  Which model provider should we use?
        │  Ollama 
        │
        ◇  Provider Ollama requires OLLAMA_HOST, please enter a value
        │  http://localhost:11434
        │
        ◇  Enter a model from that provider:
        │  qwen2.5
        │
        ◇  Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together!
        │
        └  Configuration saved successfully
        ```

        :::tip Context Length
        If you notice that Goose is having trouble using extensions or is ignoring [.goosehints](/docs/guides/using-goosehints), it is likely that the model's default context length of 4096 tokens is too low. Set the `OLLAMA_CONTEXT_LENGTH` environment variable to a [higher value](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size).
        :::
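
        For example, to restart the Ollama server with a larger context window (8192 here is an arbitrary choice):

        ```sh
        OLLAMA_CONTEXT_LENGTH=8192 ollama serve
        ```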
        
      </TabItem>
    </Tabs>
  </TabItem>
  <TabItem value="docker" label="Docker Model Runner">
    1. [Get Docker](https://docs.docker.com/get-started/get-docker/)
    2. [Enable Docker Model Runner](https://docs.docker.com/ai/model-runner/#enable-dmr-in-docker-desktop)
    3. [Pull a model](https://docs.docker.com/ai/model-runner/#pull-a-model), for example, from Docker Hub [AI namespace](https://hub.docker.com/u/ai), [Unsloth](https://hub.docker.com/u/unsloth), or [from HuggingFace](https://www.docker.com/blog/docker-model-runner-on-hugging-face/)

    Example:

    ```sh
    docker model pull hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k
    ```

    4. Configure Goose to use Docker Model Runner, using the OpenAI API compatible endpoint: 

    ```sh
    goose configure
    ```

    5. Choose to `Configure Providers`

    ```
    ┌   goose-configure 
    │
    ◆  What would you like to configure?
    │  ● Configure Providers (Change provider or update credentials)
    │  ○ Toggle Extensions 
    │  ○ Add Extension 
    └  
    ```

    6. Choose `OpenAI` as the model provider: 

    ```
    ┌   goose-configure
    │
    ◇  What would you like to configure?
    │  Configure Providers
    │
    ◆  Which model provider should we use?
    │  ○ Anthropic
    │  ○ Amazon Bedrock
    │  ○ Claude Code
    │  ● OpenAI (GPT-4 and other OpenAI models, including OpenAI compatible ones)
    │  ○ OpenRouter
    ```

    7. Configure Docker Model Runner endpoint as the `OPENAI_HOST`: 

    ```
    ┌   goose-configure
    │
    ◇  What would you like to configure?
    │  Configure Providers
    │
    ◇  Which model provider should we use?
    │  OpenAI
    │
    ◆  Provider OpenAI requires OPENAI_HOST, please enter a value
    │  https://api.openai.com (default)
    └
    ```

    The default host-side port for Docker Model Runner is 12434, so the `OPENAI_HOST` value could be `http://localhost:12434`.

    8. Configure the base path: 

    ```
    ◆  Provider OpenAI requires OPENAI_BASE_PATH, please enter a value
    │  v1/chat/completions (default)
    └
    ```

    Docker Model Runner uses `/engines/llama.cpp/v1/chat/completions` as the base path.
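
    You can verify the endpoint and the exact model name before finishing configuration. This assumes Docker Model Runner's host-side TCP access is enabled on the default port and that it exposes the OpenAI-compatible models route:

    ```sh
    curl -s http://localhost:12434/engines/llama.cpp/v1/models
    ```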

    9. Finally, enter the model you pulled into Docker Model Runner (`hf.co/unsloth/gemma-3n-e4b-it-gguf:q6_k`) instead of the `gpt-4o` default shown in the prompt:

    ```
    │
    ◇  Enter a model from that provider:
    │  gpt-4o
    │
    ◒  Checking your configuration...                                                                                                            
    └  Configuration saved successfully
    ```
  </TabItem>
</Tabs>



## Azure OpenAI Credential Chain

Goose supports two authentication methods for Azure OpenAI:

1. **API Key Authentication** - Uses the `AZURE_OPENAI_API_KEY` for direct authentication
2. **Azure Credential Chain** - Uses Azure CLI credentials automatically without requiring an API key

To use the Azure Credential Chain:
- Ensure you're logged in with `az login`
- Have appropriate Azure role assignments for the Azure OpenAI service
- Configure with `goose configure` and select Azure OpenAI, leaving the API key field empty

This method simplifies authentication and enhances security for enterprise environments.
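
A typical credential-chain setup looks like this (the resource and deployment names are placeholders):

```sh
az login
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"
export AZURE_OPENAI_DEPLOYMENT_NAME="your-deployment"
goose configure   # select Azure OpenAI and leave the API key field empty
```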

## Multi-Model Configuration

Beyond single-model setups, Goose supports [multi-model configurations](/docs/guides/multi-model/) that can use different models and providers for specialized tasks:

- **AutoPilot** - Intelligent, context-aware switching between specialized models based on conversation content and complexity
- **Lead/Worker Model** - Automatic switching between a lead model for initial turns and a worker model for execution tasks
- **Planning Mode** - Manual planning phase using a dedicated model to create detailed project breakdowns before execution

---

If you have any questions or need help with a specific provider, feel free to reach out to us on [Discord](https://discord.gg/block-opensource) or on the [Goose repo](https://github.com/block/goose).


[providers]: /docs/getting-started/providers
[function-calling-leaderboard]: https://gorilla.cs.berkeley.edu/leaderboard.html