> ## Documentation Index
> Fetch the complete documentation index at: https://docs.jacobpevans.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Bifrost AI gateway

> OpenAI-compatible HTTP gateway that routes AI requests from local tools to the right provider — OpenAI, Gemini, OpenRouter, or local MLX inference.

> One endpoint for every AI tool. Bifrost handles the routing.

Bifrost is the OpenAI-compatible HTTP gateway that sits between every AI tool on the workstation and whichever provider eventually answers the call. It exposes `http://localhost:30080/v1/chat/completions` and fans out to OpenAI, Gemini, OpenRouter, and the local MLX server based on the task class.

* **GitHub:** [https://github.com/maximhq/bifrost](https://github.com/maximhq/bifrost)
* **Homepage:** [https://www.getmaxim.ai/bifrost](https://www.getmaxim.ai/bifrost)

## Model routing conventions

Never hardcode model identifiers in committed config. Models change frequently; identifiers rot. Tools resolve task classes (Research, Coding, Review, Pre-commit) to a current model at call time via `listmodels`.

| Context                          | Format                                                 |
| -------------------------------- | ------------------------------------------------------ |
| Local MLX models through Bifrost | `mlx-local/<model>` — Bifrost expects `provider/model` |
| Direct vllm-mlx on port 11434    | bare HuggingFace model ID — no prefix                  |
| Cloud models through Bifrost     | unprefixed — Bifrost routes by task class              |

## Local-only mode

When `localOnlyMode` is enabled or the `--local` flag is passed, every request routes to the MLX inference server on port 11434. No cloud API calls occur.

Verify the LaunchAgent is running before enabling local-only mode:

```bash theme={null}
launchctl list | grep vllm-mlx
```

## Priority in the AI gateway stack

Bifrost is the second layer in the gateway priority order:

1. **Anthropic official** — Claude Code plugins, skills, patterns
2. **Bifrost AI gateway** — multi-provider routing at `localhost:30080`
3. **Personal or custom** — only when no alternative exists

## Capabilities

Bifrost supports 23+ providers including OpenAI, Anthropic, AWS Bedrock, Google Vertex, and local inference servers. Key features:

* **Intelligent failover** — transparent routing to a configured fallback when a provider is unavailable
* **Semantic caching** — caches responses by semantic similarity, reducing cost and latency
* **MCP support** — Model Context Protocol integration for multi-tool coordination
* **Prometheus metrics** — built-in observability for latency, throughput, and cost tracking

Performance at scale: \<100 µs gateway overhead at 5,000 RPS.

## Deployment

Bifrost runs locally as a lightweight gateway process. Options:

```bash theme={null}
npx bifrost@latest       # 30-second startup via NPX
```

Docker containers and a Go SDK are also available for embedded or orchestrated deployments.

## See also

* [AI development pipeline](/architecture/ai-pipeline) — how Bifrost fits into the full model-routing pipeline
* [nix-ai](/nix/nix-ai) — Nix package and config layer that manages the Bifrost process
