frugal

Route every LLM call to the cheapest model that won't compromise quality.

Frugal wraps any command with a local OpenAI-compatible proxy, classifies each request, and routes to the cheapest model that clears the quality bar. No account. No code changes. One command.

$ curl -fsSL https://frugal.sh/install | sh
$ frugal python my_app.py
# frugal starts a proxy, sets OPENAI_BASE_URL, runs your command,
# and shuts down when it exits. Your app doesn't change.
No account · No code changes · Keys stay local · Signed releases

The insight

You're paying for capability you don't use on 60–80% of your LLM calls. A creative brainstorm doesn't need o3. A simple extraction doesn't need claude-opus. Frugal classifies each request — complexity, domain, capabilities required — and picks the cheapest model that clears the quality bar.

Classifier
Detects code, math, tool-use, JSON output, conversation depth, and domain. Combines them into a single complexity score from 0.0 to 1.0.
Router
Per-quality-tier thresholds on reasoning, coding, creative, and instruction-following. Cheapest model that clears all thresholds wins.
Fallback
Routed model errors? Walk a caller-supplied chain or drop to a looser quality tier. Bounded retries so latency stays predictable.
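The classify-then-route shape above can be sketched in a few lines. Everything concrete here is invented for illustration (the signals, the model names, the prices, the per-tier thresholds); only the overall rule — score the request, then pick the cheapest model that clears the bar — comes from this page:

```python
import re

# Toy per-tier quality bars and model table. All numbers are invented
# for illustration; Frugal's real data comes from its pricing sync.
THRESHOLDS = {"cost": 0.0, "balanced": 0.5, "high": 0.8}
MODELS = [  # (name, relative price, quality score)
    ("nano", 1, 0.45),
    ("mid", 5, 0.70),
    ("flagship", 25, 0.93),
]

def classify(prompt: str) -> float:
    """Combine crude signals into a 0.0-1.0 complexity score."""
    signals = [
        bool(re.search(r"```|def |class ", prompt)),      # looks like code
        bool(re.search(r"\d+\s*[-+*/^]\s*\d+", prompt)),  # looks like math
        len(prompt) > 500,                                # long context
    ]
    return sum(signals) / len(signals)

def route(prompt: str, tier: str = "balanced") -> str:
    """Cheapest model whose quality clears the request's bar."""
    bar = max(THRESHOLDS[tier], classify(prompt))
    viable = [(price, name) for name, price, quality in MODELS if quality >= bar]
    # No model clears the bar: fall back to the strongest one.
    return min(viable)[1] if viable else MODELS[-1][0]
```

The real router scores four dimensions (reasoning, coding, creative, instruction-following) rather than one scalar, but the selection rule — cheapest model that clears every threshold — is the same.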

How it works

frugal python app.py
       │
       ├─ starts proxy on a free port
       ├─ injects OPENAI_BASE_URL into your command's environment
       ├─ classifies each request (complexity, domain, capabilities)
       ├─ routes to cheapest model that clears the quality bar
       └─ shuts down proxy when your command exits

Works with any OpenAI-compatible SDK

Frugal speaks the OpenAI chat-completions API. Point your existing SDK at the proxy and nothing else changes.

Python
# unchanged from before frugal
from openai import OpenAI
client = OpenAI()
resp = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "hi"}],
)
Node
// unchanged from before frugal
import OpenAI from "openai";
const client = new OpenAI();
const resp = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "hi" }],
});
curl
curl "$OPENAI_BASE_URL/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model":"auto","messages":[{"role":"user","content":"hi"}]}'
Go
client := openai.NewClient(os.Getenv("OPENAI_API_KEY"))
resp, err := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
    Model: "auto",
    Messages: []openai.ChatCompletionMessage{{
        Role: openai.ChatMessageRoleUser, Content: "hi",
    }},
})
if err != nil {
    log.Fatal(err)
}

Three quality tiers — pick per request

Send X-Frugal-Quality: cost for a quick extraction, or high for your agent's planner step. The default, balanced, covers everything else.

high

Top-tier models only. Use for planners, complex reasoning, novel code.

X-Frugal-Quality: high

cost

Cheapest viable model. Classification, extraction, simple summaries.

X-Frugal-Quality: cost
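In code, the tier is just a per-request header. This helper is a hypothetical convenience — only the header name and the three tier values come from this page; with the openai-python SDK the result would be passed through the standard extra_headers option:

```python
def quality_headers(tier: str = "balanced") -> dict:
    """Build the per-request routing header for a chosen quality tier."""
    assert tier in {"cost", "balanced", "high"}  # tiers from this page
    return {"X-Frugal-Quality": tier}

# With the openai SDK, attach it to a single call, e.g.:
# client.chat.completions.create(
#     model="auto",
#     messages=[{"role": "user", "content": "extract the date"}],
#     extra_headers=quality_headers("cost"),
# )
```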

Models routed today

Pricing synced from models.dev on every startup. Add more by editing ~/.frugal/config/models.yaml.

OpenAI

GPT-4o · GPT-4o-mini · GPT-4.1 · GPT-4.1-mini · GPT-4.1-nano

Anthropic

Claude Opus 4 · Claude Sonnet 4 · Claude Haiku 3.5

Google

Gemini 2.5 Pro · Gemini 2.5 Flash · Gemini 2.0 Flash

Install

One binary, ~10 MB. Detects your API keys. Adds frugal to your PATH. Release artifacts are signed with cosign; the installer verifies the checksum before moving the binary into place.

$ curl -fsSL https://frugal.sh/install | sh

From source

$ git clone https://github.com/brainsparker/frugal.git && cd frugal && make build

Run as a server

$ export FRUGAL_AUTH_TOKEN=$(openssl rand -hex 16)
$ frugal serve
$ export OPENAI_BASE_URL=http://localhost:8080/v1

When binding to any address other than 127.0.0.1, Frugal refuses to start without an auth token (override with FRUGAL_ALLOW_UNAUTH=1). Prometheus metrics are served at /metrics. Full env-var reference in the README.

Built to self-host

Your keys stay local
Provider API keys never leave your machine. Frugal reads them from the environment and forwards requests upstream; there is no control plane.
Signed releases
Every binary is cosign-signed (keyless, GitHub OIDC). The installer verifies the checksum and, when cosign is present, the signature before executing anything.
Observable by default
Structured logs, X-Request-ID propagation, Prometheus /metrics, and /v1/routing/explain to see exactly which model was picked and why.
Source-available, BUSL 1.1
Self-hosted and internal commercial use is permitted. Each version converts to Apache 2.0 four years after release.