Welcome to the Metriqual blog. This is where we’ll write about what we’re building, the hard parts of running AI in production, and how to ship faster without drowning in provider plumbing. Since this is the first post, let’s start with the obvious question: what is Metriqual, and why does it exist?
The problem: shipping the AI feature is easy. The infrastructure isn’t.
Getting a prototype working with one model takes an afternoon. Getting it production-grade is where teams lose weeks. The moment real traffic shows up, you inherit a second job nobody scoped:
- A different SDK, auth scheme, and error shape for every provider — OpenAI, Anthropic, Gemini, Mistral, MiniMax.
- Rate limits and outages that take your feature down with them, so you need failover logic.
- No idea who’s spending what, until the bill arrives — so you need per-key spend controls.
- Raw prompts and responses flowing through with no PII protection, logging, or audit trail.
- Zero observability into latency, tokens, and cost per request.
None of that is your product. It’s tax. Metriqual is the layer that pays it for you.
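To make the failover item concrete, here is a minimal sketch of the kind of logic teams end up hand-rolling. It is an illustration under stated assumptions, not Metriqual’s implementation: `ProviderError`, `complete_with_failover`, and the two fake provider functions are hypothetical names invented for this example.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical stand-in for a rate-limit or outage error from a provider."""


def complete_with_failover(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in order; return (name, response) from the first that succeeds."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")  # remember why this one failed
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Fake providers for illustration only; real ones would wrap each vendor's SDK.
def flaky_primary(prompt: str) -> str:
    raise ProviderError("429 rate limited")


def healthy_backup(prompt: str) -> str:
    return f"echo: {prompt}"


used, text = complete_with_failover(
    "hello", [("primary", flaky_primary), ("backup", healthy_backup)]
)
print(used, text)
```

Even this toy version leaves out retries, backoff, and per-provider request translation, which is where most of the real effort goes.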
What Metriqual is
One key gives your team every AI modality — text, images, audio, video, music, voice cloning — across every major provider, with automatic failover, spend controls, and full observability. One integration. Production-grade from day one.
Metriqual is an AI gateway. Your application talks to one endpoint with one key; we handle routing to the underlying providers, failing over when one degrades, enforcing your limits and filters, and recording everything so you can actually see what’s happening.
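As a rough picture of what “recording everything” can mean, the sketch below logs latency, tokens, and cost for each request and rolls them up per key. The `UsageLog` and `RequestRecord` names, the fields, and the price are all assumptions made for illustration; they do not describe Metriqual’s actual data model.

```python
import time
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    key: str
    model: str
    latency_s: float
    tokens: int
    cost_usd: float


class UsageLog:
    """In-memory log; a real gateway would persist this somewhere queryable."""

    def __init__(self) -> None:
        self.records: list[RequestRecord] = []

    def record(self, key: str, model: str, started: float, tokens: int,
               usd_per_1k_tokens: float) -> RequestRecord:
        latency = time.perf_counter() - started
        cost = tokens / 1000 * usd_per_1k_tokens
        rec = RequestRecord(key, model, latency, tokens, cost)
        self.records.append(rec)
        return rec

    def totals_by_key(self) -> dict:
        totals: dict = defaultdict(lambda: {"requests": 0, "tokens": 0, "cost_usd": 0.0})
        for rec in self.records:
            bucket = totals[rec.key]
            bucket["requests"] += 1
            bucket["tokens"] += rec.tokens
            bucket["cost_usd"] += rec.cost_usd
        return dict(totals)


log = UsageLog()
started = time.perf_counter()
# ... the provider call would happen here ...
log.record("team-a", "gpt-4o", started, tokens=1000, usd_per_1k_tokens=0.5)  # made-up price
print(log.totals_by_key())
```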
What your engineering team stops building
- One OpenAI-compatible API — a drop-in replacement. Keep your existing OpenAI SDK; change the base URL.
- Multi-model routing — reach OpenAI, Anthropic, Gemini, Mistral, and MiniMax through a single interface, across text, image, audio, video, music, and voice.
- Automatic failover — when a provider hits a limit or goes down, requests reroute seamlessly instead of erroring out.
- Proxy keys & spend controls — issue scoped keys per app, team, or user with usage limits baked in.
- Content filtering — redact or block emails, SSNs, credit cards, and other PII on the way in and the way out.
- Production observability — cost, tokens, and latency tracked per key and per request.
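For a sense of what the content-filtering item involves, here is a deliberately simple redaction sketch. The regular expressions and the `redact` helper are illustrative assumptions, not Metriqual’s filter; production PII detection needs far more care (validation, locale formats, false positives).

```python
import re

# Simplified patterns for illustration only.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def redact(text: str) -> str:
    """Replace each match with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("Mail jane@example.com, SSN 123-45-6789"))
```

Running the same function on both the prompt and the model’s response is what “on the way in and the way out” amounts to in this toy version.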
The drop-in: change one line
Because the API is OpenAI-compatible, you don’t rewrite your app. Point your existing client at Metriqual and you instantly get routing, failover, limits, and logging behind it:
from openai import OpenAI

client = OpenAI(
    api_key="mql-your-proxy-key",  # your Metriqual key
    base_url="https://api.metriqual.com/v1",  # the only change
)

resp = client.chat.completions.create(
    model="gpt-4o",  # swap to claude, gemini, mistral… same call
    messages=[{"role": "user", "content": "Hello from Metriqual"}],
)
print(resp.choices[0].message.content)
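Since only the `model` string changes between providers, the call can be wrapped in one small helper. The sketch below is a hedged illustration: `ask` and `make_stub_client` are hypothetical names, the stub client exists only so the example runs offline without a key, and the second model ID is a placeholder rather than a confirmed model name.

```python
from types import SimpleNamespace


def ask(client, model: str, prompt: str) -> str:
    """Send one user message and return the reply text; `client` is any OpenAI-style client."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def make_stub_client():
    """Offline stand-in that mimics the response shape used above."""
    def create(model, messages):
        reply = f"[{model}] {messages[-1]['content']}"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

    completions = SimpleNamespace(create=create)
    return SimpleNamespace(chat=SimpleNamespace(completions=completions))


client = make_stub_client()  # in real use: the OpenAI(...) client configured above
# "gpt-4o" comes from the example above; the second ID is a placeholder.
for model in ["gpt-4o", "your-claude-model-id"]:
    print(ask(client, model, "Hello from Metriqual"))
```

In real use you would pass the `OpenAI(...)` client pointed at the Metriqual base URL instead of the stub; the helper itself would not change.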
Prefer a native SDK? There’s @metriqual/sdk for TypeScript/JavaScript and a Python SDK, both with streaming, A/B testing, and feedback built in.
How it works
- Add your providers — bring your API keys and set usage limits.
- Configure routing & filters — pick models and turn on optional PII filtering.
- Create a proxy key — scoped, with its own spend cap.
- Point your app at it — one base URL, and you’re production-grade.
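To illustrate what a spend cap on a proxy key means in practice, here is a small in-memory model. The `ProxyKey` class, its fields, and `SpendCapExceeded` are assumptions invented for this sketch, not Metriqual’s API; amounts are kept in integer cents to avoid floating-point drift.

```python
from dataclasses import dataclass


class SpendCapExceeded(Exception):
    """Raised when a request would push a key past its cap."""


@dataclass
class ProxyKey:
    name: str
    cap_cents: int
    spent_cents: int = 0

    def charge(self, cost_cents: int) -> int:
        """Record a request's cost and return the remaining budget in cents."""
        if self.spent_cents + cost_cents > self.cap_cents:
            raise SpendCapExceeded(f"{self.name}: cap of {self.cap_cents} cents reached")
        self.spent_cents += cost_cents
        return self.cap_cents - self.spent_cents


key = ProxyKey(name="team-a", cap_cents=100)
print(key.charge(60))  # 40 cents of budget left
try:
    key.charge(50)  # 60 + 50 would exceed the 100-cent cap
except SpendCapExceeded as exc:
    print("blocked:", exc)
```

The rejected request leaves the recorded spend untouched, so the key keeps working for smaller requests until the cap is actually reached.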
What we’ll cover here
This blog is for the people doing the work: engineers and teams putting AI in front of real users. Expect deep dives on gateway performance and failover, cost-optimization playbooks, observability patterns, and notes on what we’re shipping. No fluff — the stuff we wish we’d read before building this ourselves.
Get started
You can have a proxy key and your first routed request in a few minutes.
Welcome aboard. — The Metriqual team
