Leashly sits between your app and any LLM provider. Add spend caps, rate limits, and prompt-injection protection with a single env var change.
No credit card required · 5 minute setup · Works with OpenAI, Anthropic, Gemini
```javascript
// Before
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

// After — that's it.
const openai = new OpenAI({
  apiKey: "lsh_xxxxxxxxxxxx",
  baseURL: "https://api.leashly.dev/proxy"
})
```
There are no guardrails between your app and the LLM API. One misconfigured feature, one abusive user, or one runaway script — and your next invoice is unrecognizable.
No rate limits means no friction for abuse. No spend caps means no ceiling on damage. No attribution means no idea which user, feature, or bug caused it.
The same interface your SDK already uses. Zero refactoring.
Set spend caps per user, per day, and per model. Rate limits that actually work. An injection filter that catches attacks before they reach the model.
Every token, every request, every dollar — attributed to the exact user, feature, and model that spent it. No more mystery invoices.
Change one environment variable. Leashly is fully compatible with the OpenAI SDK. Your app doesn't know the difference.
Built for production from day one.
Daily, weekly, monthly limits per key or per user. Block or alert when thresholds are hit.
Per-minute and per-hour throttling using a token-bucket algorithm, applied per account, key, or IP.
Blocks 50+ known jailbreak and extraction patterns. Three sensitivity levels.
See exactly which user and feature is burning money. Full model breakdown.
Email and in-app notifications when spend thresholds or rate limits are hit.
Every request logged with tokens, cost, duration, model, and flag reason.
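For intuition, the token-bucket throttling described above works like this: each key holds a bucket that refills at a steady rate up to a burst capacity, and a request is allowed only if a token is available. A minimal sketch (illustrative only, not Leashly's actual implementation):

```typescript
// Minimal token bucket: allows bursts up to `capacity`,
// sustains `refillPerSec` requests per second after that.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,     // max burst size
    private refillPerSec: number, // sustained request rate
    now: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if the request is allowed, false if throttled.
  tryConsume(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A bucket with capacity 2 and a refill rate of 1/sec lets two requests through immediately, rejects a third, and admits one more a second later.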
One line change. Drop-in compatible.
```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.LEASHLY_KEY, // your lsh_xxx key
  baseURL: 'https://api.leashly.dev/proxy',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
```
Pays for itself in week one.
No. The proxy runs in the same region as your LLM provider. Typical overhead is under 5ms.
Yes. Keys are encrypted at rest with AES-256. We never log or expose them in any response.
Yes. Leashly fully supports server-sent events (SSE) streaming responses, passing them through transparently.
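To illustrate what "passing through transparently" means: the proxied stream arrives as standard OpenAI-style SSE events, so any existing parsing code keeps working. A small sketch of reassembling the text from such a stream (the chunk shape is the standard OpenAI streaming format; the helper name is ours):

```typescript
// Reassemble reply text from an OpenAI-style SSE stream body.
// Each event line looks like:
//   data: {"choices":[{"delta":{"content":"Hi"}}]}
function collectSseDeltas(sseBody: string): string {
  let text = '';
  for (const line of sseBody.split('\n')) {
    // Skip non-data lines and the terminating [DONE] sentinel.
    if (!line.startsWith('data: ') || line.trim() === 'data: [DONE]') continue;
    const chunk = JSON.parse(line.slice('data: '.length));
    text += chunk.choices?.[0]?.delta?.content ?? '';
  }
  return text;
}
```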
OpenAI, Anthropic, Google Gemini, and any OpenAI-compatible endpoint. Add custom endpoints in the dashboard.
Leashly returns a 429 with a clear JSON error: `{ error: { message: 'Daily spend cap exceeded', type: 'rate_limit_error' } }`. Your app gets a clean error to handle.
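A sketch of detecting that case in app code — the error body shape is as documented above; the interface and helper names are ours:

```typescript
// Shape of the 429 body returned when a cap or rate limit is hit.
interface ProxyErrorBody {
  error: { message: string; type: string };
}

// True when a response indicates a spend cap or rate limit was exceeded.
function isCapExceeded(status: number, body: unknown): boolean {
  if (status !== 429) return false;
  const err = (body as ProxyErrorBody)?.error;
  return err?.type === 'rate_limit_error';
}
```

In practice you would check the response status, parse the JSON body, and back off or surface a friendly message when `isCapExceeded` returns true.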
Yes. Leashly is open-source. Deploy it on Vercel, Railway, or any Node.js host in minutes with a single .env change.
Free forever for indie devs. No credit card required.
Create your free account →