Introduction
Leashly is an AI cost-control proxy that sits between your application and any LLM provider. It enforces spend caps, rate limits, and prompt-injection protection, and the only change your application needs is a single environment variable.
What does Leashly do?
When your app makes a request to an LLM like GPT-4 or Claude, that request goes through Leashly first. Leashly checks your configured rules — spend caps, rate limits, injection filters — and either forwards the request to the provider or blocks it with a clean error response.
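The check-then-forward-or-block flow described above can be sketched roughly as follows. This is a minimal illustration and not Leashly's actual implementation: the names `ProxyRequest`, `Decision`, `evaluate`, `spend_cap_rule`, and `injection_rule` are hypothetical, and the cost estimate and phrase list are assumed inputs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProxyRequest:
    """Hypothetical shape of a proxied request (not Leashly's real schema)."""
    proxy_key: str
    prompt: str
    estimated_cost_usd: float


@dataclass
class Decision:
    allowed: bool
    reason: Optional[str] = None


# A rule returns None to allow the request, or a reason string to block it.
Rule = Callable[[ProxyRequest], Optional[str]]


def spend_cap_rule(spent_so_far_usd: float, cap_usd: float) -> Rule:
    """Block when this request would push spend past the cap."""
    def check(request: ProxyRequest) -> Optional[str]:
        if spent_so_far_usd + request.estimated_cost_usd > cap_usd:
            return "spend cap exceeded"
        return None
    return check


def injection_rule(blocked_phrases: list[str]) -> Rule:
    """Block when the prompt contains any phrase from an assumed blocklist."""
    def check(request: ProxyRequest) -> Optional[str]:
        text = request.prompt.lower()
        for phrase in blocked_phrases:
            if phrase in text:
                return "possible prompt injection"
        return None
    return check


def evaluate(request: ProxyRequest, rules: list[Rule]) -> Decision:
    """Run rules in order; the first rule that objects blocks the request."""
    for rule in rules:
        reason = rule(request)
        if reason is not None:
            return Decision(allowed=False, reason=reason)
    # No rule objected, so the request would be forwarded to the provider.
    return Decision(allowed=True)
```

In this sketch a blocked request carries a short reason string, which a proxy could surface to the caller in its error response.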
Cost control
Daily, weekly, or monthly spend caps per key or account
Rate limiting
Per-minute and per-hour request throttling
Injection protection
Blocks 50+ known jailbreak and extraction patterns
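As an illustration of how per-minute or per-hour throttling can work, here is a minimal sliding-window limiter. It is a sketch under assumptions: the source does not say which algorithm Leashly uses, and the class name `SlidingWindowLimiter` and its parameters are hypothetical. The caller passes the current time explicitly so the behavior is easy to follow.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds-long window."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        # Drop accepted requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False  # window is full, so throttle this request
        self._timestamps.append(now)
        return True
```

A per-minute limit would use `window_seconds=60` and a per-hour limit `window_seconds=3600`, with one limiter instance per key.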
Key concepts
Proxy key — A Leashly-issued key (prefixed lsh_) that your app uses instead of your real provider API key. Leashly maps this to your real key server-side.
Rules — Configurable policies applied to every proxied request. Three types: spend caps, rate limits, and injection filters.
Alerts — Email notifications triggered when a rule threshold is reached.
Request log — A record of every proxied request including tokens, cost, duration, model, and provider.
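To make the proxy key and request log concepts concrete, the sketch below shows one possible shape for each. Everything here is assumed for illustration: `KEY_TABLE`, `resolve_proxy_key`, `RequestLogEntry`, the field names and units, and the sample key values are hypothetical and do not reflect Leashly's actual storage or schema. Only the `lsh_` prefix and the list of logged fields come from the descriptions above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical server-side table: proxy key -> (provider, real provider API key).
KEY_TABLE = {
    "lsh_demo123": ("openai", "sk-real-provider-key"),
}


def resolve_proxy_key(proxy_key: str) -> Optional[tuple[str, str]]:
    """Return (provider, real_key) for a known proxy key, else None."""
    if not proxy_key.startswith("lsh_"):
        return None  # not a Leashly-style proxy key
    return KEY_TABLE.get(proxy_key)


@dataclass(frozen=True)
class RequestLogEntry:
    """One proxied request, with the fields the request log is said to record."""
    proxy_key: str
    provider: str
    model: str
    tokens: int
    cost_usd: float      # assumed unit
    duration_ms: float   # assumed unit
```

Keeping the real provider key only in the server-side table means the application never needs to hold it.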