Leashly/Docs
docsIntroduction

Introduction

Leashly is an AI cost control proxy that sits between your application and any LLM provider. It enforces spend caps, rate limits, and prompt injection protection without requiring any changes to your application code beyond a single environment variable.

What does Leashly do?

When your app makes a request to an LLM like GPT-4 or Claude, that request goes through Leashly first. Leashly checks your configured rules — spend caps, rate limits, injection filters — and either forwards the request to the provider or blocks it with a clean error response.

Cost control

Daily, weekly, monthly spend caps per key or account

Rate limiting

Per-minute, per-hour request throttling

Injection protection

Blocks 50+ known jailbreak and extraction patterns

Key concepts

Proxy key — A Leashly-issued key (prefixed lsh_) that your app uses instead of your real provider API key. Leashly maps this to your real key server-side.

Rules — Configurable policies applied to every proxied request. Three types: spend caps, rate limits, and injection filters.

Alerts — Email notifications triggered when a rule threshold is reached.

Request log — A record of every proxied request including tokens, cost, duration, model, and provider.

Your real API keys are never exposed to your frontend or logged anywhere. They are stored encrypted with AES-256 and only decrypted inside the proxy at request time.
© 2025 Leashly