subhashladumor1/laravel-ai-guard
ๆๆฐ็จณๅฎ็ๆฌ๏ผ1.0.2
Composer ๅฎ่ฃ ๅฝไปค๏ผ
composer require subhashladumor1/laravel-ai-guard
ๅ ็ฎไป
Laravel AI Guard ๐ก๏ธ โ AI cost & budget control for Laravel AI SDK. Track token usage, control OpenAI & LLM spending, enforce AI budgets, and prevent unexpected billing spikes.
ๅ ณ้ฎๅญ๏ผ
README ๆๆกฃ
README
Track costs โข Set budgets โข Never get surprised by the bill.
Laravel AI Guard is a powerful AI cost optimization package built for the Laravel AI SDK (12.x) ๐. It helps Laravel developers track OpenAI & LLM token usage ๐, estimate AI costs before execution โ ๏ธ, enforce per-user or per-tenant AI budgets ๐งพ, and prevent unexpected AI billing spikes ๐ฅ in production.
Designed for Laravel SaaS applications, APIs, and AI-powered platforms, Laravel AI Guard acts as a financial firewall ๐ก๏ธ between your app and AI providersโkeeping AI usage safe, predictable, and cost-efficient ๐ธ.
---๐ Quick Navigation
| Jump to | Jump to |
|---|---|
| What's Inside | How It Works |
| Quick Start | Usage Examples |
| Configuration | Package Structure |
โจ What's Inside
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ โ
โ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โ
โ โ TRACK โ โ BUDGET โ โ ESTIMATE โ โ BLOCK โ โ
โ โ Every call โ โ Per user/ โ โ Before you โ โ Over-spend โ โ
โ โ in DB โ โ tenant/app โ โ call (free) โ โ requests โ โ
โ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ ๐จ KILL SWITCH โ โ
โ โ Disable all AI โ โ
โ โ in one config change โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Works with: Laravel AI SDK (12.x) โข OpenAI โข Anthropic โข Any AI API
๐ How It Works
Request Flow (Before โ During โ After)
flowchart TD
subgraph BEFORE["๐ก๏ธ BEFORE"]
A[Request arrives] --> B{Budget OK?}
B -->|Yes| C[Optional: Estimate cost]
B -->|No| D[โ Block - 402]
C --> E[Continue]
end
subgraph DURING["โก DURING"]
E --> F[Your app calls AI]
F --> G[Laravel AI SDK or any API]
end
subgraph AFTER["๐ AFTER"]
G --> H[Record tokens, cost, user]
H --> I[Save to ai_usages]
I --> J[Update ai_budgets]
end
BEFORE --> DURING --> AFTER
Loading
Budget Hierarchy (Checked in Order)
flowchart LR
subgraph layers["Budget layers checked top to bottom"]
direction TB
A["๐ GLOBAL<br/>Whole app limit"]
B["๐ข TENANT<br/>Org/team limit"]
C["๐ค USER<br/>Per-user limit"]
end
A --> B --> C
C --> D{All OK?}
D -->|Yes โ| E[Allow request]
D -->|Any exceeded โ| F[Block - 402]
Loading
TL;DR: Laravel AI SDK does the AI. Laravel AI Guard decides whether you're allowed to call and how much you spent. They work together.
๐ค Why Should I Care?
WITHOUT AI GUARD WITH AI GUARD
โโโโโโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ ๐ธ Surprise bill โ โ ๐ Full visibility โ
โ ๐ Runaway loop? โ โ โ ๐ Budget limits โ
โ ๐ฐ Invoice shock โ โ ๐ Predictable costs โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโโโโ
AI APIs charge by the token. One heavy user, one bugโand your bill spikes. Most apps don't track until the invoice arrives. AI Guard gives you visibility, limits, and control.
๐ Under the Hood
Cost Calculation
flowchart LR
subgraph inputs["Usage Inputs"]
A[Input Tokens]
B[Output Tokens]
C[Cache Hits/Writes]
D[Images/Audio/Video]
end
subgraph calculation["Calculation"]
E["Text Cost<br/>(Standard + Long Context)"]
F["Cache Cost<br/>(Read + Write)"]
G["Multimodal Cost<br/>(Pixel/Second/Token)"]
end
subgraph result["Total"]
H["Total Cost $"]
end
A --> E
B --> E
C --> F
D --> G
E --> H
F --> H
G --> H
Loading
Example: 500 input + 200 output tokens (gpt-4o: $0.0025/1k in, $0.01/1k out)
| Step | Calculation | Result |
|---|---|---|
| Input cost | (500 รท 1000) ร 0.0025 | $0.00125 |
| Output cost | (200 รท 1000) ร 0.01 | $0.00200 |
| Total | $0.00325 |
Cost Optimization (Context Caching) โก
Laravel AI Guard supports advanced pricing models including Context Caching (Anthropic, Gemini, OpenAI) to help you track savings accuracy.
Supported Pricing Dimensions:
- Input Tokens (Standard)
- Output Tokens (Standard)
- Cached Input Tokens (Read from cache โ typically ~50-90% cheaper)
- Cache Creation Tokens (Write to cache โ sometimes higher cost)
- Long context (e.g. >200k tokens โ premium
input_long/output_longrates) - Modality-specific: image tokens, audio tokens, per image, per second video, per minute transcription, TTS per 1M characters, web search per 1k calls, embeddings per 1k tokens
Configuration Example (config/ai-guard.php):
'claude-3-5-sonnet' => [ 'input' => 0.003, 'output' => 0.015, 'cache_write' => 0.00375, // +25% overhead 'cached_input' => 0.0003, // -90% savings ],
The package automatically detects cache usage from provider responses and applies the correct lower rate.
Supported Providers, Models & Cost Coverage ๐
Pricing is aligned with official 2026 API docs for maximum accurate cost calculation across Chat, Assistants, Agents, and modality-specific use cases.
| Provider | Pricing Source | Coverage |
|---|---|---|
| OpenAI | Pricing | GPT-5.x, GPT-4o, o1, Realtime (Audio/Text), DALLยทE 3, Whisper, TTS, Web Search |
| Google Gemini | Pricing | Gemini 3 Pro/Flash, 2.5 Pro/Flash, 1.5, Imagen 3, Veo (Video), Embeddings |
| Anthropic | Pricing | Claude 4.5, 3.5 Sonnet, 3 Opus, Haiku, Prompt Caching, Long Context |
| xAI Grok | Models | Grok 4, Grok 3, Grok Beta, Web Search Tool |
| Mistral AI | Pricing | Mistral Large 2, Small, Codestral, Embeddings |
| DeepSeek | Pricing | DeepSeek-V3, R1 (Reasoner), Cache Hit/Miss pricing |
Full Multimodal Cost Support:
- LLM / Chat: Input, Output, Cached Input, Cache Write, Long-Context pricing
- Agents: Web Search (per 1k calls), Code Interpreter (Session based)
- Audio:
- Input: Audio tokens (e.g. Gemini 2.5 Flash
audio_in, GPT-4oaudio_in) - Output: Audio tokens (e.g. GPT-4o
audio_out) - Transcription: Per minute (Whisper)
- TTS: Per 1M characters (OpenAI TTS)
- Input: Audio tokens (e.g. Gemini 2.5 Flash
- Video:
- Input: Video tokens (e.g. Gemini
video_in) - Generation: Per second (Veo
per_second_video)
- Input: Video tokens (e.g. Gemini
- Image:
- Input: Image tokens (e.g. GPT-4o
image_in) - Generation: Per image (DALLยทE 3, Imagen)
- Input: Image tokens (e.g. GPT-4o
- Embeddings: Per 1k tokens
Pass extended usage when recording to get accurate totals:
AIGuard::recordAndApplyBudget([ 'provider' => 'gemini', 'model' => 'gemini-2.5-flash', 'input_tokens' => 1000, 'output_tokens' => 200, 'usage' => [ 'input_tokens' => 1000, // Text tokens 'output_tokens' => 200, // Text output 'video_tokens_in' => 5000, // Video understanding tokens 'audio_tokens_in' => 2000, // Audio input tokens 'images_generated' => 1, // Image gen quantity 'web_search_calls' => 2, // Per-call tool usage ], 'user_id' => auth()->id(), ]);
Estimation (No API Call = No Cost)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ AIGuard::estimate($prompt) โ
โ โ
โ Input tokens โ characters รท 4 (configurable) โ
โ Output tokens โ input ร 0.5 (configurable) โ
โ โ
โ "Write a short poem" (18 chars) โ ~5 in, ~3 out โ 8 โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Kill Switch
| Method | How |
|---|---|
.env (recommended) |
AI_GUARD_DISABLED=true |
| Config | 'ai_disabled' => true |
Result: Middleware returns 503 Service Unavailable โ no AI calls get through.
๐ก 5 Ways to Reduce AI Costs
โ ESTIMATE โก BUDGET โข TRACK โฃ KILL SWITCH โค TAG
โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ
โ Show cost โ โ Set limits โ โ Run report โ โ Emergency โ โ Break down โ
โ before call โ โ per user/ โ โ to see โ โ stop all โ โ by feature โ
โ โ โ tenant โ โ where $ goesโ โ AI if neededโ โ (chat, etc) โ
โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ
๐ Requirements
| Requirement | Version |
|---|---|
| PHP | 8.1+ |
| Laravel | 10.x, 11.x, or 12.x |
| Laravel AI SDK | Optional (for agents/streaming) |
๐ Quick Start (3 Steps)
flowchart LR
subgraph step1["Step 1"]
A[composer require]
end
subgraph step2["Step 2"]
B[publish config<br/>& migrations]
end
subgraph step3["Step 3"]
C[migrate]
end
A --> B --> C
Loading
1. Install
composer require subhashladumor1/laravel-ai-guard
2. Publish & migrate
php artisan vendor:publish --tag=ai-guard-config php artisan vendor:publish --tag=ai-guard-migrations php artisan migrate
3. Optional โ translations
php artisan vendor:publish --tag=ai-guard-lang
creates: ai_usages (tracks every request & cost) + ai_budgets (stores current usage vs limit)
โ๏ธ Configuration
Edit config/ai-guard.php after publishing:
| Setting | Purpose |
|---|---|
ai_disabled |
Turn off all AI |
pricing |
Cost per 1k tokens per model |
default_model |
Fallback (e.g. gpt-4o) |
default_provider |
Fallback (e.g. openai) |
budgets |
Limits (global, user, tenant); period |
estimation |
Chars per token, output multiplier |
Example .env:
AI_GUARD_DISABLED=false AI_GUARD_GLOBAL_LIMIT=100 AI_GUARD_USER_LIMIT=10 AI_GUARD_TENANT_LIMIT=50
๐ Usage Examples
With Laravel AI SDK (12.x)
sequenceDiagram
participant App
participant AIGuard
participant AI
App->>AIGuard: checkAllBudgets()
App->>AIGuard: estimate(prompt)
App->>AI: prompt()
AI-->>App: response
App->>AIGuard: recordFromResponse()
Loading
// 1. Before โ check budget AIGuard::checkAllBudgets(auth()->id(), $tenantId); $estimate = AIGuard::estimate($userPrompt); // 2. Call AI (as normal) $response = (new YourAgent)->prompt($userPrompt); // 3. After โ record usage AIGuard::recordFromResponse($response, userId: auth()->id(), tenantId: $tenantId, tag: 'chat');
Multi-model: Pass model and provider so estimate and budgets use the right cost:
$estimate = AIGuard::estimate($userPrompt, model: 'gpt-4o-mini', provider: 'openai'); AIGuard::recordFromResponse($response, userId: auth()->id(), provider: 'openai', model: 'gpt-4o-mini');
Streaming: record in ->then() callback when stream finishes.
With Any Other AI API
// Before โ same AIGuard::checkAllBudgets(auth()->id(), $tenantId); // After โ record manually AIGuard::recordAndApplyBudget([ 'provider' => 'openai', 'model' => 'gpt-4o', 'input_tokens' => 400, 'output_tokens' => 250, 'user_id' => auth()->id(), 'tenant_id' => $tenantId, 'tag' => 'chat', ]);
Extended usage (audio, video, image, tools) โ pass a usage array for accurate cost when using modalities or tools:
AIGuard::recordAndApplyBudget([ 'provider' => 'openai', 'model' => 'gpt-4o', 'input_tokens' => 500, 'output_tokens' => 300, 'usage' => [ 'input_tokens' => 500, 'output_tokens' => 300, 'cached_input_tokens' => 0, 'images_generated' => 2, // DALLยทE / image models 'web_search_calls' => 5, // agent tool calls 'transcription_minutes' => 1.5, // Whisper / transcribe 'tts_characters' => 2500, // TTS 'embedding_tokens' => 1000, // embeddings 'video_seconds' => 10, // Veo / video gen ], 'user_id' => auth()->id(), 'tag' => 'agent-with-search', ]);
Multi-model and dynamic cost (no config change)
Cost is resolved in order: per-call override โ runtime pricing โ config. So you can support many models and change costs at runtime without editing config/ai-guard.php.
1. Per-call pricing override โ pass pricing for a single estimate or record:
// Estimate with custom cost per 1k tokens (no config entry needed) $estimate = AIGuard::estimate($userPrompt, 'my-model', 'my-provider', [ 'input' => 0.001, 'output' => 0.002, ]); // Record with custom pricing when cost isn't pre-calculated AIGuard::recordFromResponse($response, auth()->id(), $tenantId, 'openai', 'gpt-4o', 'chat', [ 'input' => 0.0025, 'output' => 0.01, ]); // record() can omit 'cost' and use 'pricing' to calculate AIGuard::record([ 'provider' => 'openai', 'model' => 'gpt-4o', 'input_tokens' => 400, 'output_tokens' => 250, 'pricing' => ['input' => 0.0025, 'output' => 0.01], 'user_id' => auth()->id(), ]);
2. Runtime pricing registry โ register models once (e.g. in a service provider or from DB); then estimate() and recording use them automatically:
$calc = AIGuard::getCostCalculator(); // Single model $calc->setPricing('openai', 'gpt-4o-mini', ['input' => 0.00015, 'output' => 0.0006]); // Many models at once $calc->setPricingMap([ 'openai' => [ 'gpt-4o' => ['input' => 0.0025, 'output' => 0.01], 'gpt-4o-mini' => ['input' => 0.00015, 'output' => 0.0006], ], 'anthropic' => [ 'claude-3-5-sonnet' => ['input' => 0.003, 'output' => 0.015], ], ]); // Now estimate/record use these models without config $estimate = AIGuard::estimate($userPrompt, 'gpt-4o-mini', 'openai'); AIGuard::checkAllBudgets(auth()->id(), $tenantId);
Add, update or remove models at runtime:
$calc = AIGuard::getCostCalculator(); // Add or update a model $calc->setPricing('openai', 'gpt-4o', ['input' => 0.0025, 'output' => 0.01]); // Remove a model from runtime (falls back to config, or 0 if not in config) $calc->removePricing('openai', 'gpt-4o'); // Clear all runtime pricing $calc->clearRuntimePricing();
Config file โ publish and edit config/ai-guard.php to add, remove or update models permanently:
'pricing' => [ 'openai' => [ 'gpt-4o' => ['input' => 0.0025, 'output' => 0.01], 'gpt-4o-mini' => ['input' => 0.00015, 'output' => 0.0006], // Add new models here ], // Add new providers here ],
Budget checks use the same cost you record (per user/tenant), so multi-model costs and budgets work together.
Middleware
Route::post('/chat', ChatController::class)->middleware('ai.guard');
| Condition | Response |
|---|---|
| Over budget | 402 + JSON |
| AI disabled | 503 |
Artisan Commands
| Command | Purpose |
|---|---|
php artisan ai-guard:report |
Usage & cost report |
php artisan ai-guard:report --period=month |
Monthly report |
php artisan ai-guard:report --days=7 |
Last 7 days |
php artisan ai-guard:reset-budgets |
Reset when period ends |
php artisan ai-guard:reset-budgets --dry-run |
Preview only |
Schedule reset: $schedule->command('ai-guard:reset-budgets')->daily();
๐๏ธ Package Structure
flowchart TB
subgraph entry["Entry Points"]
F[AIGuard Facade]
M[EnforceAIBudget Middleware]
C1[ai-guard:report]
C2[ai-guard:reset-budgets]
end
subgraph core["Core"]
GM[GuardManager]
end
subgraph services["Services"]
BR[BudgetResolver]
BE[BudgetEnforcer]
TE[TokenEstimator]
CC[CostCalculator]
end
subgraph storage["Storage"]
AU[AiUsage]
AB[AiBudget]
end
F --> GM
M --> GM
C1 --> GM
C2 --> GM
GM --> BR
GM --> BE
GM --> TE
GM --> CC
BR --> AB
BE --> AB
CC --> AU
Loading
laravel-ai-guard/
โโโ src/
โ โโโ GuardManager.php # Core logic
โ โโโ Facades/AIGuard.php
โ โโโ Budget/ # BudgetResolver, BudgetEnforcer
โ โโโ Cost/ # TokenEstimator, CostCalculator
โ โโโ Models/ # AiUsage, AiBudget
โ โโโ Middleware/
โ โโโ Commands/
โ โโโ Exceptions/
โโโ database/migrations/
โโโ lang/ # 11 locales
โโโ tests/
๐ Real-World Scenarios
1. The "Safe" Chatbot ๐ค (OpenAI + Laravel AI SDK)
Goal: Build a chatbot that users can't abuse to run up a huge bill. Safety Check: Estimate cost before the request.
use Subhashladumor1\LaravelAiGuard\Facades\AIGuard; use Illuminate\Http\Request; public function chat(Request $request) { $user = auth()->user(); $prompt = $request->input('message'); // 1๏ธโฃ Run budget check (throws overflow exception if user is over limit) AIGuard::checkAllBudgets($user->id, $user->team_id); // 2๏ธโฃ Estimate cost (OpenAI/Text is roughly 4 chars/token) // If the prompt is huge (e.g. paste-bin attack), stop it here. $estimatedCost = AIGuard::estimate($prompt, 'gpt-4o', 'openai'); if ($estimatedCost > 0.50) { return response()->json(['error' => 'Message too long/expensive.'], 400); } // 3๏ธโฃ Call AI (Laravel AI SDK simple example) $response = \AI::chat($prompt); // 4๏ธโฃ Record actual usage // Tracks input, output, and updates User + Tenant budgets AIGuard::recordFromResponse($response, $user->id, $user->team_id, 'openai', 'gpt-4o', 'chatbot'); return response()->json(['reply' => $response]); }
2. Video Analysis Agent ๐ฅ (Gemini 2.5) โ Multimodal
Goal: Analyze uploaded videos. Video processing is expensive per second.
Method: Use specific keys for video_seconds or video_tokens.
// User uploads a 30-second video clip $videoPath = $request->file('video')->store('videos'); // Call Gemini API (Direct HTTP / Google Client - No Laravel SDK) $geminiResponse = Http::post('https://generativelanguage.googleapis.com/...', [ // ... payload with video data ... ]); $result = $geminiResponse->json(); // ๐ก Record complex usage: AIGuard::recordAndApplyBudget([ 'provider' => 'gemini', 'model' => 'gemini-2.5-flash', 'input_tokens' => 500, // Prompt text 'output_tokens' => 200, // Analysis text 'usage' => [ 'input_tokens' => 500, 'video_tokens_in' => 7500, // Video tokens (approx 250/sec) // OR use direct billing unit if supported: 'video_seconds' => 30 ], 'user_id' => auth()->id(), 'tag' => 'video-analysis' ]);
3. Long Document Summarizer ๐ (Claude 3.5 Sonnet + Caching)
Goal: Summarize a 100-page PDF. Reuse the PDF context for follow-up questions to save 90% cost.
Method: Track cached_input_tokens.
// 1st Call: Upload & Cache // Anthropic returns 'cache_creation_input_tokens' (write cost) AIGuard::recordAndApplyBudget([ 'provider' => 'anthropic', 'model' => 'claude-3-5-sonnet', 'input_tokens' => 50000, 'usage' => [ 'input_tokens' => 50000, 'cache_write_tokens' => 50000, // Expensive write ], 'user_id' => auth()->id(), ]); // 2nd Call: Ask question about PDF // Anthropic returns 'cache_read_input_tokens' (Cheap read! ~10% cost) AIGuard::recordAndApplyBudget([ 'provider' => 'anthropic', 'model' => 'claude-3-5-sonnet', 'input_tokens' => 50100, // 50k context + 100 new prompt 'usage' => [ 'input_tokens' => 50100, 'cached_input_tokens' => 50000, // Cheap HIT! 'output_tokens' => 500, ], // AIGuard automatically calculates the lower bill for cached tokens 'user_id' => auth()->id(), ]);
4. Background Data Processing โ๏ธ (DeepSeek / Mistral + Batch)
Goal: Process 10,000 rows of data nightly. Optimisation: Use a cheaper model (DeepSeek V3 / Mistral Small).
foreach ($rows as $row) { // Check global budget first to prevent runaway loops try { AIGuard::checkAllBudgets(null, $tenant->id); } catch (\Exception $e) { Log::alert("Budget exceeded during batch! Stopping."); break; } // Call DeepSeek API directly $response = Http::withToken($key)->post('https://api.deepseek.com/chat/completions', [ 'model' => 'deepseek-chat', 'messages' => [['role' => 'user', 'content' => "Analyze: " . $row->text]] ]); // Track it AIGuard::recordAndApplyBudget([ 'provider' => 'deepseek', 'model' => 'deepseek-chat', 'input_tokens' => $response['usage']['prompt_tokens'], 'output_tokens' => $response['usage']['completion_tokens'], 'usage' => [ 'cached_input_tokens' => $response['usage']['prompt_cache_hit_tokens'] ?? 0, ], 'tenant_id' => $tenant->id, 'tag' => 'nightly-batch' ]); }
๐ Multi-Language
11 locales: en, ar, es, fr, de, zh, hi, bn, pt, ru, ja
App locale used automatically. Customize: php artisan vendor:publish --tag=ai-guard-lang
๐ข Multi-Tenant (SaaS)
- Store
tenant_idon each usage - Set tenant budgets in config
- Middleware reads tenant from
X-Tenant-IDheader or request attribute
๐ง Beta Notice: Laravel AI Guard is currently in beta. Please report any issues with cost calculation, token estimation, or edge cases by opening a GitHub issue. Community feedback is highly appreciated.
๐งช Testing
composer install && php artisan test
๐ License
MIT. See LICENSE.
็ป่ฎกไฟกๆฏ
- ๆปไธ่ฝฝ้๏ผ 67
- ๆๅบฆไธ่ฝฝ้๏ผ 0
- ๆฅๅบฆไธ่ฝฝ้๏ผ 0
- ๆถ่ๆฐ๏ผ 25
- ็นๅปๆฌกๆฐ๏ผ 12
- ไพ่ต้กน็ฎๆฐ๏ผ 0
- ๆจ่ๆฐ๏ผ 1
ๅ ถไปไฟกๆฏ
- ๆๆๅ่ฎฎ๏ผ MIT
- ๆดๆฐๆถ้ด๏ผ 2026-02-08