Modern Multi-Driver Local LLM Integration for Laravel

Modern, multi-driver, failover-ready local LLM integration for Laravel.

Why Laravel Local LLM SDK?

Feature	Description
Zero Cloud Dependency	Run LLMs entirely on your local infrastructure
Enterprise Ready	Type-safe, tested, and PSR-compliant
Modern PHP 8.4	Strict typing, readonly classes, union types
Production Tested	Comprehensive test suite with 100% coverage goals

Overview

Laravel Local LLM SDK is a modern, enterprise-grade Laravel package for integrating locally hosted Large Language Models (LLMs), served by engines such as Ollama and LM Studio, into your Laravel applications.

Features

Multi-Driver Architecture - Support for Ollama, LM Studio, AirLLMLlama, and OpenAI-compatible local servers
Intelligent Failover - Automatic fallback to healthy drivers
Auto-Detection - Automatically detect available local LLM engines
Streaming Support - Server-Sent Events (SSE) for real-time responses
Token-Based Authentication - Built-in API token system with rate limiting
Usage Tracking - Track token usage and quotas
Builder Pattern - Fluent API for building requests
Event-Driven - Dispatch events for observability
Embeddings Support - Generate vector embeddings for semantic search
Tool Calling - Define and use tools/functions with LLMs
Batch Processing - Process multiple requests efficiently
Webhooks - Send LLM events to external services
Metrics - Prometheus-compatible metrics for monitoring
Caching - Cache model lists and health status

Requirements

PHP 8.4+
Laravel 12+
Composer 2+
Local LLM engine (Ollama, LM Studio, or OpenAI-compatible server)

Installation

composer require laravel-local-llm/sdk

Configuration

Publish the configuration file:

php artisan vendor:publish --provider="LaravelLocalLlm\LocalLlmServiceProvider" --tag="llm-config"

Environment Variables

# Default driver
LLM_DEFAULT_DRIVER=ollama

# Ollama
LLM_OLLAMA_ENABLED=true
LLM_OLLAMA_URL=http://localhost:11434
LLM_OLLAMA_DEFAULT_MODEL=llama3.2

# LM Studio
LLM_LMSTUDIO_ENABLED=true
LLM_LMSTUDIO_URL=http://localhost:1234/v1
LLM_LMSTUDIO_DEFAULT_MODEL=llama-3.2-1b-instruct

# OpenAI Compatible
LLM_OPENAI_COMPATIBLE_ENABLED=false
LLM_OPENAI_COMPATIBLE_URL=http://localhost:8080/v1

# Failover
LLM_FAILOVER_ENABLED=true

# Auto-detection
LLM_AUTO_DETECT=true

Usage

Using the Facade

use LaravelLocalLlm\Facades\LocalLlm;
use LaravelLocalLlm\DTO\ChatRequest;
use LaravelLocalLlm\DTO\Message;

// Simple chat
$response = LocalLlm::chat(
    new ChatRequest(
        model: 'llama3.2',
        messages: [
            Message::user('Hello, how are you?'),
        ]
    )
);

echo $response->content;

Using the Builder

$response = LocalLlm::chatWithBuilder()
    ->model('llama3.2')
    ->withUserMessage('Hello, how are you?')
    ->temperature(0.7)
    ->send();

echo $response->content;

Streaming

LocalLlm::chatWithBuilder()
    ->model('llama3.2')
    ->withUserMessage('Tell me a story')
    ->stream(true)
    ->sendStream(function ($chunk) {
        echo $chunk->content;
        
        if ($chunk->finished) {
            echo "\nDone!\n";
        }
    });
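
The README does not show how streamed chunks are parsed internally. As a rough illustration of what arrives on the wire, the sketch below parses the `data:` lines that OpenAI-compatible servers typically send over SSE. The function name `parseSseLine` and the sample stream are hypothetical and are not part of the SDK; other engines may use a different framing.

```php
<?php

declare(strict_types=1);

/**
 * Parse one line of an OpenAI-style SSE stream.
 *
 * Returns the decoded JSON payload, or null for lines that carry no
 * payload (blank separators, comments, and the "[DONE]" sentinel).
 *
 * @return array<string, mixed>|null
 */
function parseSseLine(string $line): ?array
{
    $line = trim($line);

    // Only "data:" fields carry payloads; everything else is skipped.
    if (!str_starts_with($line, 'data:')) {
        return null;
    }

    $payload = trim(substr($line, strlen('data:')));

    if ($payload === '' || $payload === '[DONE]') {
        return null;
    }

    $decoded = json_decode($payload, true);

    // Malformed JSON decodes to null and is ignored.
    return is_array($decoded) ? $decoded : null;
}

// Hypothetical raw stream, as a server might send it.
$rawStream = [
    'data: {"choices":[{"delta":{"content":"Once"}}]}',
    '',
    'data: {"choices":[{"delta":{"content":" upon"}}]}',
    '',
    'data: [DONE]',
];

$text = '';
foreach ($rawStream as $line) {
    $chunk = parseSseLine($line);
    if ($chunk !== null) {
        $text .= $chunk['choices'][0]['delta']['content'] ?? '';
    }
}
```

After the loop, `$text` holds the concatenated content of every chunk that carried a delta.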

Using Specific Driver

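use LaravelLocalLlm\DTO\ChatRequest;
use LaravelLocalLlm\Enums\Driver;

// The second argument selects the driver for this call.
$response = LocalLlm::chat(
    new ChatRequest(...),
    Driver::LM_STUDIO
);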

Failover

$response = LocalLlm::chatWithFailover(new ChatRequest(...));
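
The README does not describe how `chatWithFailover` chooses the next driver. One plausible reading of "automatic fallback to healthy drivers" is an ordered try-each loop, sketched below in plain PHP. `callWithFailover` and the two stand-in drivers are hypothetical; the SDK's real implementation may differ (for example, it may consult health checks first).

```php
<?php

declare(strict_types=1);

/**
 * Try each driver in order and return the first successful result.
 *
 * @param array<string, callable(): string> $drivers driver name => callable that performs the request
 * @return array{driver: string, result: string}
 */
function callWithFailover(array $drivers): array
{
    $errors = [];

    foreach ($drivers as $name => $attempt) {
        try {
            return ['driver' => $name, 'result' => $attempt()];
        } catch (Throwable $e) {
            // Remember why this driver failed, then move on to the next one.
            $errors[] = "{$name}: {$e->getMessage()}";
        }
    }

    throw new RuntimeException('All drivers failed: ' . implode('; ', $errors));
}

// Hypothetical drivers: the first is down, the second answers.
$outcome = callWithFailover([
    'ollama'   => fn (): string => throw new RuntimeException('connection refused'),
    'lmstudio' => fn (): string => 'Hello from LM Studio',
]);
```

Collecting the per-driver error messages keeps the final exception useful when every driver is unavailable.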

Checking Models

$models = LocalLlm::models();

Health Check

$isHealthy = LocalLlm::health();

Embeddings

Generate vector embeddings for semantic search:

use LaravelLocalLlm\Facades\LocalLlm;
use LaravelLocalLlm\DTO\EmbeddingRequest;

// Single text
$response = LocalLlm::embeddings(new EmbeddingRequest(
    model: 'text-embedding-3-small',
    input: 'Hello world'
));

$embedding = $response->embeddings[0]->embedding;

// Multiple texts
$response = LocalLlm::embeddings(new EmbeddingRequest(
    model: 'text-embedding-3-small',
    input: ['Hello world', 'Goodbye world']
));
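
Semantic search over embeddings usually comes down to comparing vectors. The sketch below ranks documents by cosine similarity against a query vector. It is plain PHP and independent of the SDK; the three-dimensional vectors are toy values standing in for real embeddings, and `cosineSimilarity` is a name chosen here for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity between two vectors of equal length.
 *
 * @param list<float> $a
 * @param list<float> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    // A zero vector has no direction; treat it as dissimilar to everything.
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Toy vectors standing in for real embeddings.
$query = [0.9, 0.1, 0.0];
$documents = [
    'greeting' => [1.0, 0.0, 0.0],
    'farewell' => [0.0, 1.0, 0.0],
];

// Score every document against the query, best match first.
$scores = array_map(
    fn (array $vector): float => cosineSimilarity($query, $vector),
    $documents
);
arsort($scores);

$bestMatch = array_key_first($scores);
```

In a real application the vectors would come from `$response->embeddings[...]->embedding` as shown above, and larger collections would normally use a vector index rather than a linear scan.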

Batch Processing

Process multiple chat requests efficiently:

use LaravelLocalLlm\Facades\LocalLlm;
use LaravelLocalLlm\DTO\BatchChatRequest;
use LaravelLocalLlm\DTO\ChatRequest;
use LaravelLocalLlm\DTO\Message;

$requests = [
    new ChatRequest(model: 'llama3.2', messages: [Message::user('Hello')]),
    new ChatRequest(model: 'llama3.2', messages: [Message::user('How are you?')]),
    new ChatRequest(model: 'llama3.2', messages: [Message::user('Tell me a joke')]),
];

$batchResponse = LocalLlm::batchChat(new BatchChatRequest($requests));

echo "Total requests: " . $batchResponse->count();
echo "Total tokens: " . $batchResponse->totalTokens();
echo "Avg latency: " . $batchResponse->averageLatencyMs() . "ms";
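
To make the aggregate figures concrete, here is a minimal sketch of how a count, a token total, and an average latency could be derived from per-request results. The array shape and the `summariseBatch` helper are assumptions made for illustration; the SDK's batch response class is not shown in this README and may compute these values differently.

```php
<?php

declare(strict_types=1);

/**
 * Summarise per-request results into batch-level figures.
 *
 * @param list<array{tokens: int, latency_ms: float}> $results
 * @return array{count: int, total_tokens: int, average_latency_ms: float}
 */
function summariseBatch(array $results): array
{
    $count = count($results);
    $totalTokens = (int) array_sum(array_column($results, 'tokens'));
    $totalLatency = (float) array_sum(array_column($results, 'latency_ms'));

    return [
        'count' => $count,
        'total_tokens' => $totalTokens,
        // Guard against division by zero for an empty batch.
        'average_latency_ms' => $count > 0 ? $totalLatency / $count : 0.0,
    ];
}

// Hypothetical results for three requests.
$summary = summariseBatch([
    ['tokens' => 12, 'latency_ms' => 100.0],
    ['tokens' => 20, 'latency_ms' => 150.0],
    ['tokens' => 31, 'latency_ms' => 200.0],
]);
```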

Token Authentication

Creating Tokens

use LaravelLocalLlm\Models\LlmToken;
use Illuminate\Support\Facades\Hash;

$token = LlmToken::create([
    'name' => 'API Token',
    'hashed_token' => Hash::make('your-secret-token'),
    'abilities' => ['chat', 'stream'],
    'rate_limit' => 60,
    'monthly_quota' => 1000000,
]);
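
The example above hashes a fixed string. In practice the plain-text token should be random, shown to the caller once, and stored only as a hash. The sketch below shows that flow with PHP's built-in functions; it assumes Laravel's default bcrypt hasher, in which case `Hash::make()` and `Hash::check()` correspond to `password_hash()` and `password_verify()`.

```php
<?php

declare(strict_types=1);

// Generate a random plain-text token; show it to the user exactly once.
$plainToken = bin2hex(random_bytes(32)); // 64 hexadecimal characters

// Store only the hash, never the plain-text value.
$hashedToken = password_hash($plainToken, PASSWORD_BCRYPT);

// On each request, verify the presented token against the stored hash.
$isValid = password_verify($plainToken, $hashedToken);
$isForged = password_verify('not-the-token', $hashedToken);
```

Because bcrypt hashes are salted, the stored value cannot be looked up directly by token; how the SDK locates the matching `LlmToken` row is not described here.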

Using Tokens

Include the token in your request:

curl -X POST -H "Authorization: Bearer your-secret-token" \
  https://your-app.com/api/llm/chat

Middleware

Protect your routes:

Route::middleware(['llm.guard:chat,stream'])->group(function () {
    Route::post('/llm/chat', [LlmController::class, 'chat']);
});
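
The README does not say whether `llm.guard:chat,stream` requires every listed ability or only one of them. The sketch below assumes the stricter reading, that a token must hold all of them. `tokenHasAbilities` is a hypothetical helper, not the SDK's middleware.

```php
<?php

declare(strict_types=1);

/**
 * Check that a token holds every required ability.
 *
 * @param list<string> $granted  abilities stored on the token
 * @param list<string> $required abilities named in the middleware parameters
 */
function tokenHasAbilities(array $granted, array $required): bool
{
    // array_diff() returns the required abilities missing from the token.
    return array_diff($required, $granted) === [];
}

$granted = ['chat', 'stream'];

$allowed = tokenHasAbilities($granted, ['chat', 'stream']);
$denied = tokenHasAbilities($granted, ['chat', 'embeddings']);
```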

Events

ChatCompleted - Dispatched when a chat request completes
StreamChunkReceived - Dispatched for each streaming chunk

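use Illuminate\Support\Facades\Event;
use Illuminate\Support\Facades\Log;
use LaravelLocalLlm\Events\ChatCompleted;

Event::listen(ChatCompleted::class, function (ChatCompleted $event) {
    Log::info('Chat completed', [
        'model' => $event->response->model,
        'latency' => $event->response->latencyMs,
    ]);
});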

Webhooks

Send LLM events to external services:

use LaravelLocalLlm\Webhooks\WebhookDispatcher;

$webhook = new WebhookDispatcher();

$webhook->register('chat.completed', 'https://your-app.com/webhooks/llm', [
    'secret' => env('WEBHOOK_SECRET'),
]);

$webhook->dispatchChatCompleted($request, $response, $driver);
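
The README does not specify how the `secret` option is used. A common convention is to sign the request body with HMAC-SHA256 so the receiver can verify where it came from. The sketch below shows that convention using PHP's built-in functions; the function names and the signing scheme are assumptions, not confirmed behaviour of `WebhookDispatcher`.

```php
<?php

declare(strict_types=1);

/** Compute a hex-encoded HMAC-SHA256 signature for a payload. */
function signPayload(string $payload, string $secret): string
{
    return hash_hmac('sha256', $payload, $secret);
}

/** Verify a signature received alongside a payload. */
function verifySignature(string $payload, string $signature, string $secret): bool
{
    // hash_equals() compares in constant time to avoid timing attacks.
    return hash_equals(signPayload($payload, $secret), $signature);
}

// Hypothetical secret and event payload.
$secret = 'example-webhook-secret';
$payload = json_encode(
    ['event' => 'chat.completed', 'model' => 'llama3.2'],
    JSON_THROW_ON_ERROR
);

$signature = signPayload($payload, $secret);
$accepted = verifySignature($payload, $signature, $secret);
$rejected = verifySignature($payload . 'tampered', $signature, $secret);
```

The receiver must verify the signature against the raw request body, before any JSON decoding or re-encoding changes its bytes.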

Metrics

Track LLM usage with Prometheus-compatible metrics:

use LaravelLocalLlm\Services\Metrics;

$metrics = new Metrics();

$metrics->recordRequest('ollama', 'llama3.2', 150.5, 20, 50);
$metrics->recordRequest('ollama', 'llama3.2', 120.0, 15, 45);

// Get per-model metrics
$allMetrics = $metrics->getMetrics();

// Get aggregate metrics
$aggregate = $metrics->getAggregateMetrics();
// ['total_requests' => 2, 'avg_latency_ms' => 135.25, ...]

// Export to Prometheus format
$prometheus = $metrics->toPrometheusFormat();
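
For readers unfamiliar with the Prometheus text exposition format, the sketch below shows roughly what such an export looks like. `MetricsSketch` and the metric names `llm_requests_total` and `llm_request_latency_ms_sum` are hypothetical; the names and labels that the SDK's `toPrometheusFormat()` actually emits are not documented in this README.

```php
<?php

declare(strict_types=1);

final class MetricsSketch
{
    /** @var array<string, array{driver: string, model: string, requests: int, latency_ms_sum: float}> */
    private array $series = [];

    public function record(string $driver, string $model, float $latencyMs): void
    {
        // One series per driver/model pair.
        $key = "{$driver}|{$model}";
        $this->series[$key] ??= [
            'driver' => $driver,
            'model' => $model,
            'requests' => 0,
            'latency_ms_sum' => 0.0,
        ];
        $this->series[$key]['requests']++;
        $this->series[$key]['latency_ms_sum'] += $latencyMs;
    }

    public function toPrometheusText(): string
    {
        // Note: label values are not escaped in this sketch.
        $lines = [
            '# HELP llm_requests_total Total number of LLM requests.',
            '# TYPE llm_requests_total counter',
        ];
        foreach ($this->series as $s) {
            $lines[] = sprintf(
                'llm_requests_total{driver="%s",model="%s"} %d',
                $s['driver'], $s['model'], $s['requests']
            );
        }

        $lines[] = '# HELP llm_request_latency_ms_sum Sum of request latencies in milliseconds.';
        $lines[] = '# TYPE llm_request_latency_ms_sum counter';
        foreach ($this->series as $s) {
            $lines[] = sprintf(
                'llm_request_latency_ms_sum{driver="%s",model="%s"} %s',
                $s['driver'], $s['model'], $s['latency_ms_sum']
            );
        }

        return implode("\n", $lines) . "\n";
    }
}

$sketch = new MetricsSketch();
$sketch->record('ollama', 'llama3.2', 150.5);
$sketch->record('ollama', 'llama3.2', 120.0);

$exposition = $sketch->toPrometheusText();
```

Exporting a request count and a latency sum lets Prometheus derive the average latency at query time, instead of storing a precomputed average.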

Helpers

Utility functions for common tasks:

use LaravelLocalLlm\Helpers\TokenCalculator;
use LaravelLocalLlm\Helpers\ResponseFormatter;

// Estimate tokens
$tokens = TokenCalculator::estimateTokens('Hello world');

// Calculate cost
$cost = TokenCalculator::calculateCost(100, 50, 0.001, 0.002);

// Format response
$html = ResponseFormatter::markdown('**bold** and *italic*');

// Extract code blocks
$codeBlocks = ResponseFormatter::extractCode($markdown);
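
The README does not state how `estimateTokens` counts tokens or what units the price arguments of `calculateCost` use. The sketch below assumes the common heuristic of roughly four characters per token and prices quoted per 1,000 tokens. Both assumptions, and both function bodies, are illustrative only and are not the SDK's implementation.

```php
<?php

declare(strict_types=1);

/**
 * Rough token estimate: about four bytes per token for English text.
 * Real tokenizers vary by model, so treat the result as an approximation.
 */
function estimateTokens(string $text): int
{
    return (int) ceil(strlen($text) / 4);
}

/**
 * Cost of a request, assuming prices are quoted per 1,000 tokens.
 */
function calculateCost(
    int $promptTokens,
    int $completionTokens,
    float $promptPricePer1k,
    float $completionPricePer1k
): float {
    return ($promptTokens / 1000) * $promptPricePer1k
        + ($completionTokens / 1000) * $completionPricePer1k;
}

$tokens = estimateTokens('Hello world');
$cost = calculateCost(100, 50, 0.001, 0.002);
```

If the SDK's prices are per token rather than per 1,000 tokens, the same call would give a result 1,000 times larger, so it is worth confirming the units before relying on cost figures.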

Console Commands

# Check driver health
php artisan llm:health

# Check specific driver
php artisan llm:health --driver=ollama

# List models
php artisan llm:models

# List models for specific driver
php artisan llm:models --driver=lmstudio

# Clear cache
php artisan llm:clear-cache

Extending

Custom Driver

use LaravelLocalLlm\Contracts\DriverInterface;
use LaravelLocalLlm\Enums\Driver;
use LaravelLocalLlm\DTO\ChatRequest;
use LaravelLocalLlm\DTO\ChatResponse;

class CustomDriver implements DriverInterface
{
    public function getDriver(): Driver
    {
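        // Reuses an existing case. If Driver is a native PHP enum it cannot be
        // instantiated with `new`, so a custom driver would need its own case.
        return Driver::OLLAMA;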
    }

    public function chat(ChatRequest $request): ChatResponse
    {
        // Implementation
    }

    public function stream(ChatRequest $request, callable $onChunk): void
    {
        // Implementation
    }

    public function models(): array
    {
        // Implementation
    }

    public function health(): bool
    {
        // Implementation
    }

    public function isEnabled(): bool
    {
        return true;
    }
}

Testing

composer test

License

MIT License - see LICENSE for details.

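laravel-local-llm/sdk: use cases and selection advice

laravel-local-llm/sdk is a Composer package written in PHP. At the time of this listing it had 0 downloads and 0 GitHub stars, and its most recent update was on 8 March 2026; the listing describes it as a relatively active component of the PHP ecosystem.

Its keywords are laravel, ai, openai, llm, ollama, and lm-studio. Projects in these areas commonly have to deal with API integration, performance tuning, concurrency safety, compatibility with existing frameworks (Laravel, ThinkPHP, Yii, Webman, and others), and logging and stability in production.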
