tivins/llm-php

Latest stable version: 1.11.0

Composer install command:

composer require tivins/llm-php

Package description

PHP library for local LLMs

README

Goal: Run LLM inference locally on a machine with 8 GB of VRAM or more.

Stack: llama.cpp + a lightweight model (Gemma 2 9B, Gemma 4 4B, Qwen 2.5 7B, …).

Beyond exposing llama.cpp from PHP, llm-php adds higher-level helpers such as "thinking"-style prompting, preset personas, and configurable tool calling: you declare tools (schemas plus bound executors) and run multi-step loops until the model is done. Tool calling is not limited to ad hoc PHP callables: PredefinedTools ships ready-made tools the model can drive (for example grep, web_search, fetch_web_page, file read/write, apply_diff, git_status, and more).
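
To make the "declare tools, then loop until the model is done" idea concrete, here is a minimal sketch in plain PHP. It does not use llm-php's classes: the function runToolLoop, the reply array shape, the word_count tool, and the scripted stand-in model are all hypothetical names chosen for illustration.

```php
<?php
// Illustrative only: plain PHP, not llm-php's actual classes or method names.

/**
 * Runs a tool-calling loop: ask the model, execute any tool it requests,
 * feed the result back, and stop when it answers in plain text.
 *
 * @param callable $model fn(array $messages): array, returning either
 *                        ['content' => string] or
 *                        ['tool' => string, 'arguments' => array]
 * @param array    $tools name => ['schema' => array, 'executor' => callable]
 */
function runToolLoop(callable $model, array $tools, array $messages, int $maxSteps = 8): string
{
    for ($step = 0; $step < $maxSteps; $step++) {
        $reply = $model($messages);

        // No tool requested: the model is done.
        if (!isset($reply['tool'])) {
            return $reply['content'];
        }

        $name = $reply['tool'];
        if (!isset($tools[$name])) {
            $result = "Error: unknown tool '$name'";
        } else {
            $result = (string) ($tools[$name]['executor'])($reply['arguments'] ?? []);
        }

        // Record the call and its result so the next turn can use them.
        $messages[] = ['role' => 'assistant', 'content' => '', 'tool_call' => $reply];
        $messages[] = ['role' => 'tool', 'name' => $name, 'content' => $result];
    }
    throw new RuntimeException("No final answer after $maxSteps steps");
}

// A hypothetical tool: a JSON-schema-style description plus the PHP code that runs it.
$tools = [
    'word_count' => [
        'schema' => [
            'type' => 'object',
            'properties' => ['text' => ['type' => 'string']],
            'required' => ['text'],
        ],
        'executor' => fn(array $args): int => str_word_count($args['text']),
    ],
];

// Scripted stand-in for the LLM: first requests the tool, then answers.
$fakeModel = function (array $messages): array {
    $last = end($messages);
    if ($last['role'] === 'tool') {
        return ['content' => 'The text has ' . $last['content'] . ' words.'];
    }
    return ['tool' => 'word_count', 'arguments' => ['text' => 'local models are fun']];
};

$answer = runToolLoop($fakeModel, $tools, [['role' => 'user', 'content' => 'Count the words.']]);
echo $answer, PHP_EOL;
```

The step limit is a design choice in this sketch: it keeps a model that never stops requesting tools from looping forever.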

Installation

llm-php

composer require tivins/llm-php

llama.cpp

https://github.com/ggml-org/llama.cpp/blob/master/docs/install.md

apt install llama-cpp    # linux
brew install llama.cpp   # mac/linux
winget install llama.cpp # windows

API documentation: https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md

Downloading a model (≤ 6.5 GB)

Pick a GGUF file that leaves headroom in VRAM, since the KV cache and GPU drivers use memory too. On an 8 GB card, aim for roughly 5 to 6.5 GB of model weights, depending on quantization and context length.
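
The headroom rule can be turned into a rough back-of-the-envelope check. The sketch below uses the common estimate that the KV cache stores one key and one value vector per layer, per KV head, per token. The function names, the 1 GiB overhead allowance, and the model shape (28 layers, 4 KV heads, head dimension 128) are assumptions for illustration; read the real values from your model's metadata.

```php
<?php
// Rough VRAM budget check. All figures are estimates, not measurements.

/** KV cache size in bytes: keys and values, per layer, per KV head, per token. */
function kvCacheBytes(int $layers, int $kvHeads, int $headDim, int $contextTokens, int $bytesPerValue = 2): int
{
    return 2 * $layers * $kvHeads * $headDim * $contextTokens * $bytesPerValue;
}

/** True if weights + KV cache + a fixed overhead allowance fit in the given VRAM. */
function fitsInVram(float $weightsGiB, int $kvBytes, float $vramGiB, float $overheadGiB = 1.0): bool
{
    $kvGiB = $kvBytes / (1024 ** 3);
    return $weightsGiB + $kvGiB + $overheadGiB <= $vramGiB;
}

// Assumed model shape: 28 layers, 4 KV heads, head dimension 128,
// with an 8192-token context and a 16-bit (2-byte) cache.
$kv = kvCacheBytes(28, 4, 128, 8192);
echo 'KV cache: ', intdiv($kv, 1024 * 1024), ' MiB', PHP_EOL;

var_dump(fitsInVram(6.0, $kv, 8.0)); // 6 GiB of weights on an 8 GiB card
var_dump(fitsInVram(7.0, $kv, 8.0)); // 7 GiB of weights on an 8 GiB card
```

Under these assumptions the cache alone takes a few hundred MiB, which is why a 7 GiB file does not leave enough room on an 8 GiB card once overhead is counted.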

PHP usage

Minimal example:

First, start the llama.cpp server (run.sh or run.bat).

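// Assumption: these classes share the Tivins\Llama namespace used by ChatCompletionOptions; adjust if they live elsewhere.
use Tivins\Llama\{BehaviorPrompts, Conversation, Lama, Message, Role};

$lama = Lama::fromServerUrl('http://127.0.0.1:8080');
$conversation = new Conversation();
$conversation->addMessage(new Message(Role::System, BehaviorPrompts::HELPFUL));
$conversation->addMessage(new Message(Role::User, 'List and briefly explain five practical habits that improve learning retention, with one short paragraph per habit (about 3–5 sentences each).'));
// Send the conversation and trim surrounding whitespace from the reply.
$answer = trim($lama->chat($conversation));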

Note: This example is simplified: it does not handle exceptions or check whether the LLM server is reachable (health check).
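
One way to add the missing reachability check is to probe the llama.cpp server's /health endpoint before chatting. The sketch below uses only PHP's built-in HTTP stream wrapper and does not rely on llm-php; the function names isReadyResponse and isServerReady are hypothetical, and the exact response body should be confirmed against the llama.cpp server documentation for your version.

```php
<?php
/** Interprets a /health response: ready only on HTTP 200 with {"status":"ok"}. */
function isReadyResponse(int $httpStatus, string $body): bool
{
    if ($httpStatus !== 200) {
        return false;
    }
    $data = json_decode($body, true);
    return is_array($data) && ($data['status'] ?? null) === 'ok';
}

/** Probes the server; returns false instead of raising a warning when it is unreachable. */
function isServerReady(string $baseUrl, float $timeoutSeconds = 2.0): bool
{
    $context = stream_context_create(['http' => [
        'method' => 'GET',
        'timeout' => $timeoutSeconds,
        'ignore_errors' => true, // still read the body on 4xx/5xx responses
    ]]);

    $stream = @fopen(rtrim($baseUrl, '/') . '/health', 'r', false, $context);
    if ($stream === false) {
        return false; // connection refused, timeout, DNS failure, ...
    }

    // For the HTTP wrapper, wrapper_data holds the raw header lines,
    // starting with the status line, e.g. "HTTP/1.1 200 OK".
    $headers = stream_get_meta_data($stream)['wrapper_data'] ?? [];
    $body = (string) stream_get_contents($stream);
    fclose($stream);

    $statusLine = is_array($headers) ? (string) ($headers[0] ?? '') : '';
    $status = preg_match('#^HTTP/\S+\s+(\d{3})#', $statusLine, $m) ? (int) $m[1] : 0;

    return isReadyResponse($status, $body);
}
```

A script could call isServerReady('http://127.0.0.1:8080') first and exit with a clear message when it returns false, rather than failing inside the chat call.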

See the examples folder for more.

Sampling and generation options (OpenAI-compatible body fields such as temperature, top_p, max_tokens, penalties, seed, stop, n) are passed via ChatCompletionOptions as an optional argument to chat(), chatCompletions(), and chatStream(). Only properties you set are sent; omitted fields keep the server defaults. See the class docblock on ChatCompletionOptions for parameter meanings and compatibility notes for local backends.

use Tivins\Llama\ChatCompletionOptions;

$sampler = new ChatCompletionOptions(temperature: 0.4, top_p: 0.9, max_tokens: 256, seed: 42);
$answer = trim($lama->chat($conversation, $sampler));
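
The "only properties you set are sent" behavior can be pictured as filtering out unset fields before the request body is encoded. The sketch below is not the library's implementation: buildChatBody is a hypothetical helper that shows the idea with a few of the OpenAI-compatible fields named above.

```php
<?php
/**
 * Builds an OpenAI-compatible chat request body, including only the
 * sampling fields that were explicitly set (null means "use server default").
 */
function buildChatBody(
    array $messages,
    ?float $temperature = null,
    ?float $top_p = null,
    ?int $max_tokens = null,
    ?int $seed = null
): array {
    $options = [
        'temperature' => $temperature,
        'top_p' => $top_p,
        'max_tokens' => $max_tokens,
        'seed' => $seed,
    ];
    // Drop unset options; keep legitimate falsy values such as temperature 0.0 or seed 0.
    $options = array_filter($options, static fn($value) => $value !== null);

    return ['messages' => $messages] + $options;
}

$body = buildChatBody(
    [['role' => 'user', 'content' => 'Hello']],
    temperature: 0.4,
    seed: 0
);
echo json_encode($body), PHP_EOL;
```

Comparing against null rather than using a plain array_filter matters here, because a seed of 0 or a temperature of 0.0 is a deliberate setting and should still be sent.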

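Statistics

  • Total downloads: 5
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Views: 8
  • Dependents: 0
  • Recommendations: 0

GitHub information

  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2026-05-09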
