codechap/yii3-context-trimmer

Latest stable version: 1.0.0

Composer install command:

composer require codechap/yii3-context-trimmer

Package description

Tokenizer-agnostic text preprocessor for trimming and optimising content for LLM context windows.

README

Tokenizer-agnostic text preprocessor for trimming and optimising content to fit within LLM context windows. Built for Yii3 with full DI container integration, configurable params, and a console command.

Requirements

  • PHP 8.2 - 8.5

Installation

composer require codechap/yii3-context-trimmer

For Yii3 applications using the config plugin, the DI bindings and params are registered automatically.

Usage

Via Dependency Injection (Yii3)

Inject the interface to get a pre-configured trimmer from the DI container:

use Codechap\Yii3ContextTrimmer\ContextTrimmerInterface;

final class MyService
{
    public function __construct(
        private readonly ContextTrimmerInterface $trimmer,
    ) {}

    public function process(string $text): array
    {
        return $this->trimmer
            ->withMaxTokens(4096)
            ->withRemoveDuplicateLines(true)
            ->trim($text);
    }
}

Default configuration is set in params.php; see Configuration below.

Standalone

use Codechap\Yii3ContextTrimmer\ContextTrimmer;

$trimmer = new ContextTrimmer();

$segments = $trimmer
    ->withMaxTokens(4096)
    ->withRemoveDuplicateLines(true)
    ->trim($longText);
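
To make the idea of "segments" concrete, here is a minimal sketch, independent of the package, of splitting text into chunks of at most a given number of space-delimited tokens. The function name chunkBySpaces is hypothetical and is not part of the package's API; the package's actual segmentation logic may differ.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of the package: split text into segments
 * of at most $maxTokens whitespace-delimited tokens.
 *
 * @return list<string>
 */
function chunkBySpaces(string $text, int $maxTokens): array
{
    if ($maxTokens < 1) {
        throw new InvalidArgumentException('maxTokens must be at least 1.');
    }

    // Split on runs of whitespace and drop empty pieces.
    $tokens = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($tokens === false || $tokens === []) {
        return [];
    }

    // array_chunk groups the tokens; implode turns each group back into a string.
    return array_map(
        static fn (array $chunk): string => implode(' ', $chunk),
        array_chunk($tokens, $maxTokens),
    );
}

$segments = chunkBySpaces('one two three four five', 2);
// $segments is ['one two', 'three four', 'five']
```

Each resulting string stays within the token budget under this rough space-based count, which is why a more accurate tokenizer matters when the budget is tight.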

Custom Tokenizer

The default tokenizer splits on spaces, which is a rough heuristic. For accurate token counting, provide a tokenizer matching your LLM's tokenization:

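use Codechap\Yii3ContextTrimmer\ContextTrimmer;

// Example: tiktoken-based tokenizer for OpenAI models.
// your_tiktoken_encode() is a placeholder for your own encoder function.
$trimmer = new ContextTrimmer(
    tokenizer: function (string $text): array {
        return your_tiktoken_encode($text);
    },
);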
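
As an illustration of the callable shape used above (a string in, an array of tokens out), the following sketch defines a simple regex-based tokenizer that treats words and individual punctuation marks as separate tokens. The name $regexTokenizer is hypothetical, and this scheme does not reproduce any real LLM's tokenization; it only shows how a drop-in callable could be written.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical tokenizer: words and single punctuation marks become
 * separate tokens. This is NOT any LLM's real tokenization.
 */
$regexTokenizer = static function (string $text): array {
    // \w+ matches a run of word characters; [^\s\w] matches one character
    // that is neither whitespace nor a word character.
    if (preg_match_all('/\w+|[^\s\w]/u', $text, $matches) === false) {
        return []; // e.g. the input was not valid UTF-8
    }

    return $matches[0];
};

$tokens = $regexTokenizer('Hello, world!');
// $tokens is ['Hello', ',', 'world', '!']

// Assumed usage, mirroring the constructor call shown above:
// $trimmer = new ContextTrimmer(tokenizer: $regexTokenizer);
```

Counting punctuation separately usually gives a higher, more conservative token estimate than splitting on spaces alone.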

Configuration

Yii3 Params

Override defaults in your application's params.php:

return [
    'codechap/yii3-context-trimmer' => [
        'maxTokens' => 4096,           // Max tokens per segment (default: 8192)
        'removeDuplicateLines' => true, // Remove duplicate lines (default: false)
        'removeShortWords' => false,    // Remove short words (default: false)
        'minWordLength' => 2,           // Min word length to keep (default: 2)
        'removeExtraneous' => false,    // Remove brackets/parens/etc (default: false)
        'compressWhitespace' => true,   // Compress whitespace (default: true)
    ],
];

Options Reference

| Option | Default | Description |
| --- | --- | --- |
| maxTokens | 8192 | Maximum tokens per output segment. Must be >= 2. |
| removeDuplicateLines | false | Remove duplicate non-blank lines. Blank lines are preserved as structural separators. |
| removeShortWords | false | Remove purely-alphabetical words shorter than minWordLength. Warning: This is aggressive and removes articles, prepositions, and pronouns. |
| minWordLength | 2 | Minimum word length to keep. Words shorter than this are removed. |
| removeExtraneous | false | Remove [](){}<>* characters. Warning: Destroys Markdown, HTML, and code syntax. |
| compressWhitespace | true | Collapse multiple whitespace characters into single spaces. Disable for code/formatted text. |
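
To show roughly what two of these options describe, here is a minimal standalone sketch of duplicate-line removal that keeps blank lines, and of whitespace collapsing. The function names dropDuplicateLines and collapseWhitespace are hypothetical and are not the package's implementation; details such as trimming the ends or how newlines are treated are assumptions made for the example.

```php
<?php

declare(strict_types=1);

/** Hypothetical: drop repeated non-blank lines, keep every blank line. */
function dropDuplicateLines(string $text): string
{
    $seen = [];
    $kept = [];
    foreach (explode("\n", $text) as $line) {
        if (trim($line) === '') {
            $kept[] = $line; // blank lines act as structural separators
            continue;
        }
        if (isset($seen[$line])) {
            continue; // exact repeat of an earlier line
        }
        $seen[$line] = true;
        $kept[] = $line;
    }

    return implode("\n", $kept);
}

/** Hypothetical: collapse each run of whitespace into a single space. */
function collapseWhitespace(string $text): string
{
    // Trimming the ends is an assumption of this sketch.
    return trim((string) preg_replace('/\s+/', ' ', $text));
}

$deduped = dropDuplicateLines("a\nb\n\na\nc");
// $deduped is "a\nb\n\nc"

$flat = collapseWhitespace("a   b\n\tc ");
// $flat is "a b c"
```

In this sketch, collapsing whitespace also flattens newlines and indentation, which illustrates why the option is best disabled for code or formatted text.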

Console Command

Requires yiisoft/yii-console for the Yii3 console runner.

# Trim a file
./yii context:trim path/to/file.txt

# Pipe from stdin
cat document.txt | ./yii context:trim

# With options
./yii context:trim file.txt --max-tokens 4096 --remove-duplicates --json

# All options (short forms)
#   -d  --remove-duplicates
#   -s  --remove-short-words
#   -l  --min-word-length
#   -x  --remove-extraneous
#   -j  --json output
./yii context:trim file.txt \
    -t 4096 \
    -d \
    -s \
    -l 3 \
    -x \
    --no-compress \
    -j

Testing

composer test

Static Analysis

composer analyse

License

MIT

Statistics

  • Total downloads: 6
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Views: 6
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2026-03-13
