opencat/filter-plaintext

Composer install command:

composer require opencat/filter-plaintext

Package description

Plain text (.txt) file filter for the OpenCAT Framework

README

Plain text (.txt) file filter for the CAT Framework.

Installation

composer require catframework/filter-plaintext

Usage

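use CatFramework\FilterPlaintext\PlainTextFilter;
// Segment must also be imported. Its namespace is not shown in this README,
// so add the matching `use` statement for your installed framework version.

$filter = new PlainTextFilter();

// Extract translatable segments
$document = $filter->extract('article.txt', 'en', 'fr');

foreach ($document->getSegmentPairs() as $pair) {
    // $translatedText is a placeholder: supply the translation of this pair's source segment
    $pair->target = new Segment('seg-t', [$translatedText]);
}

// Write the translated file
$filter->rebuild($document, 'article.fr.txt');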

How segments are split

The filter splits on two or more consecutive newlines (blank-line paragraph breaks). Each non-whitespace block becomes one segment. Single newlines within a block are preserved as-is and are part of the segment text.

First paragraph.       → segment 1
                       → (separator, not a segment)
Second paragraph.      → segment 2

Third paragraph.       → segment 3

Whitespace-only blocks (e.g. multiple blank lines between paragraphs) are passed through unchanged and do not become segments.
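
The splitting rule above can be sketched in a few lines of PHP. This is an illustration only, not the filter's actual implementation: the function name `splitParagraphs`, the `seg-N` ID scheme, and the choice of `preg_split` are all assumptions, and the sketch only handles `\n` line endings.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the package API): split text on runs of
 * two or more newlines, keeping the separators as their own parts.
 *
 * @return array{parts: string[], seg_map: array<int, string>}
 */
function splitParagraphs(string $text): array
{
    // PREG_SPLIT_DELIM_CAPTURE keeps the captured blank-line separators in the result.
    $parts = preg_split(
        '/(\n{2,})/',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    ) ?: [];

    $segMap = [];
    $counter = 0;
    foreach ($parts as $index => $part) {
        if (trim($part) === '') {
            continue; // whitespace-only block: passed through, never a segment
        }
        $segMap[$index] = 'seg-' . (++$counter); // assumed ID scheme
    }

    return ['parts' => $parts, 'seg_map' => $segMap];
}

$text = "First paragraph.\n\nSecond paragraph.\nStill second.\n\n\nThird paragraph.";
$result = splitParagraphs($text);
```

Because separators are kept as parts, concatenating `$result['parts']` reproduces the input exactly, and the single newline inside the second paragraph stays inside its segment.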

Encoding

Input files are auto-detected as UTF-8, ISO-8859-1, or Windows-1252. All output is written in UTF-8. If encoding detection fails, the file is treated as UTF-8.
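
As a rough sketch of the behaviour described above, detection and conversion could look like the following. The helper name `toUtf8`, the candidate order, and the use of strict mode are assumptions; the package's real code may differ, and `mb_detect_encoding` results can vary between PHP versions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not the package's API): detect the input encoding
 * among the three supported candidates and return the text as UTF-8.
 */
function toUtf8(string $bytes): string
{
    $candidates = ['UTF-8', 'ISO-8859-1', 'Windows-1252']; // assumed order
    $detected = mb_detect_encoding($bytes, $candidates, true);

    if ($detected === false) {
        $detected = 'UTF-8'; // detection failed: treat the file as UTF-8
    }

    return $detected === 'UTF-8'
        ? $bytes
        : mb_convert_encoding($bytes, 'UTF-8', $detected);
}

// "caf\xE9" is "café" in ISO-8859-1 / Windows-1252 and is not valid UTF-8.
$utf8 = toUtf8("caf\xE9");
```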

Skeleton format

[
    'parts'   => string[],      // file split by paragraph boundaries, separators included
    'seg_map' => [int => string], // parts array index => segId
]
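
To show how a skeleton of this shape can drive the rebuild step, here is a minimal sketch. The function `rebuildFromSkeleton` and the `$translations` array keyed by segment ID are hypothetical; the real `rebuild()` method works on the document object and may behave differently.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: reassemble a file from a skeleton, replacing each
 * mapped part with its translation and copying everything else unchanged.
 *
 * @param array{parts: string[], seg_map: array<int, string>} $skeleton
 * @param array<string, string> $translations segId => translated text (assumed shape)
 */
function rebuildFromSkeleton(array $skeleton, array $translations): string
{
    $output = '';
    foreach ($skeleton['parts'] as $index => $part) {
        $segId = $skeleton['seg_map'][$index] ?? null;
        // Separators and untranslated segments fall back to the original part.
        $output .= ($segId !== null && isset($translations[$segId]))
            ? $translations[$segId]
            : $part;
    }
    return $output;
}

$skeleton = [
    'parts'   => ["Hello.", "\n\n", "Goodbye."],
    'seg_map' => [0 => 'seg-1', 2 => 'seg-2'],
];
$rebuilt = rebuildFromSkeleton($skeleton, ['seg-1' => 'Bonjour.', 'seg-2' => 'Au revoir.']);
```

Keeping the separators inside `parts` means the original blank-line layout survives translation without any extra bookkeeping.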

Limitations

  • No inline markup support — the entire segment is plain text; no InlineCode elements are produced.
  • No sentence-level segmentation — each paragraph is one segment regardless of length. Use catframework/segmentation for sentence splitting.
  • Encoding detection relies on mb_detect_encoding; unusual encodings (e.g. Shift-JIS) are not supported.


Other information

  • Language: PHP
  • License: MIT
  • Last updated: 2026-05-09
