opencat/workflow

Composer install command:

composer require opencat/workflow

Package description

Pipeline orchestration for the OpenCAT Framework — wires filter, segmentation, TM, MT, QA, and XLIFF into a single WorkflowRunner

README

Pipeline orchestration for the OpenCAT Framework.

WorkflowRunner wires filter, segmentation, TM, terminology, MT, QA, and XLIFF output into a single process() call. ProjectWorkflowBuilder constructs a fully configured runner from a ProjectManifest with no manual wiring.

Installation

composer require opencat/workflow

Quick start — from a project manifest

use CatFramework\Project\ProjectLoader;
use CatFramework\Workflow\FileFilterRegistry;
use CatFramework\Workflow\ProjectWorkflowBuilder;
use CatFramework\FilterDocx\DocxFilter;
use CatFramework\FilterPlaintext\PlainTextFilter;

$manifest = ProjectLoader::load('catproject.json');
$registry = new FileFilterRegistry();
$registry->register(new DocxFilter());
$registry->register(new PlainTextFilter());

$runner = (new ProjectWorkflowBuilder($manifest))->build('fr-FR', $registry);
$result = $runner->process('report.docx', 'fr-FR');

echo "Exact TM: {$result->matchStats->exact}" . PHP_EOL;
echo "Fuzzy TM: {$result->matchStats->fuzzy}" . PHP_EOL;
echo "MT filled: {$result->matchStats->mt}"   . PHP_EOL;
echo "XLIFF: {$result->xliffPath}"            . PHP_EOL;

Manual wiring

Build WorkflowRunner directly when you need finer control:

use CatFramework\Workflow\WorkflowRunner;
use CatFramework\Workflow\WorkflowOptions;
use CatFramework\Workflow\FileFilterRegistry;
use CatFramework\Segmentation\SrxSegmentationEngine;
use CatFramework\Xliff\XliffWriter;
use CatFramework\TranslationMemory\SqliteTranslationMemory;
use CatFramework\Mt\DeepL\DeepLAdapter;
use CatFramework\Qa\QualityRunner;

$options = WorkflowOptions::defaults();
$options->mtFillThreshold    = 0.75;   // use MT when best TM match < 75%
$options->autoConfirmThreshold = 1.0;  // auto-lock only exact TM matches
$options->autoWriteToTm      = true;   // feed MT output back into TM
$options->writeXliff         = true;
$options->qaFailOnSeverity   = 'error';

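// $registry, $pdo, $deepLAdapter, and $qaRunner are assumed to be constructed
// earlier (a FileFilterRegistry, a PDO connection, a DeepLAdapter, and a QualityRunner).
$runner = new WorkflowRunner(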
    fileFilterRegistry: $registry,
    segmentationEngine: new SrxSegmentationEngine(),
    xliffWriter: new XliffWriter(),
    sourceLang: 'en-US',
    translationMemory: new SqliteTranslationMemory($pdo),
    mtAdapter: $deepLAdapter,
    qaRunner: $qaRunner,
    options: $options,
);

$result = $runner->process('report.docx', 'fr-FR');

Pipeline steps

WorkflowRunner::process() executes these steps in order:

| Step | What happens |
|------|--------------|
| 1. Extract | FileFilterRegistry selects the correct filter and calls extract() |
| 2. Segment | SrxSegmentationEngine splits multi-sentence structural units into individual sentences |
| 3a. TM lookup | Looks up each segment; auto-locks exact matches; marks fuzzy matches as Draft |
| 3b. Terminology | Calls TerminologyProvider::recognize() for timing and future highlight data |
| 3c. MT fill | For segments below $mtFillThreshold, calls the MT adapter |
| 3d. Persist | Stores each SegmentPair to SegmentStore if configured |
| 3e. TM write-back | If $autoWriteToTm, stores each translated pair back into TM |
| 4. QA | Runs all registered QA checks; throws WorkflowException if $qaFailOnSeverity is hit |
| 5. XLIFF output | Writes {source}.xlf + {source}.xlf.skl to $outputDir (if $writeXliff) |
| 6. Skeleton store | Persists skeleton to SkeletonStore if configured |
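
The QA gate in step 4 can be pictured as a severity comparison. The following is a minimal sketch, not the package's implementation: the severity labels, their ranking, and the "at or above" rule are assumptions made for illustration, and qaGateTripped() is a hypothetical helper rather than part of the OpenCAT API.

```php
<?php
declare(strict_types=1);

// Assumed severity ranking; the real QA package may define different levels.
const SEVERITY_RANK = ['info' => 0, 'warning' => 1, 'error' => 2];

/**
 * Hypothetical helper: returns true when any issue is at or above the
 * configured severity, which is the point where a runner would abort.
 *
 * @param string[] $issueSeverities severity label of each QA issue
 */
function qaGateTripped(array $issueSeverities, string $failOnSeverity): bool
{
    $limit = SEVERITY_RANK[$failOnSeverity];
    foreach ($issueSeverities as $severity) {
        if (SEVERITY_RANK[$severity] >= $limit) {
            return true;
        }
    }
    return false;
}

var_dump(qaGateTripped(['warning', 'info'], 'error'));   // bool(false)
var_dump(qaGateTripped(['warning', 'error'], 'error'));  // bool(true)
```

Under this reading, a run with only warnings would complete when $qaFailOnSeverity is 'error', while a single error-level issue would stop it.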

Progress callback

Get notified after each segment is processed:

$runner->onSegmentProcessed(function ($pair, int $index, int $total) {
    echo "  [{$index}/{$total}] {$pair->source->getPlainText()}" . PHP_EOL;
});
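
For long documents, printing every segment can be noisy. The sketch below builds a callback with the same ($pair, $index, $total) shape that only reports at percentage steps. makePercentReporter() is a hypothetical helper, and it assumes $index counts from 1 up to $total, which the example above suggests but the source does not state.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns a progress callback that reports once per
 * $stepPercent of progress. It ignores $pair, so it does not depend on
 * the SegmentPair API.
 */
function makePercentReporter(int $stepPercent, callable $write): Closure
{
    $lastBucket = 0; // highest percentage bucket already reported

    return function ($pair, int $index, int $total) use ($stepPercent, $write, &$lastBucket): void {
        if ($total <= 0) {
            return;
        }
        $percent = intdiv($index * 100, $total);
        $bucket  = intdiv($percent, $stepPercent);
        if ($bucket > $lastBucket) {
            $lastBucket = $bucket;
            $write("{$percent}% ({$index}/{$total})");
        }
    };
}

// Simulate eight processed segments, reporting every 25%.
$lines  = [];
$report = makePercentReporter(25, function (string $line) use (&$lines): void {
    $lines[] = $line;
});
for ($i = 1; $i <= 8; $i++) {
    $report(null, $i, 8);
}
echo implode(PHP_EOL, $lines) . PHP_EOL;
```

The returned closure could then be passed to onSegmentProcessed() in place of the inline function, with a writer that echoes each line.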

WorkflowResult

process() returns a WorkflowResult:

$result->document;            // BilingualDocument with all segment pairs
$result->qaIssues;            // QualityIssue[]
$result->matchStats->exact;   // count of exact TM matches
$result->matchStats->fuzzy;   // count of fuzzy TM matches
$result->matchStats->mt;      // count of MT-filled segments
$result->matchStats->unmatched; // count with no TM or MT fill
$result->xliffPath;           // path to written XLIFF (null if writeXliff=false)
$result->storeFileId;         // UUID used as key in SegmentStore/SkeletonStore
$result->timings;             // ['extract', 'segment', 'tm', 'terminology', 'mt', 'qa', 'xliff', 'store']
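
The match counters are often easier to read as shares of the whole document. Here is a minimal sketch that works on any object exposing the four counters. matchShares() is a hypothetical helper, and it assumes the four categories are mutually exclusive and together cover every segment, which the source does not confirm.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: converts match counters into percentages.
 *
 * @param object $matchStats any object exposing exact, fuzzy, mt, unmatched
 * @return array<string, float> share of segments per category, in percent
 */
function matchShares(object $matchStats): array
{
    $counts = [
        'exact'     => $matchStats->exact,
        'fuzzy'     => $matchStats->fuzzy,
        'mt'        => $matchStats->mt,
        'unmatched' => $matchStats->unmatched,
    ];
    $total = array_sum($counts);
    if ($total === 0) {
        // Empty document: report 0.0 for every category instead of dividing by zero.
        return array_map(fn () => 0.0, $counts);
    }
    return array_map(fn (int $n) => round($n * 100 / $total, 1), $counts);
}

// Stand-in for $result->matchStats, using made-up counts.
$stats = (object) ['exact' => 12, 'fuzzy' => 5, 'mt' => 2, 'unmatched' => 1];
print_r(matchShares($stats));
```

With a real run, passing $result->matchStats instead of the stand-in object should work as long as the counters are public integer properties.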

FileFilterRegistry

Filters are selected by calling supports() on each registered filter in registration order. The first filter that returns true is used.

$registry = new FileFilterRegistry();
$registry->register(new DocxFilter());
$registry->register(new HtmlFilter());
$registry->register(new PlainTextFilter());  // fallback for .txt

$filter = $registry->getFilter('report.docx');   // returns DocxFilter

getFilter() throws WorkflowException if no filter supports the file.
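
Because selection is first-match, registration order matters when two filters accept the same file. The sketch below illustrates that behaviour with stand-in classes; SketchFilter, ExtensionFilter, and SketchRegistry are hypothetical names, and the real registry throws WorkflowException where this sketch uses RuntimeException.

```php
<?php
declare(strict_types=1);

// Hypothetical minimal contract; the real filter interface has more methods (e.g. extract()).
interface SketchFilter
{
    public function supports(string $path): bool;
}

// Stand-in filter that matches on file extension only.
final class ExtensionFilter implements SketchFilter
{
    public function __construct(public string $name, private array $extensions)
    {
    }

    public function supports(string $path): bool
    {
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        return in_array($ext, $this->extensions, true);
    }
}

final class SketchRegistry
{
    /** @var SketchFilter[] filters in registration order */
    private array $filters = [];

    public function register(SketchFilter $filter): void
    {
        $this->filters[] = $filter;
    }

    public function getFilter(string $path): SketchFilter
    {
        // The first filter whose supports() returns true wins.
        foreach ($this->filters as $filter) {
            if ($filter->supports($path)) {
                return $filter;
            }
        }
        throw new RuntimeException("No filter supports {$path}");
    }
}

$registry = new SketchRegistry();
$registry->register(new ExtensionFilter('docx', ['docx']));
$registry->register(new ExtensionFilter('catch-all', ['txt', 'docx']));

echo $registry->getFilter('report.docx')->name . PHP_EOL; // docx
echo $registry->getFilter('notes.txt')->name . PHP_EOL;   // catch-all
```

Registering the catch-all filter first would make it win for .docx files too, so specific filters are best registered before general ones.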

MT fill threshold

$mtFillThreshold controls when MT kicks in:

  • 0.0 (default) — MT never runs
  • 0.75 — MT runs when the best TM match is below 75%
  • 1.0 — MT fills any segment without an exact TM match
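
The three cases above follow from a single strict comparison. As a minimal sketch, assuming match scores are floats between 0.0 and 1.0 (shouldFillWithMt() is a hypothetical helper, not part of the package):

```php
<?php
declare(strict_types=1);

// Hypothetical helper mirroring the rule above: MT runs only when the
// best TM score is strictly below the threshold.
function shouldFillWithMt(float $bestTmScore, float $mtFillThreshold): bool
{
    return $bestTmScore < $mtFillThreshold;
}

// Print the decision for each documented threshold against sample scores.
foreach ([0.0, 0.75, 1.0] as $threshold) {
    foreach ([0.0, 0.60, 0.80, 1.0] as $score) {
        printf(
            "threshold %.2f, best TM %.2f -> %s\n",
            $threshold,
            $score,
            shouldFillWithMt($score, $threshold) ? 'MT' : 'no MT'
        );
    }
}
```

With a threshold of 0.0 no score can be strictly lower, so MT never runs; with 1.0 every score short of an exact match qualifies.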

Package information

  • Language: PHP
  • License: MIT
  • Last updated: 2026-05-09
