tecnickcom/tc-lib-unicode

Latest stable version: 2.2.0

Composer install command:

composer require tecnickcom/tc-lib-unicode

Package summary

PHP library containing Unicode methods

README

UTF-8 and Unicode processing utilities, including bidirectional text handling.

Sponsor on GitHub

If this project is useful to you, please consider supporting development via GitHub Sponsors.

Overview

tc-lib-unicode provides Unicode conversion helpers and bidirectional algorithm support for robust multilingual text processing.

It targets multilingual text pipelines where normalization, code-point handling, and bidirectional ordering directly affect rendering quality. Isolating the Unicode-heavy operations in one library keeps text processing in dependent libraries accurate and easier to audit.

Namespace: \Com\Tecnick\Unicode
Author: Nicola Asuni <info@tecnick.com>
License: GNU LGPL v3 (see LICENSE)
API docs: https://tcpdf.org/docs/srcdoc/tc-lib-unicode
Packagist: https://packagist.org/packages/tecnickcom/tc-lib-unicode

Features

Unicode Utilities

  • UTF-8 character and ordinal conversion helpers
  • String/character array transformations
  • Integration-ready conversion methods for document engines
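
As a rough illustration of what character and ordinal conversion involves, the sketch below uses only PHP's mbstring functions. The helper names `utf8ToCodepoints()` and `codepointsToUtf8()` are hypothetical and are not part of the tc-lib-unicode API; the library's own conversion classes may behave differently, for example on invalid input.

```php
<?php

/**
 * Split a UTF-8 string into an array of Unicode code points.
 * Hypothetical helper for illustration; not a tc-lib-unicode method.
 *
 * @return array<int, int>
 */
function utf8ToCodepoints(string $text): array
{
    // mb_str_split() yields one array element per UTF-8 character.
    $chars = mb_str_split($text, 1, 'UTF-8');

    return array_map(
        static fn (string $chr): int => mb_ord($chr, 'UTF-8'),
        $chars
    );
}

/**
 * Join an array of Unicode code points back into a UTF-8 string.
 * Hypothetical helper for illustration; not a tc-lib-unicode method.
 *
 * @param array<int, int> $ords
 */
function codepointsToUtf8(array $ords): string
{
    return implode('', array_map(
        static fn (int $ord): string => mb_chr($ord, 'UTF-8'),
        $ords
    ));
}

// "a", "é" (U+00E9) and "€" (U+20AC), written with escapes so the
// example does not depend on the encoding of this source file.
$ords = utf8ToCodepoints("a\u{E9}\u{20AC}");
// $ords === [0x61, 0xE9, 0x20AC]
echo codepointsToUtf8($ords), PHP_EOL;
```

Working on arrays of integers rather than byte strings is what allows later steps, such as reordering or substitution, to treat each character as a single unit.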

Bidirectional Support

  • Unicode Bidirectional Algorithm implementation
  • Right-to-left and mixed-direction text processing
  • Supporting shaping/step logic for complex scripts
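
The first step of the Unicode Bidirectional Algorithm is choosing a paragraph's base direction from its first strong character (rules P2 and P3 of UAX #9). The sketch below approximates that step only. It assumes that testing a few right-to-left scripts through PCRE script properties is close enough for illustration; a real implementation uses the per-character Bidi_Class property instead. `guessBaseDirection()` is a hypothetical name, not a method of this library.

```php
<?php

/**
 * Guess the base direction of a paragraph from its first letter.
 * Simplified approximation of UAX #9 rules P2/P3; hypothetical helper,
 * not part of tc-lib-unicode.
 *
 * @return string 'R' for right-to-left, 'L' for left-to-right
 */
function guessBaseDirection(string $text): string
{
    // Digits, spaces and punctuation are not strong characters,
    // so skip ahead to the first letter.
    if (preg_match('/\p{L}/u', $text, $match) !== 1) {
        return 'L'; // no letter found: default to left-to-right
    }

    // Assumption: only these four scripts are treated as right-to-left.
    $rtl = '/^[\p{Hebrew}\p{Arabic}\p{Syriac}\p{Thaana}]$/u';

    return preg_match($rtl, $match[0]) === 1 ? 'R' : 'L';
}

echo guessBaseDirection("123 \u{05E9}\u{05DC}\u{05D5}\u{05DD} abc"), PHP_EOL; // Hebrew letter comes first
echo guessBaseDirection("abc \u{05D0}"), PHP_EOL;                             // Latin letter comes first
```

Resolving embedding levels, reordering runs, and shaping all build on this base direction, which is why mixed-direction text needs a full algorithm rather than a simple string reversal.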

Character Substitution

  • Context-sensitive codepoint-level substitution via Substitution::replaceChars()
  • Thai: repositions leading vowels (Sara E/AE/O/AI, U+0E40–U+0E44) and Nikhahit (U+0E4D) to follow their base consonant, matching PDF visual-order glyph streams
  • Devanagari: moves the left-positioned matra (U+093F, vowel sign I) to precede its base consonant cluster, including conjuncts joined by Virama (U+094D)
  • Hangul: composes Hangul Jamo sequences (U+1100–U+11FF, U+A960–U+A97F, U+D7B0–U+D7FF) into precomposed syllables (U+AC00–U+D7A3) per Unicode Standard §3.12
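
The Hangul composition mentioned above follows the arithmetic given in Unicode Standard §3.12. The sketch below applies that formula to the modern conjoining jamo only (leading U+1100–U+1112, vowel U+1161–U+1175, trailing U+11A8–U+11C2). `composeHangulSyllable()` and the constant names are hypothetical, and the library's actual implementation may be structured differently.

```php
<?php

// Constants from the Hangul syllable composition arithmetic (Unicode §3.12).
const HANGUL_S_BASE  = 0xAC00; // first precomposed syllable
const HANGUL_L_BASE  = 0x1100; // first leading consonant
const HANGUL_V_BASE  = 0x1161; // first vowel
const HANGUL_T_BASE  = 0x11A7; // one before the first trailing consonant
const HANGUL_L_COUNT = 19;
const HANGUL_V_COUNT = 21;
const HANGUL_T_COUNT = 28;     // 27 trailing consonants plus "no trailing"

/**
 * Compose modern jamo into one precomposed syllable.
 * Hypothetical helper for illustration; returns null if any input is out of range.
 */
function composeHangulSyllable(int $lead, int $vowel, ?int $trail = null): ?int
{
    $l = $lead - HANGUL_L_BASE;
    $v = $vowel - HANGUL_V_BASE;
    $t = $trail === null ? 0 : $trail - HANGUL_T_BASE;

    if ($l < 0 || $l >= HANGUL_L_COUNT || $v < 0 || $v >= HANGUL_V_COUNT) {
        return null;
    }
    // A trailing consonant, when given, must map to index 1..27.
    if ($trail !== null && ($t < 1 || $t >= HANGUL_T_COUNT)) {
        return null;
    }

    return HANGUL_S_BASE + ($l * HANGUL_V_COUNT + $v) * HANGUL_T_COUNT + $t;
}

printf("U+%04X\n", composeHangulSyllable(0x1100, 0x1161, 0x11A8)); // prints "U+AC01"
```

With all three indices at zero the formula yields U+AC00, and the largest indices (18, 20, 27) yield U+D7A3, which matches the precomposed range quoted above.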

Requirements

  • PHP 8.2 or later
  • Extensions: mbstring, pcre
  • Composer
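
These requirements can be checked from a script before installing. The sketch below is one possible approach; the function name `missingRequirements()` and the labels it returns are hypothetical, and Composer runs its own platform checks during installation anyway.

```php
<?php

/**
 * List the requirements that are not met.
 * Hypothetical helper for illustration.
 *
 * @param string             $phpVersion       e.g. PHP_VERSION
 * @param array<int, string> $loadedExtensions e.g. get_loaded_extensions()
 *
 * @return array<int, string> empty when every requirement is satisfied
 */
function missingRequirements(string $phpVersion, array $loadedExtensions): array
{
    $missing = [];

    if (version_compare($phpVersion, '8.2.0', '<')) {
        $missing[] = 'php >= 8.2';
    }

    // Compare extension names case-insensitively.
    $loaded = array_map('strtolower', $loadedExtensions);
    foreach (['mbstring', 'pcre'] as $ext) {
        if (!in_array($ext, $loaded, true)) {
            $missing[] = 'ext-' . $ext;
        }
    }

    return $missing;
}

$missing = missingRequirements(PHP_VERSION, get_loaded_extensions());
echo $missing === [] ? 'OK' : 'Missing: ' . implode(', ', $missing), PHP_EOL;
```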

Installation

composer require tecnickcom/tc-lib-unicode

Quick Start

<?php

require_once __DIR__ . '/vendor/autoload.php';

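// Arguments: input string, character array, code-point array,
// forced base direction ('R' = right-to-left), shaping flag.
$bidi = new \Com\Tecnick\Unicode\Bidi('hello ', null, null, 'R', false);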
echo $bidi->getString();

Character substitution

Substitution::replaceChars() takes an array of Unicode codepoints and returns a transformed array with script-specific substitutions applied. It is a pure codepoint-level transform with no font or PDF dependency.

<?php

require_once __DIR__ . '/vendor/autoload.php';

$sub = new \Com\Tecnick\Unicode\Substitution();

// Thai: leading vowel repositioned after its base consonant
// Logical order:  [U+0E40 SARA E, U+0E01 KO KAI]
// Visual order:   [U+0E01 KO KAI, U+0E40 SARA E]
$result = $sub->replaceChars([0x0E40, 0x0E01]);
// $result === [0x0E01, 0x0E40]

// Devanagari: left matra repositioned before its base consonant cluster
// Logical order:  [U+0915 KA, U+093F VOWEL SIGN I]
// Visual order:   [U+093F VOWEL SIGN I, U+0915 KA]
$result = $sub->replaceChars([0x0915, 0x093F]);
// $result === [0x093F, 0x0915]

// Hangul: Jamo composed into a precomposed syllable
// [U+1100 CHOSEONG KIYEOK, U+1161 JUNGSEONG A, U+11A8 JONGSEONG KIYEOK] → [U+AC01 각]
$result = $sub->replaceChars([0x1100, 0x1161, 0x11A8]);
// $result === [0xAC01]

Supported scripts and Unicode ranges

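| Script      | Unicode range(s)                              | Transformation                                           |
|-------------|-----------------------------------------------|----------------------------------------------------------|
| Thai        | U+0E00–U+0E7F                                 | Leading vowels repositioned after base consonant         |
| Devanagari  | U+0900–U+097F                                 | Left matras repositioned before consonant cluster        |
| Hangul Jamo | U+1100–U+11FF, U+A960–U+A97F, U+D7B0–U+D7FF   | Jamo composed to precomposed syllables (U+AC00–U+D7A3)   |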

Codepoints belonging to unsupported scripts are passed through unchanged.
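
To make the Devanagari reordering and the pass-through behaviour concrete, here is a minimal standalone sketch. It is not the library's implementation: `moveLeftMatras()` is a hypothetical function that assumes consonants are limited to U+0915–U+0939, handles only U+093F, and ignores nukta and other combining marks.

```php
<?php

const DEVA_MATRA_I = 0x093F; // DEVANAGARI VOWEL SIGN I (drawn left of its cluster)
const DEVA_VIRAMA  = 0x094D; // DEVANAGARI SIGN VIRAMA (joins conjunct consonants)

/** Assumption for this sketch: only KA..HA count as consonants. */
function isDevanagariConsonant(int $cp): bool
{
    return $cp >= 0x0915 && $cp <= 0x0939;
}

/**
 * Move each U+093F in front of the consonant cluster it follows.
 * All other code points are copied through unchanged.
 *
 * @param array<int, int> $ords
 *
 * @return array<int, int>
 */
function moveLeftMatras(array $ords): array
{
    $out = [];
    foreach ($ords as $cp) {
        if ($cp !== DEVA_MATRA_I) {
            $out[] = $cp; // pass-through
            continue;
        }

        // Find where the preceding cluster starts in the output so far.
        $start = count($out);
        if ($start > 0 && isDevanagariConsonant($out[$start - 1])) {
            $start--; // base consonant
            // Step back over "consonant + virama" pairs forming a conjunct.
            while (
                $start >= 2
                && $out[$start - 1] === DEVA_VIRAMA
                && isDevanagariConsonant($out[$start - 2])
            ) {
                $start -= 2;
            }
        }

        // With no preceding consonant the matra stays where it was.
        array_splice($out, $start, 0, [$cp]);
    }

    return $out;
}

// Conjunct SA + VIRAMA + THA followed by VOWEL SIGN I:
$result = moveLeftMatras([0x0938, 0x094D, 0x0925, 0x093F]);
// $result === [0x093F, 0x0938, 0x094D, 0x0925]
```

Latin input such as `[0x41, 0x42]` comes back unchanged, which mirrors the pass-through rule stated above.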

Development

make deps
make help
make qa

Packaging

make rpm
make deb

For system packages, bootstrap with:

require_once '/usr/share/php/Com/Tecnick/Unicode/autoload.php';

Contributing

Contributions are welcome. Please review CONTRIBUTING.md, CODE_OF_CONDUCT.md, and SECURITY.md.

Contact

Nicola Asuni - info@tecnick.com

Statistics

  • Total downloads: 696.17k
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 10
  • Clicks: 5
  • Dependents: 2
  • Recommendations: 1

GitHub information

  • Stars: 10
  • Watchers: 2
  • Forks: 7
  • Language: PHP

Other information

  • License: LGPL-3.0-or-later
  • Updated: 2015-09-12
