wikimedia/utfnormal

Latest stable version: 4.0.0

Composer install command:

composer require wikimedia/utfnormal

Package description

Contains Unicode normalization routines, including both pure PHP implementations and automatic use of the 'intl' PHP extension when present

README

utfnormal

utfnormal is a library that contains Unicode normalization routines, including both pure PHP implementations and automatic use of the 'intl' PHP extension when present.

The main function to care about is UtfNormal\Validator::cleanUp(). This will strip illegal UTF-8 sequences and characters that are illegal in XML, and if necessary convert to normalization form C.
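As a minimal sketch of the call described above, assuming the package was installed with Composer and the script sits next to the generated vendor/ directory (the sample input and variable names are made up for illustration):

```php
<?php
// Assumption: Composer's autoloader lives in ./vendor next to this script.
require __DIR__ . '/vendor/autoload.php';

use UtfNormal\Validator;

// Hypothetical untrusted input: "e" followed by a combining acute accent
// (so the text is not in NFC), then the malformed byte pair 0xC3 0x28.
$untrusted = "caf" . "e\u{0301}" . " \xC3\x28";

// cleanUp() handles both problems in one pass: illegal sequences are dealt
// with and the result is converted to NFC if necessary.
$clean = Validator::cleanUp($untrusted);

// Check that the result is well-formed UTF-8. PCRE's /u modifier rejects
// malformed input, so a return value of 1 means the string is valid.
var_dump(preg_match('//u', $clean) === 1);
```

This is the call to reach for with anything that comes from users, uploads, or other systems you do not control.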

If you know the string is already valid UTF-8, you can directly call UtfNormal\Validator::toNFC(), toNFD(), toNFKC(), or toNFKD(); this will convert a given UTF-8 string to Normalization Form C, D, KC, or KD if it's not already such. These functions assume that the input string is already valid UTF-8; if there are corrupt characters this may produce erroneous results.
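One way to pick between the two entry points is to test validity first. The sketch below is only an illustration: normalizeText() is a hypothetical helper, not part of the library, and it assumes a Composer install with vendor/ next to the script.

```php
<?php
// Assumption: Composer's autoloader lives in ./vendor next to this script.
require __DIR__ . '/vendor/autoload.php';

use UtfNormal\Validator;

/**
 * Hypothetical helper: use the cheaper toNFC() path only when the input
 * is known to be well-formed UTF-8, otherwise fall back to cleanUp().
 */
function normalizeText(string $input): string
{
    // preg_match() with the /u modifier returns 1 for an empty pattern on
    // valid UTF-8 and false on malformed input.
    if (preg_match('//u', $input) === 1) {
        return Validator::toNFC($input);
    }

    // Malformed input: let cleanUp() repair it before normalizing.
    return Validator::cleanUp($input);
}

$valid   = normalizeText("e\u{0301}");  // valid UTF-8, not yet NFC
$invalid = normalizeText("abc\xC3\x28"); // contains a malformed sequence
```

Whether the pre-check pays off depends on the workload; calling cleanUp() unconditionally is the simpler and safer default.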

Performance is kind of stinky in absolute terms, though it should be speedy on pure ASCII text. ;) On text that can be determined quickly to already be in NFC it's not too awful but it can quickly get uncomfortably slow, particularly for Korean text (the hangul decomposition/composition code is extra slow).
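To see how this plays out on your own machine, a rough timing loop is enough. This is a sketch, not the project's benchmark suite: timeNormalization() and the sample strings are made up here, and the numbers will vary widely depending on whether the intl extension is loaded.

```php
<?php
// Assumption: Composer's autoloader lives in ./vendor next to this script.
require __DIR__ . '/vendor/autoload.php';

use UtfNormal\Validator;

/**
 * Hypothetical helper: time repeated toNFC() calls on one string and
 * return the elapsed wall-clock time in milliseconds.
 */
function timeNormalization(string $label, string $text, int $rounds = 100): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $rounds; $i++) {
        Validator::toNFC($text);
    }
    $elapsedMs = (hrtime(true) - $start) / 1e6;

    printf("%-8s %10.2f ms for %d rounds\n", $label, $elapsedMs, $rounds);
    return $elapsedMs;
}

// Illustrative inputs only: plain ASCII versus precomposed hangul syllables.
$ascii  = str_repeat('The quick brown fox. ', 500);
$hangul = str_repeat('한국어 위키백과 ', 500);

timeNormalization('ascii', $ascii);
timeNormalization('hangul', $hangul);
```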

Bugs should be filed in Wikimedia's Phabricator under the "utfnormal" project.

Regenerating data tables

UtfNormalData.inc and UtfNormalDataK.inc are generated from the Unicode Character Database by the script "generate.php". Run "composer generate" to rebuild the tables. To fetch updated Unicode data from the internet, run "composer generate -- --fetch".

Testing

Running "composer test" will run a syntax checker, PHPUnit conformance tests, and run some benchmarks using sample texts from Wikipedia. Take all benchmark numbers with large grains of salt.

PHP module extension

If the 'intl' PHP extension is present, ICU library functions are used which are MUCH faster than doing this work in pure PHP code.

It is strongly recommended to enable this module if possible: http://php.net/manual/en/intro.intl.php
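To check which situation a given PHP installation is in, you can probe for the extension directly. The sketch below uses only PHP's own functions and the Normalizer class that intl provides; hasIntlNormalizer() is a hypothetical helper name and is not how the library itself performs the detection.

```php
<?php
/**
 * Hypothetical helper: report whether the intl extension and its
 * Normalizer class are available in this PHP runtime.
 */
function hasIntlNormalizer(): bool
{
    return extension_loaded('intl') && class_exists('Normalizer');
}

$nfc = null;

if (hasIntlNormalizer()) {
    // Normalizer::normalize() is provided by intl and backed by ICU.
    // "e" plus a combining acute accent composes to U+00E9 in NFC.
    $nfc = Normalizer::normalize("e\u{0301}", Normalizer::FORM_C);
    echo 'intl is loaded; NFC bytes: ', bin2hex($nfc), PHP_EOL; // c3a9
} else {
    echo 'intl is not loaded; normalization would run in pure PHP', PHP_EOL;
}
```

Running this once from the command line and once through the web server is worthwhile, since the two often load different php.ini files.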

Older versions of this library supported a one-off custom PHP extension, which has been dropped. If you were using this, please migrate to the intl extension.

History

This library was first introduced in MediaWiki 1.3 (r4965). It was split out of the MediaWiki codebase and published as an independent library during the MediaWiki 1.25 development cycle.

Statistics

  • Total downloads: 5.5M
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 9
  • Views: 5
  • Dependents: 5
  • Suggesters: 0

GitHub information

  • Stars: 9
  • Watchers: 14
  • Forks: 2
  • Language: PHP

Other information

  • License: GPL-2.0-or-later
  • Last updated: 2026-01-04
