interitty/tokenizer

Latest stable version: v1.0.10

Composer install command:

composer require interitty/tokenizer

Package description

Use regular expressions to split a given string into tokens.

README

Use regular expressions to split a given string into tokens.

Requirements

Installation

The best way to install interitty/tokenizer is using Composer:

composer require interitty/tokenizer

Tokenizer usage

Tokenization needs two things: the string to be tokenized and a map that pairs each token type with the regular expression that matches it. A simple tokenizer that separates a string into numbers, whitespace, and letters can look like the following code.

$tokenizer = new Tokenizer('say 123');
$tokenizer->map = [
    'number' => '~^\d+~',
    'whitespace' => '~^\s+~',
    'string' => '~^\w+~'
];
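
To make the role of the map concrete, here is a minimal sketch of how an ordered map of anchored patterns can split a string. It is plain PHP and does not use the library; the function name sketchTokenize and the [type, value] pair format are assumptions made for illustration, not the package's API.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of interitty/tokenizer): split $input into
 * [type, value] pairs by trying each anchored pattern on the remaining input.
 *
 * @param array<string, string> $map token type => regex anchored with ^
 * @return list<array{0: string, 1: string}>
 */
function sketchTokenize(string $input, array $map): array
{
    $tokens = [];
    while ($input !== '') {
        foreach ($map as $type => $pattern) {
            // The first pattern that matches a non-empty prefix wins.
            if (preg_match($pattern, $input, $match) === 1 && $match[0] !== '') {
                $tokens[] = [$type, $match[0]];
                $input = substr($input, strlen($match[0]));
                continue 2; // restart the while loop on the remaining input
            }
        }
        throw new RuntimeException('Unexpected input: ' . $input);
    }
    return $tokens;
}

$tokens = sketchTokenize('say 123', [
    'number' => '~^\d+~',
    'whitespace' => '~^\s+~',
    'string' => '~^\w+~',
]);
```

In this sketch the order of the map matters: \w also matches digits, so the number pattern is listed before the string pattern.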

Processing the tokens

Tokens can be read by calling next repeatedly until a token of type Token::TOKEN_END appears. The current method returns the token most recently returned by next.

$tokens = [];
do {
    $token = $tokenizer->next();
    $tokens[] = $token;

    assert($token === $tokenizer->current());
} while ($token->getType() !== Token::TOKEN_END);

The resulting array of $tokens would look like the following, with the final TOKEN_END token appended after these entries.

[
    new Token('string', 'say', 1, 1),
    new Token('whitespace', ' ', 1, 4),
    new Token('number', '123', 1, 5),
]
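
The two trailing numbers in each Token above appear to be a 1-based line and column, although the README does not say so explicitly. Under that assumption, the following sketch shows how such a position could be derived from a byte offset. The helper name sketchPosition is hypothetical and is not part of the library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: convert a byte offset into a 1-based [line, column] pair.
 *
 * @return array{0: int, 1: int}
 */
function sketchPosition(string $input, int $offset): array
{
    $before = substr($input, 0, $offset);
    $line = substr_count($before, "\n") + 1;
    $lastNewline = strrpos($before, "\n");
    // Columns count from the character after the last newline (or the string start).
    $column = $lastNewline === false ? $offset + 1 : $offset - $lastNewline;
    return [$line, $column];
}

$positions = [
    sketchPosition('say 123', 0), // start of 'say'
    sketchPosition('say 123', 3), // the space
    sketchPosition('say 123', 4), // start of '123'
];
```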

Skipping unnecessary tokens

In some cases it is useful to skip certain tokens automatically and move on to the next one. The addSkippedTokenType and setSkippedTokenTypes methods exist for this purpose. The TOKEN_END token cannot be skipped.

$tokenizer->addSkippedTokenType('whitespace');

$string = '';
do {
    $token = $tokenizer->next();
    $string .= $token->getValue();
} while ($token->getType() !== Token::TOKEN_END);
assert('say123' === $string);
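
The skipping rule can be pictured as a filter over the token stream. The sketch below works on plain [type, value] pairs rather than the library's Token objects; the name sketchSkip and the 'end' marker standing in for TOKEN_END are assumptions for illustration only.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: drop tokens whose type is listed in $skipped.
 * Tokens are plain [type, value] pairs; 'end' stands in for TOKEN_END.
 *
 * @param list<array{0: string, 1: string}> $tokens
 * @param list<string> $skipped
 * @return list<array{0: string, 1: string}>
 */
function sketchSkip(array $tokens, array $skipped): array
{
    $kept = [];
    foreach ($tokens as [$type, $value]) {
        // The end marker is always kept, even if it is listed in $skipped.
        if ($type !== 'end' && in_array($type, $skipped, true)) {
            continue;
        }
        $kept[] = [$type, $value];
    }
    return $kept;
}

$kept = sketchSkip(
    [['string', 'say'], ['whitespace', ' '], ['number', '123'], ['end', '']],
    ['whitespace', 'end']
);
$joined = implode('', array_column($kept, 1));
```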

Expecting tokens

The tokenizer includes a helper to expect the correct token type and value. This can simplify and unify the checking process.

$tokenizer = new Tokenizer('{some code}');
$tokenizer->map = [
    'brackets' => '~^[{}]~',
    'code' => '~^[^{}]+~',
];

$tokenizer->expect($tokenizer->next(), 'brackets', '{');
$tokenizer->expect($tokenizer->next(), 'code');

$code = $tokenizer->current()->getValue();

$tokenizer->expect($tokenizer->next(), 'brackets', '}');
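
The README does not describe what expect does on a mismatch. As a rough illustration only, the sketch below assumes that a mismatch raises an exception and that the value check is optional. The helper sketchExpect and the [type, value] pair format are hypothetical and do not reflect the library's implementation.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: verify a [type, value] token and return it unchanged.
 *
 * @param array{0: string, 1: string} $token
 * @return array{0: string, 1: string}
 */
function sketchExpect(array $token, string $type, ?string $value = null): array
{
    [$actualType, $actualValue] = $token;
    // The value is compared only when the caller supplies one.
    if ($actualType !== $type || ($value !== null && $actualValue !== $value)) {
        throw new UnexpectedValueException(
            sprintf('Expected %s, got %s "%s"', $type, $actualType, $actualValue)
        );
    }
    return $token;
}

$stream = [['brackets', '{'], ['code', 'some code'], ['brackets', '}']];
sketchExpect($stream[0], 'brackets', '{');
$code = sketchExpect($stream[1], 'code')[1];
sketchExpect($stream[2], 'brackets', '}');

// A token of the wrong type is rejected.
$failed = false;
try {
    sketchExpect($stream[2], 'code');
} catch (UnexpectedValueException $e) {
    $failed = true;
}
```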

BaseTokenizerParser usage

One way to use the Tokenizer is through BaseTokenizerParser, which parses a given string into a stream of tokens. It is useful for validating that a string conforms to an expected grammar and for parsing it into a structured array.

This functionality is used in interitty/pacc.

BaseParser usage

When a custom Tokenizer implementation is needed, the BaseParser abstract class allows implementing custom logic for working with the current and next Token, as well as a custom mechanism for handling tokenType and tokenLexeme.

Statistics

  • Total downloads: 52
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 2
  • Clicks: 3
  • Dependents: 1
  • Suggesters: 0

GitHub information

  • Stars: 1
  • Watchers: 0
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Updated: 2022-08-26
