flo-labs/dms
Latest stable version: 0.5.1
Composer install command:
composer require flo-labs/dms
Package description
PHP parser for DMS — a data syntax with strong typing, ordered maps, multi-line heredocs, and front-matter metadata.
dms-php
PHP parser for DMS, a data syntax
with strong typing, ordered maps, multi-line heredocs, and front-matter
metadata. This is the pure-PHP port: no extensions beyond ext-intl,
no FFI, no native build step. Seven other ports (Rust, C, Zig, Go,
Python, JavaScript, Perl) check against the same fixture corpus, so a
document that parses here parses identically everywhere.
What DMS looks like
A medium-size tier-0 document, exercising every feature you'd touch in
a real config — front matter, comments (line + trailing), nested
tables, list-of-tables with the + marker, flow forms, distinct types,
and a heredoc with a trim modifier:
+++
title: "DMS feature tour"
version: "1.0.0"
updated: 2026-04-24T09:30:00-04:00
+++
# Hash and // line comments both work.
// Bare keys allow full Unicode; quoted keys take any string.
database:
  host: "db.internal"
  port: 5432 # bumped after the LB change
  pool: { size: 10, idle_timeout_s: 30 } # flow table
servers:
  + name: "web1"
    disks:
      + mount: "/"
        size_gb: 100
      + mount: "/var"
        size_gb: 500
  + name: "web2"
regions: ["us-east-1", "eu-west-1", "ap-south-1"]
sql: """SQL _trim("\n", ">")
SELECT id, email
FROM users
WHERE active = true
SQL
Tier 1 layers structured decorators on top of the value tree. Sigils
bind to families published by a dialect; here is dms+html carrying
an HTML fragment as a DMS document:
+++
_dms_tier: 1
_dms_imports:
  + dialect: "html"
    version: "1.0.0"
+++
+ |html(lang: "en")
  + |head
    + |title "DMS feature tour"
    + |meta(charset: "UTF-8")
  + |body(class: "main")
    + |h1 "Welcome to DMS"
    + |p(class: "lede")
      + "Click "
      + |a(href: "/spec.html") "here"
      + " to read the spec."
Full feature tour, format comparison, and dialect index on the DMS website.
Install
composer require flo-labs/dms
Or pin in composer.json:
{
    "require": {
        "flo-labs/dms": "^0.5"
    }
}
The package is pure PHP: composer install runs no build step. A
companion FFI binding (flo-labs/dms-c) wraps the C parser for hot
paths. It has the same public API and value shape and is ~7× faster
on large documents. Use the pure package on shared hosting or wherever
php_ffi is unavailable.
Quick start
<?php
require 'vendor/autoload.php';
use Dms\Parser;
use Dms\Emitter;
$src = file_get_contents('config.dms');
// Full document — preserves comments + literal forms for round-trip emit.
$doc = Parser::decode($src);
$meta = $doc->meta; // Dms\Table | null
$body = $doc->body; // Dms\Table | array | scalar | datetime wrapper
$comments = $doc->comments; // list<Dms\AttachedComment>
$originalForms = $doc->originalForms; // list<[path, Dms\OriginalLiteral]>
// Read a deep value (Table implements ArrayAccess).
$port = $doc->body['database']['port'];
// Document is immutable — clone with a new body via withBody().
$body = $doc->body;
$body['database']['port'] = 5432;
$doc = $doc->withBody($body);
// Re-emit DMS source.
echo Emitter::encode($doc);
Front-matter-only decode
For callers that need only the document's metadata — config loaders
checking _dms_tier, indexers harvesting user keys, dispatchers
choosing a downstream decoder — Parser::decodeFrontMatter parses the
+++ ... +++ block and stops, leaving body bytes untokenized. SPEC
tier 0 requires this entry point.
$meta = Parser::decodeFrontMatter($src);
if ($meta === null) {
    // No `+++` block. Empty array means present-but-empty FM.
} else {
    $title = $meta['title'] ?? null;
}
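The dispatcher use case mentioned above can be sketched as a pure function over the value that Parser::decodeFrontMatter returns. The function name chooseDecoder and the labels it returns are hypothetical, and treating a missing _dms_tier key as tier 0 is an assumption rather than something this README states.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: pick a downstream decoder from front-matter metadata.
 *
 * $meta has the shape described above: null when the source has no `+++`
 * block, an array (possibly empty) otherwise.
 * Assumption: a missing `_dms_tier` key means tier 0.
 */
function chooseDecoder(?array $meta): string
{
    // `??` also covers the null case, so no separate null check is needed.
    $tier = $meta['_dms_tier'] ?? 0;

    return match (true) {
        $tier === 0 => 'tier0',
        $tier === 1 => 'tier1',
        default => throw new \UnexpectedValueException(
            'unsupported _dms_tier: ' . var_export($tier, true)
        ),
    };
}
```

A caller would pass chooseDecoder(Parser::decodeFrontMatter($src)) and then hand $src to Parser::decodeLite or Tier1Decoder::decode depending on the label, so the body is only tokenized once the tier is known.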
Public API
PSR-4 autoload root Dms\ → src/. Everything below ships in the
flo-labs/dms package.
Tier-0 entry points
| Symbol | Purpose |
|---|---|
| Dms\Parser::decode(string): Document | Full document, preserves comments + original forms for round-trip |
| Dms\Parser::decodeLite(string): Document | Skip comment / original-form bookkeeping; faster, no round-trip |
| Dms\Parser::decodeUnordered(string): Document | Tables backed by Dms\UnorderedTable (HashMap-style, no order) |
| Dms\Parser::decodeLiteUnordered(string): Document | Combined: lite + unordered |
| Dms\Parser::decodeFrontMatter(string): ?array | FM-only; stops at closing +++ |
| Dms\Parser::isBareKey(string): bool | UAX #31 bare-key membership check |
| Dms\Emitter::encode(Document): string | Round-trip emit; throws EncodeException on UnorderedTable |
| Dms\Emitter::encodeLite(Document): string | Canonical emit; accepts unordered, never throws |
| Dms\TaggedJsonEncoder::encode(Document): string | Conformance-suite tagged-JSON encoder (not DMS source) |
The deprecated 0.2.x names Parser::parse / Parser::parseLite /
Emitter::toDms / Emitter::toDmsLite still work and emit
E_USER_DEPRECATED; they will be removed in a future release.
Tier-1 (decorators + dialects)
| Symbol | Purpose |
|---|---|
| Dms\Tier1Decoder::decode(string): Tier1Document | Decode a tier-1 source into body + decorators sidecar |
| Dms\Tier1JsonEncoder | Tagged-JSON for tier-1 conformance fixtures |
Tier1Document carries tier, imports (ImportSpec[]), body
(tier-0 value tree), and decorators (DecoratorEntry[] keyed by
breadcrumb path).
Value types
| Class | Role |
|---|---|
| Dms\Document | Root container: meta, body, comments, originalForms |
| Dms\Table | Insertion-ordered map; ArrayAccess + IteratorAggregate + Countable |
| Dms\UnorderedTable | HashMap-style map; emitted only via encodeLite |
| Dms\LocalDate | YYYY-MM-DD |
| Dms\LocalTime | HH:MM:SS[.fff] |
| Dms\LocalDateTime | Date + time, no offset |
| Dms\OffsetDateTime | Date + time + zone offset |
| Dms\ValueType | Enum-style helper for typed dispatch |
| Dms\Capabilities | Reports compile-time toggles |
Comments + round-trip metadata
| Class | Role |
|---|---|
| Dms\Comment | Single comment text + kind (line / block) |
| Dms\AttachedComment | Comment + position (leading / inner / trailing / floating) + path |
| Dms\OriginalLiteral | Per-node literal form override (heredoc flavor, integer base, …) |
| Dms\StringForm | String-shape descriptor used by OriginalLiteral |
| Dms\HeredocFlavor | basicTriple / literalTriple |
| Dms\HeredocModifierCall | _trim(...), _fold_paragraphs(), etc. |
Errors
| Class | Thrown by |
|---|---|
| Dms\DecodeException | Parser::decode* on malformed source; carries dmsLine / dmsColumn |
| Dms\EncodeException | Emitter::encode when source is not safely re-emittable in full mode |
| Dms\ParseError | Deprecated 0.2.x alias; subclass of DecodeException |
Value / type mapping
| DMS type | PHP value |
|---|---|
| bool | bool |
| integer | int (64-bit on x64; matches the i64 SPEC range) |
| float | float |
| string | string (UTF-8, NFC-normalized) |
| local-date | Dms\LocalDate |
| local-time | Dms\LocalTime |
| local-datetime | Dms\LocalDateTime |
| offset-datetime | Dms\OffsetDateTime |
| table | Dms\Table (ordered) or Dms\UnorderedTable (with decodeUnordered) |
| list | array (zero-indexed; array_is_list($v) === true) |
Datetime variants carry the SPEC-validated source lexeme as a string,
so callers never re-parse to inspect them. Table distinguishes
itself from a list at the type level — the emitter does an O(1)
instanceof instead of array_is_list() on every node.
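The point about type-level distinction can be illustrated without the library. If tables were plain PHP arrays, an empty table and an empty list would be indistinguishable, because array_is_list([]) is true. The sketch below uses a hypothetical stand-in class, SketchTable, rather than the real Dms\Table, and describeNode is likewise an illustrative name.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for an ordered-map class; not the library's Dms\Table. */
final class SketchTable implements \ArrayAccess
{
    /** @var array<string|int, mixed> */
    private array $items = [];

    public function offsetExists(mixed $key): bool
    {
        return array_key_exists($key, $this->items);
    }

    public function offsetGet(mixed $key): mixed
    {
        return $this->items[$key] ?? null;
    }

    public function offsetSet(mixed $key, mixed $value): void
    {
        $this->items[$key] = $value;
    }

    public function offsetUnset(mixed $key): void
    {
        unset($this->items[$key]);
    }
}

/** Classify a node with a constant-time type check, never by scanning keys. */
function describeNode(mixed $node): string
{
    if ($node instanceof SketchTable) {
        return 'table';
    }
    if (is_array($node)) {
        return 'list';
    }
    return 'scalar';
}
```

With this shape, an empty SketchTable is still reported as a table, while [] is reported as a list, which plain arrays alone could not express.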
Error handling
DecodeException extends \RuntimeException. The constructor injects
the line:column: prefix into the message, and the typed location is
exposed as readonly properties:
use Dms\Parser;
use Dms\DecodeException;
try {
    $doc = Parser::decode($src);
} catch (DecodeException $e) {
    fwrite(STDERR, sprintf(
        "decode failed at %d:%d - %s\n",
        $e->dmsLine,
        $e->dmsColumn,
        $e->getMessage(),
    ));
}
$e->line / $e->column accessor synonyms also work (PHP's
\Exception already owns a non-readonly $line, so the canonical
fields are dmsLine / dmsColumn).
EncodeException extends \RuntimeException and is currently raised
by Emitter::encode when the Document carries an UnorderedTable
(no stable key order ⇒ no round-trip). Use Emitter::encodeLite for
that case — lite emit accepts unordered input and never throws.
Catching \RuntimeException (or \Throwable) handles both decode and
encode failures in one block.
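The exception shape described in this section can be sketched with stand-in classes. SketchDecodeException, SketchEncodeException, and describeFailure are hypothetical names, not the library's classes, and the exact prefix spacing ("line:column: ") is an assumption.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in mirroring the shape described above; not the library class. */
final class SketchDecodeException extends \RuntimeException
{
    public function __construct(
        string $detail,
        public readonly int $dmsLine,
        public readonly int $dmsColumn,
    ) {
        // Inject the "line:column: " prefix into the message (assumed spacing).
        parent::__construct(sprintf('%d:%d: %s', $dmsLine, $dmsColumn, $detail));
    }
}

/** Hypothetical stand-in for an encode-side failure. */
final class SketchEncodeException extends \RuntimeException
{
}

/** Run a callable and report a decode or encode failure as a string, or null on success. */
function describeFailure(callable $work): ?string
{
    try {
        $work();
        return null;
    } catch (\RuntimeException $e) {
        // One catch block covers both types because they share a parent class.
        return $e::class . ': ' . $e->getMessage();
    }
}
```

The typed location stays available on the exception object, so callers that need structured data read dmsLine and dmsColumn instead of parsing the message prefix.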
PHP version compatibility
require.php >= 8.1. The codebase uses readonly properties and
intersection types (PHP 8.1) alongside named arguments and mixed
(PHP 8.0). ext-intl is required for NFC normalization of source bytes.
Tested against PHP 8.1, 8.2, 8.3, 8.4. Minimum-version bumps will be called out in release notes.
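To see why NFC normalization needs ext-intl, the short sketch below normalizes a decomposed "é" with the extension's Normalizer class. It only demonstrates the extension call; whether the parser invokes Normalizer in exactly this way is an assumption.

```php
<?php
declare(strict_types=1);

// "é" written as a base letter plus a combining acute accent (two code points).
$decomposed = "e\u{0301}";
// "é" as a single precomposed code point.
$precomposed = "\u{00E9}";

// The two spellings differ byte-for-byte until they are normalized.
$equalBefore = ($decomposed === $precomposed);

// Normalizer is provided by ext-intl; FORM_C is NFC.
$normalized = \Normalizer::normalize($decomposed, \Normalizer::FORM_C);
$equalAfter = ($normalized === $precomposed);
```

After normalization both spellings compare equal, which is what lets two visually identical keys or strings match regardless of how the source file encoded them.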
Working with comments and heredocs
DMS preserves comments through decode → mutate → re-emit (SPEC
§Comments). Dms\Document is immutable: its meta, body,
comments, and originalForms fields are readonly. To attach a
comment programmatically, build the new comments list and use the
withComments clone helper (PSR-7 style):
use Dms\Parser;
use Dms\Emitter;
use Dms\AttachedComment;
use Dms\Comment;
$doc = Parser::decode("db:\n port: 8080\n");
$comments = [...$doc->comments,
    new AttachedComment(
        comment: new Comment(content: '# bumped after LB change', kind: 'line'),
        position: 'leading',
        path: ['db', 'port'],
    ),
];
echo Emitter::encode($doc->withComments($comments));
Forcing a heredoc on emit
Strings parse and re-emit in their source form. To switch a
basic-quoted string to a heredoc (or to construct one from scratch),
append an OriginalLiteral::string(...) record keyed by the value's
path and swap it in via withOriginalForms:
use Dms\OriginalLiteral;
use Dms\StringForm;
use Dms\HeredocFlavor;
$forms = [...$doc->originalForms,
    [
        ['db', 'greeting'],
        OriginalLiteral::string(StringForm::heredoc(
            HeredocFlavor::basicTriple(), // or ::literalTriple() for '''
            null, // null = unlabeled
            [], // _trim(...), _fold_paragraphs(), …
        )),
    ],
];
$doc = $doc->withOriginalForms($forms);
Round-trip rules (SPEC §Round-trip semantics): comments stick to
still-present nodes; deleting a node drops its comments; newly
inserted nodes start with no comments. The first originalForms
entry per path wins, so override a parser-recorded form by replacing
rather than appending if the key is already present.
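Because the first entry per path wins, a small replace-or-append helper avoids appending an override that would be silently ignored. The sketch below works on the [path, literal] pair shape shown above. The name upsertForm is hypothetical, and it assumes paths are plain arrays that can be compared with ===.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: set the form for $path in a list of [path, literal]
 * pairs, replacing an existing entry in place or appending when absent.
 *
 * @param list<array{0: array, 1: mixed}> $forms
 * @param array $path Breadcrumb path, e.g. ['db', 'greeting'] (assumed comparable with ===).
 * @return list<array{0: array, 1: mixed}>
 */
function upsertForm(array $forms, array $path, mixed $literal): array
{
    foreach ($forms as $i => $entry) {
        if ($entry[0] === $path) {
            // Replace in place so this entry remains the first one for its path.
            $forms[$i] = [$path, $literal];
            return $forms;
        }
    }

    // No existing entry for this path: appending is safe.
    $forms[] = [$path, $literal];
    return $forms;
}
```

The result can be passed to withOriginalForms in place of the spread-and-append list, for example $doc->withOriginalForms(upsertForm($doc->originalForms, ['db', 'greeting'], $literal)).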
When to use which decoder
| Goal | Entry point |
|---|---|
| Read config, no re-emit | Parser::decodeLite |
| Read + re-emit, preserving comments / heredoc form | Parser::decode + Emitter::encode |
| Read only the FM block (dispatch, schema check, index) | Parser::decodeFrontMatter |
| Tier-1 source (decorators, dialect imports) | Tier1Decoder::decode |
| Speed over round-trip fidelity | Parser::decodeLite |
| Don't care about table order | Parser::decodeUnordered |
| Combined fast-path (no round-trip + no order tracking) | Parser::decodeLiteUnordered |
Conformance
The fixture corpus lives in dms-tests (4500+ pairs). Clone it once as a sibling:
cd ..
git clone https://gitlab.com/flo-labs/pub/dms-tests.git
The bin/dms-encoder binary (shipped in this package's bin) reads
DMS from stdin and writes tagged JSON to stdout, matching the format
the conformance runner consumes:
composer install
python3 ../dms-tests/run_conformance.py vendor/bin/dms-encoder
Behavioural drift between ports is caught at the conformance gate, not at runtime.
Build & test
composer install # pulls phpunit, symfony/yaml, phpbench, yosymfony/toml
vendor/bin/phpunit # full test suite
vendor/bin/phpbench run # decoder benchmarks
Companion projects
| Package | Purpose |
|---|---|
| flo-labs/dms-c | php_ffi binding to the C parser; ~7× faster, same API |
| dms-rs | Canonical Rust reference parser |
| dms-py | Python reference port |
| dms-tests | Cross-language fixture corpus (4500+ pairs) |
| DMS website | Spec, format comparison, dialect index |
SPEC compliance
Every feature in SPEC.md and TIER1.md is implemented and exercised by the dms-tests corpus.
License
Dual-licensed at your option:
Other information
- License: MIT
- Updated: 2026-05-05
