wp-php-toolkit/xml

Latest stable version: v0.7.5

Composer install command:

composer require wp-php-toolkit/xml

Package description

XML component for WordPress.

README

slug: xml
title: XML
install: wp-php-toolkit/xml
see_also:

  • dataliberation | DataLiberation | Read and write WXR-sized WordPress exports as entities.
  • encoding | Encoding | Validate and scrub text before strict XML processing.
  • bytestream | ByteStream | Keep large XML reads incremental.

A streaming, namespace-aware XML processor in pure PHP. Read and modify huge feeds, WXR exports, ePub manifests, and Office Open XML parts without ever loading the document into memory and without depending on libxml2.

Why this exists

SimpleXMLElement and DOMDocument both need libxml2 and both build a complete in-memory tree. XMLProcessor walks the document forward as a cursor, keeps modifications in a side buffer, and emits the full updated XML with get_updated_xml() only when you ask for it.

This design came from WordPress-scale documents such as WXR exports. A migration may only need to rewrite wp:attachment_url values or bump a feed attribute, so the processor optimizes for targeted cursor edits instead of a full validating XML stack.

Footgun: Namespace-aware methods use the namespace URI, not the prefix written in the tag. In WXR, get_attribute( 'wp', 'status' ) looks for a namespace literally named wp; for the usual WXR declaration you want get_attribute( 'http://wordpress.org/export/1.2/', 'status' ).

Footgun: In streaming mode next_tag() can return false because input ran out, not because the document ended. Check is_paused_at_incomplete_input() before assuming you're done.
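
The streaming footgun is easier to see in a loop. What follows is a minimal sketch, not the library's documented recipe: create_for_streaming() and append_bytes() do not appear in this README and are assumed here from the toolkit's source, so check them against your installed version. The $chunks array is a hypothetical stand-in for reads from a file or socket, split mid-tag on purpose.

```php
<?php
require '/wordpress/wp-content/php-toolkit/vendor/autoload.php';

use WordPress\XML\XMLProcessor;

// Simulated incremental input; the split deliberately lands inside a tag.
$chunks = array(
	'<catalog><book sku="A1"/><bo',
	'ok sku="A2"/></catalog>',
);

// ASSUMPTION: create_for_streaming() and append_bytes() are not shown in
// this README. Verify both names and signatures before relying on them.
$p    = XMLProcessor::create_for_streaming( array_shift( $chunks ) );
$skus = array();

while ( true ) {
	if ( $p->next_tag( 'book' ) ) {
		$skus[] = $p->get_attribute( '', 'sku' );
		continue;
	}
	// false + paused means "feed me more bytes", not "end of document".
	if ( $p->is_paused_at_incomplete_input() && $chunks ) {
		$p->append_bytes( array_shift( $chunks ) );
		continue;
	}
	break; // End of document, a parse error, or no more input to feed.
}

echo implode( ',', $skus ) . "\n";
```

The design point is that the false branch is split in two: one path refills the buffer and retries, the other gives up. A loop written as while ( $p->next_tag( 'book' ) ) alone would stop at the first chunk boundary.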

Bump every price in a catalog

Find each <book>, read its price, write a new one, emit the updated document.

<?php
require '/wordpress/wp-content/php-toolkit/vendor/autoload.php';

use WordPress\XML\XMLProcessor;

$xml = <<<'XML'
<catalog>
<book sku="A1" price="29.99"><title>PHP Internals</title></book>
<book sku="A2" price="14.50"><title>WordPress at Scale</title></book>
</catalog>
XML;

$p = XMLProcessor::create_from_string( $xml );
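while ( $p->next_tag( 'book' ) ) {
	// '' selects attributes with no namespace, i.e. plain unprefixed ones.
	$old = (float) $p->get_attribute( '', 'price' );
	// Raise by 10% and keep exactly two decimals, no thousands separator.
	$new = number_format( $old * 1.10, 2, '.', '' );
	$p->set_attribute( '', 'price', $new );
}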

echo $p->get_updated_xml();

Output:

<catalog>
<book sku="A1" price="32.99"><title>PHP Internals</title></book>
<book sku="A2" price="15.95"><title>WordPress at Scale</title></book>
</catalog>
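
The price arithmetic can be checked on its own, without the processor. The sketch below uses a hypothetical helper, bump_price(), which is not part of the toolkit; it only isolates the cast-scale-format step used in the loop.

```php
<?php
// Hypothetical helper, not part of wp-php-toolkit/xml.
function bump_price( string $price, float $factor = 1.10 ): string {
	// Attribute values arrive as strings: cast, scale, then format with
	// two decimals, a "." decimal point, and an empty thousands separator.
	return number_format( (float) $price * $factor, 2, '.', '' );
}

echo bump_price( '29.99' ) . "\n"; // 32.99
echo bump_price( '14.50' ) . "\n"; // 15.95
echo bump_price( '1200' ) . "\n";  // 1320.00
```

The empty fourth argument matters: with number_format()'s default separator, 1320.00 would be written as 1,320.00, which no longer parses as a number when the attribute is read back.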

Read namespaced attributes from a WXR export

WordPress's WXR commonly uses wp:, dc:, and content: prefixes bound to namespace names such as http://wordpress.org/export/1.2/. Pass that expanded namespace name, not the prefix; the processor handles whichever prefix the document actually uses.

<?php
require '/wordpress/wp-content/php-toolkit/vendor/autoload.php';

use WordPress\XML\XMLProcessor;

$wxr = <<<'XML'
<?xml version="1.0"?>
<rss xmlns:wp="http://wordpress.org/export/1.2/" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel><item>
<title>Hello World</title>
<dc:creator>admin</dc:creator>
<wp:post_id>42</wp:post_id>
<wp:status>publish</wp:status>
</item></channel></rss>
XML;

$WP = 'http://wordpress.org/export/1.2/';
$DC = 'http://purl.org/dc/elements/1.1/';

$p = XMLProcessor::create_from_string( $wxr );
while ( $p->next_tag( 'item' ) ) {
	while ( $p->next_token() ) {
		// Stop at </item> so the outer loop can look for the next item.
		if ( $p->is_tag_closer() && 'item' === $p->get_tag_local_name() ) break;
		if ( ! $p->is_tag_opener() ) continue;
		$ns = $p->get_tag_namespace();
		$local = $p->get_tag_local_name();
		// Label by namespace name, never by the prefix written in the file.
		$prefix = ( $WP === $ns ) ? 'wp/' : ( ( $DC === $ns ) ? 'dc/' : '' );
		echo "{$prefix}{$local}: ";
		// Advance to the element's text node and print it.
		while ( $p->next_token() && '#text' !== $p->get_token_name() ) {}
		echo trim( $p->get_modifiable_text() ) . "\n";
	}
}
}

Output:

title: Hello World
dc/creator: admin
wp/post_id: 42
wp/status: publish
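
The same rule applies to attributes. A minimal sketch, assuming a hypothetical document that binds the prefix x to the WXR namespace and carries an x:status attribute (real WXR stores status as an element, so this input is for illustration only). It uses only calls that already appear in this README.

```php
<?php
require '/wordpress/wp-content/php-toolkit/vendor/autoload.php';

use WordPress\XML\XMLProcessor;

// Hypothetical input: the prefix is "x", but it is bound to the WXR namespace.
$xml = '<rss xmlns:x="http://wordpress.org/export/1.2/"><item x:status="publish"/></rss>';

$WP = 'http://wordpress.org/export/1.2/';

$p = XMLProcessor::create_from_string( $xml );
if ( $p->next_tag( 'item' ) ) {
	// Lookup by namespace name: independent of the prefix the document chose.
	var_dump( $p->get_attribute( $WP, 'status' ) );
	// Lookup by prefix: "x" is read as a namespace name that nothing uses.
	var_dump( $p->get_attribute( 'x', 'status' ) );
}
```

Going by the footgun described earlier, the first lookup should find the attribute and the second should not.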

Rewrite URLs across an entire WXR export

Large WXR exports can hold many URLs in <link>, <guid>, and post content. Streaming the file lets you rewrite them without loading the whole XML document into memory; the sample below uses a short in-memory string for brevity.

<?php
require '/wordpress/wp-content/php-toolkit/vendor/autoload.php';

use WordPress\XML\XMLProcessor;

$wxr = <<<'XML'
<?xml version="1.0"?><rss xmlns:wp="http://wordpress.org/export/1.2/"><channel>
<wp:base_site_url>https://old.example.com</wp:base_site_url>
<item><link>https://old.example.com/2024/post-1</link>
<guid>https://old.example.com/?p=1</guid></item>
</channel></rss>
XML;

$from = 'https://old.example.com';
$to   = 'https://new.example.com';

$p = XMLProcessor::create_from_string( $wxr );
$rewritten = 0;

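while ( $p->next_token() ) {
	// Only text nodes are rewritten; tags and other tokens are skipped.
	if ( '#text' !== $p->get_token_name() ) continue;
	$text = $p->get_modifiable_text();
	// Leave untouched any text node that does not mention the old host.
	if ( false === strpos( $text, $from ) ) continue;
	$p->set_modifiable_text( str_replace( $from, $to, $text ) );
	$rewritten++;
}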

echo "rewrote {$rewritten} text nodes\n\n";
echo $p->get_updated_xml();

Output:

rewrote 3 text nodes

<?xml version="1.0"?><rss xmlns:wp="http://wordpress.org/export/1.2/"><channel>
<wp:base_site_url>https://new.example.com</wp:base_site_url>
<item><link>https://new.example.com/2024/post-1</link>
<guid>https://new.example.com/?p=1</guid></item>
</channel></rss>

Parse OPML to extract feed URLs

OPML is the format Feedly and many other readers use to import and export feed lists. It is flat, attribute-heavy XML, which is exactly what a tag processor handles best.

<?php
require '/wordpress/wp-content/php-toolkit/vendor/autoload.php';

use WordPress\XML\XMLProcessor;

$opml = <<<'XML'
<?xml version="1.0"?><opml version="2.0"><head><title>My Feeds</title></head>
<body>
<outline text="Tech"><outline text="Hacker News" type="rss" xmlUrl="https://news.ycombinator.com/rss"/>
<outline text="LWN" type="rss" xmlUrl="https://lwn.net/headlines/rss"/></outline>
<outline text="WordPress" type="rss" xmlUrl="https://wordpress.org/news/feed/"/>
</body></opml>
XML;

$p = XMLProcessor::create_from_string( $opml );
while ( $p->next_tag( 'outline' ) ) {
	$url = $p->get_attribute( '', 'xmlUrl' );
	if ( null === $url ) continue; // Category outlines carry no xmlUrl.
	echo $p->get_attribute( '', 'text' ) . "\t" . $url . "\n";
}

Output:

Hacker News	https://news.ycombinator.com/rss
LWN	https://lwn.net/headlines/rss
WordPress	https://wordpress.org/news/feed/

Statistics

  • Total downloads: 47.01k
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Clicks: 1
  • Dependents: 2
  • Suggesters: 0

GitHub information

  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Language: PHP

Other information

  • License: GPL-2.0-or-later
  • Last updated: 2025-09-06
