定制 ptondereau/parquet-php 二次开发

按需修改功能、优化性能、对接业务系统,提供一站式技术支持

邮箱:yvsm@zunyunkeji.com | QQ:316430983 | 微信:yvsm316

ptondereau/parquet-php

最新稳定版本:v0.1.0

Composer 安装命令:

pie install ptondereau/parquet-php

包简介

PHP extension for reading and writing Apache Parquet files, powered by arrow-rs

README 文档

README

CI License: MIT

A PHP extension for reading and writing Apache Parquet files, powered by Rust and arrow-rs.

Supports all primitive types, nested types (lists, structs, maps), 6 compression codecs, column projection, and in-memory I/O. Requires PHP 8.2+.

Installation

PIE

pie install ptondereau/parquet-php

PIE downloads a prebuilt binary for your platform. If none is available, it builds from source (requires the Rust toolchain).

Prebuilt binaries

Prebuilt binaries are published on GitHub Releases for PHP 8.2–8.5 (NTS and ZTS):

OS Arch Libc
Linux x64 glibc, musl
Linux arm64 glibc, musl
macOS x64
macOS arm64

Build from source

# Requires Rust toolchain (https://rustup.rs)
cargo build --release
cp target/release/libparquet.so "$(php-config --extension-dir)/parquet.so"

Loading

; php.ini
extension=parquet.so

Usage

Defining a schema

use Parquet\Column;
use Parquet\Schema;

$schema = Schema::create([
    Column::int64('id')->required(),
    Column::string('name'),
    Column::double('price'),
    Column::boolean('active'),
    Column::date('created_at'),
    Column::decimal('amount', 10, 2),
]);

Writing a file

use Parquet\Writer;
use Parquet\WriterOptions;
use Parquet\Compression;

$writer = new Writer(
    WriterOptions::create()->withCompression(Compression::Snappy)
);

$writer->write('data.parquet', $schema, [
    ['id' => 1, 'name' => 'Alice', 'price' => 9.99, 'active' => true, 'created_at' => '2024-01-15', 'amount' => '123.45'],
    ['id' => 2, 'name' => 'Bob',   'price' => 4.50, 'active' => false, 'created_at' => '2024-02-20', 'amount' => '67.89'],
]);

Incremental writing

$writer = new Writer();
$writer->open('large.parquet', $schema);

foreach (array_chunk($rows, 10_000) as $batch) {
    $writer->writeBatch($batch);
}

$writer->close();

Reading a file

use Parquet\Reader;
use Parquet\ReaderOptions;

$reader = new Reader(
    ReaderOptions::create()
        ->withBatchSize(5000)
        ->withColumns(['id', 'name'])
);

$file = $reader->open('data.parquet');

echo $file->metadata()->rowCount(); // total rows

while ($batch = $file->readBatch()) {
    foreach ($batch as $row) {
        echo $row['name'];
    }
}

Nested types

$schema = Schema::create([
    Column::string('name')->required(),
    Column::list('tags', Column::string('element')),
    Column::struct('address', [
        Column::string('street'),
        Column::string('city'),
        Column::string('zip'),
    ]),
    Column::map('metadata', Column::string('key'), Column::int32('value')),
]);

$writer = new Writer();
$writer->write('nested.parquet', $schema, [
    [
        'name' => 'Alice',
        'tags' => ['admin', 'active'],
        'address' => ['street' => '123 Main St', 'city' => 'Paris', 'zip' => '75001'],
        'metadata' => ['login_count' => 42, 'score' => 100],
    ],
]);

In-memory I/O

$writer = new Writer();
$buffer = $writer->writeToString($schema, $rows);

$reader = new Reader();
$file = $reader->openString($buffer);

while ($batch = $file->readBatch()) {
    // ...
}

Compression

Six codecs are supported: Uncompressed, Snappy, Gzip, Brotli, Lz4Raw, Zstd.

use Parquet\Compression;
use Parquet\Encoding;

$options = WriterOptions::create()
    ->withCompression(Compression::Zstd)
    ->withColumnCompression('payload', Compression::Snappy)
    ->withColumnEncoding('id', Encoding::DeltaBinaryPacked);

Supported types

Type Column factory PHP representation
Boolean Column::boolean() bool
Int32 Column::int32() int
Int64 Column::int64() int
Float Column::float() float
Double Column::double() float
String Column::string() string
Date Column::date() string (Y-m-d)
DateTime Column::dateTime() string (ISO 8601)
Time Column::time() string (H:i:s.u)
Decimal Column::decimal() string
JSON Column::json() string
Enum Column::enum() string
UUID Column::uuid() string
Binary Column::binary() string
List Column::list() array
Struct Column::struct() array (assoc)
Map Column::map() array (assoc)

All columns are nullable by default. Call ->required() to enforce non-null.

For the full API, see parquet.stubs.php.

Contributing

See CONTRIBUTING.md.

License

MIT

统计信息

  • 总下载量: 0
  • 月度下载量: 0
  • 日度下载量: 0
  • 收藏数: 0
  • 点击次数: 7
  • 依赖项目数: 0
  • 推荐数: 0

GitHub 信息

  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • 开发语言: Rust

其他信息

  • 授权协议: MIT
  • 更新时间: 2026-03-20

承接程序开发

PHP开发

VUE

Vue开发

前端开发

小程序开发

公众号开发

系统定制

数据库设计

云部署

网站建设

安全加固