insightbase/invoice-parser-nette

Latest stable version: v1.0.2

Composer install command:

composer require insightbase/invoice-parser-nette

Package description

Nette package for parsing invoice/accounting documents from PDF (including scanned PDFs) using Azure Document Intelligence, LLM normalization and Czech-specific validation.

README

A Nette package for extracting data from invoices and accounting documents in PDF (including scans) using:

  • Azure Document Intelligence (OCR + structured extraction)
  • LLM normalization (Azure OpenAI)
  • Czech regex fallbacks (VS = variable symbol, DUZP = date of taxable supply, ICO = company ID, DIC = VAT ID)
  • a validation layer and an asynchronous worker pattern
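
To make the regex fallback idea concrete, here is a minimal sketch of how the Czech identifiers could be pulled from raw OCR text and how an ICO could be checked. The function names `extractCzechFields` and `isValidIco` and all patterns are assumptions made for illustration; they are not taken from the package. The ICO check uses the standard weighted modulo 11 checksum.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: pulls Czech invoice identifiers out of raw OCR text.
 * A field is null when its pattern does not match.
 *
 * @return array{vs: ?string, duzp: ?string, ico: ?string, dic: ?string}
 */
function extractCzechFields(string $text): array
{
    $patterns = [
        // Variable symbol: up to 10 digits after "VS" or "Variabilni symbol".
        'vs' => '/\b(?:VS|variabiln[ií]\s+symbol)\s*:?\s*(\d{1,10})\b/iu',
        // Date of taxable supply in d.m.yyyy form after "DUZP".
        'duzp' => '/\bDUZP\s*:?\s*(\d{1,2}\.\s?\d{1,2}\.\s?\d{4})/iu',
        // Company ID: exactly 8 digits after "IC", "ICO" or "IČO".
        'ico' => '/\bI[CČ]O?\s*:?\s*(\d{8})\b/iu',
        // VAT ID: "CZ" plus 8 to 10 digits after "DIC" or "DIČ".
        'dic' => '/\bDI[CČ]\s*:?\s*(CZ\d{8,10})\b/iu',
    ];

    $fields = [];
    foreach ($patterns as $name => $pattern) {
        $fields[$name] = preg_match($pattern, $text, $m) === 1 ? $m[1] : null;
    }

    return $fields;
}

/**
 * Checks the ICO check digit: the first seven digits are weighted 8..2,
 * and the eighth digit must equal (11 - sum mod 11) mod 10.
 */
function isValidIco(string $ico): bool
{
    if (preg_match('/^\d{8}\z/', $ico) !== 1) {
        return false;
    }

    $sum = 0;
    for ($i = 0; $i < 7; $i++) {
        $sum += (int) $ico[$i] * (8 - $i);
    }

    return (11 - $sum % 11) % 10 === (int) $ico[7];
}
```

A checksum like this is useful after OCR because a single misread digit usually makes the ICO invalid, which a pattern match alone would not catch.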

Installation

composer require insightbase/invoice-parser-nette

Azure setup (API key + endpoint + deployment)

The recommended procedure is below. It reflects the state as of 15 March 2026.

1) Azure account and subscription

  1. Create an Azure account or use an existing one.
  2. Verify that you have an active subscription and at least the Contributor role on the resource group.

2) Azure Document Intelligence (azureDi)

  1. In the Azure Portal, create a resource of type Document Intelligence (formerly Form Recognizer).
  2. Select the region where you want to run the service.
  3. After the resource is created, open Keys and Endpoint.
  4. Copy:
     • Endpoint -> use as AZURE_DI_ENDPOINT
     • Key 1 or Key 2 -> use as AZURE_DI_KEY

3) Azure OpenAI (llm)

  1. In the Azure Portal, create an Azure OpenAI resource.
  2. In the resource, open Keys and Endpoint.
  3. Copy:
     • Endpoint -> AZURE_OPENAI_ENDPOINT
     • Key 1 or Key 2 -> AZURE_OPENAI_KEY
  4. Open Azure AI Foundry / the model deployment panel for this resource.
  5. Create a model deployment (e.g. a GPT model) and note the deployment name -> AZURE_OPENAI_DEPLOYMENT.

Note:

  • If the Azure OpenAI resource or deployment cannot be created, the cause is usually missing quota or permissions in the tenant or region. In that case, ask your Azure admin to grant access.

4) Environment variables

At minimum, set:

AZURE_DI_ENDPOINT=https://<your-di-resource>.cognitiveservices.azure.com
AZURE_DI_KEY=<your-di-key>
AZURE_OPENAI_ENDPOINT=https://<your-openai-resource>.openai.azure.com
AZURE_OPENAI_KEY=<your-openai-key>
AZURE_OPENAI_DEPLOYMENT=<your-deployment-name>
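
A missing variable is easier to diagnose at startup than as a failed Azure call later. The sketch below reports which of the five variables are unset or empty; `missingAzureVariables` is a hypothetical helper written for this example, not part of the package.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical startup check: returns the names of required variables that
 * are unset or blank in the given environment map.
 *
 * @param array<string, string> $env
 * @return list<string>
 */
function missingAzureVariables(array $env): array
{
    $required = [
        'AZURE_DI_ENDPOINT',
        'AZURE_DI_KEY',
        'AZURE_OPENAI_ENDPOINT',
        'AZURE_OPENAI_KEY',
        'AZURE_OPENAI_DEPLOYMENT',
    ];

    // array_filter keeps the original keys, so reindex the result.
    return array_values(array_filter(
        $required,
        static fn (string $name): bool => trim($env[$name] ?? '') === '',
    ));
}
```

Calling `missingAzureVariables(getenv())` before the container is built gives a list that can be logged or turned into an exception.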

5) Configuring the extension in Nette

extensions:
    invoiceParser: InsightBase\InvoiceParserNette\DI\InvoiceParserExtension

invoiceParser:
    azureDi:
        endpoint: %env(AZURE_DI_ENDPOINT)%
        apiKey: %env(AZURE_DI_KEY)%
        model: prebuilt-invoice
        apiVersion: '2023-07-31' # quoted so NEON keeps it a string instead of parsing it as a date
        maxPollAttempts: 25
        pollIntervalMs: 1000
    llm:
        enabled: true
        endpoint: %env(AZURE_OPENAI_ENDPOINT)%
        deployment: %env(AZURE_OPENAI_DEPLOYMENT)%
        apiKey: %env(AZURE_OPENAI_KEY)%
        apiVersion: '2024-10-21' # quoted so NEON keeps it a string instead of parsing it as a date
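
Document Intelligence analysis is asynchronous: the client submits the document and then polls the operation until it reaches a terminal status. Assuming `maxPollAttempts` and `pollIntervalMs` mean what their names suggest, the values above allow roughly 25 seconds before giving up. The loop below is a sketch of that pattern only; `pollUntilDone` and the `$fetchStatus` callback are hypothetical and do not reflect the package's internals.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical polling helper. $fetchStatus stands in for the HTTP GET on the
 * operation URL and must return the operation's "status" string.
 */
function pollUntilDone(callable $fetchStatus, int $maxPollAttempts, int $pollIntervalMs): string
{
    for ($attempt = 1; $attempt <= $maxPollAttempts; $attempt++) {
        $status = $fetchStatus();

        // "succeeded" and "failed" are terminal; "notStarted" and "running" are not.
        if ($status === 'succeeded' || $status === 'failed') {
            return $status;
        }

        // Wait between attempts, but not after the last one.
        if ($attempt < $maxPollAttempts) {
            usleep($pollIntervalMs * 1000); // usleep() takes microseconds
        }
    }

    throw new RuntimeException("Analysis did not finish after {$maxPollAttempts} attempts");
}
```

For large scanned PDFs it may be worth raising `maxPollAttempts` rather than shortening the interval, since more frequent polling does not speed up the analysis.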

6) Links to official documentation

Usage

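<?php

declare(strict_types=1);

use InsightBase\InvoiceParserNette\Parser\InvoiceParser;

final class InvoiceService
{
    public function __construct(
        private InvoiceParser $invoiceParser,
    ) {
    }

    /** Reads the PDF at $pdfPath and returns the parsed invoice as an array. */
    public function parse(string $pdfPath): array
    {
        $pdfContent = file_get_contents($pdfPath);
        if ($pdfContent === false) {
            // Fail loudly instead of sending an empty string to the parser.
            throw new \RuntimeException("Cannot read PDF file: {$pdfPath}");
        }

        $result = $this->invoiceParser->parsePdf($pdfContent);

        return $result->invoice->toArray();
    }
}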

Asynchronous worker (Contributte RabbitMQ)

The library includes a worker service, InvoiceParseWorker::process(array $message).

Example message payload:

{
  "pdfPath": "/data/invoices/invoice-2026-001.pdf"
}

Or:

{
  "pdfBase64": "JVBERi0xLjQKJ..."
}

A sample integration is in examples/rabbitmq.neon and examples/InvoiceConsumer.php.
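
The two payload shapes imply that a consumer has to turn either a file path or a base64 string into raw PDF bytes. How InvoiceParseWorker does this is not shown here, so the following is only a sketch of one way to do it; `resolvePdfContent` is a hypothetical helper.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: turns a queue message into raw PDF bytes.
 * Accepts either "pdfPath" or "pdfBase64", as in the payload examples.
 */
function resolvePdfContent(array $message): string
{
    if (isset($message['pdfBase64'])) {
        // Strict mode rejects characters outside the base64 alphabet.
        $content = base64_decode((string) $message['pdfBase64'], true);
        if ($content === false) {
            throw new InvalidArgumentException('pdfBase64 is not valid base64');
        }
    } elseif (isset($message['pdfPath'])) {
        $path = (string) $message['pdfPath'];
        if (!is_file($path) || !is_readable($path)) {
            throw new RuntimeException("Cannot read PDF file: {$path}");
        }
        $content = (string) file_get_contents($path);
    } else {
        throw new InvalidArgumentException('Message needs pdfPath or pdfBase64');
    }

    // A well-formed PDF starts with the "%PDF-" header.
    if (!str_starts_with($content, '%PDF-')) {
        throw new InvalidArgumentException('Payload is not a PDF');
    }

    return $content;
}
```

Rejecting bad messages before calling the parser avoids spending an Azure request on input that cannot succeed. Note that `pdfPath` only works when the worker can reach the same filesystem as the publisher; `pdfBase64` avoids that at the cost of larger messages.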

Notes

  • For scanned PDFs, OCR is handled on the Azure Document Intelligence side.
  • The regex fallback fills in fields when DI/LLM returns incomplete data.
  • The validator checks basic consistency of amounts and dates.
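
As an illustration of what "basic consistency" could mean, the sketch below checks that net plus VAT matches the total and that the due date does not precede the issue date. The function name, the array keys and the tolerance are assumptions made for this example and are not the package's actual field names or rules.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical consistency check; the array keys are assumptions made for
 * this sketch.
 *
 * @param array{net: float, vat: float, total: float, issueDate: string, dueDate: string} $invoice
 * @return list<string> human-readable problems, empty when consistent
 */
function findInconsistencies(array $invoice, float $tolerance = 0.01): array
{
    $problems = [];

    // Amounts: net + VAT should equal the total, allowing for rounding.
    if (abs($invoice['net'] + $invoice['vat'] - $invoice['total']) > $tolerance) {
        $problems[] = 'net + vat does not match total';
    }

    // Dates: the due date must not precede the issue date.
    $issued = new DateTimeImmutable($invoice['issueDate']);
    $due = new DateTimeImmutable($invoice['dueDate']);
    if ($due < $issued) {
        $problems[] = 'due date is before issue date';
    }

    return $problems;
}
```

Returning a list of problems rather than throwing on the first one lets the caller decide whether an invoice is rejected or only flagged for manual review.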

Statistics

  • Total downloads: 3
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Clicks: 7
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2026-03-15
