gtstudio/module-ai-knowledge-base
Latest stable version: 1.0.3
Composer install command:
composer require gtstudio/module-ai-knowledge-base
Package description
Knowledge base management for Magento 2. Upload documents (PDF, TXT) that AI agents can retrieve as context before answering queries.
README
Document management for AI agents in Magento 2. Upload files that agents can retrieve as context before answering queries — enabling retrieval-augmented generation (RAG) without a vector database.
AI Studio Ecosystem
Part of the AI Studio suite for Magento 2. See all modules:
| Module | Repository | Description |
|---|---|---|
| Gtstudio_AiConnector | module-aiconnector | Core AI provider abstraction |
| Gtstudio_AiAgents | module-ai-agents | Agent & tool orchestration, cron scheduling, execution log |
| Gtstudio_AiWidgets | module-ai-widgets | Floating admin chat widget + PageBuilder AI generator |
| Gtstudio_AiDataQuery | module-ai-data-query | Natural-language store analytics (privacy-first) |
| Gtstudio_AiKnowledgeBase | (this module) | Document upload & RAG retrieval for agents |
| Gtstudio_AiDashboard | module-ai-dashboard | AI-powered KPI dashboard with ML insights |
What It Does
- Upload and manage documents (PDF, TXT) in the Magento admin
- Documents are stored and indexed so that agents can fetch relevant excerpts at query time
- Integrates with Gtstudio_AiAgents: assign a knowledge base to any agent
Requirements
- Magento 2.4.4+
- PHP 8.1+
- Gtstudio_AiConnector enabled and configured
- Gtstudio_AiAgents enabled
- smalot/pdfparser: ^2.12 (PDF text extraction)
Installation
```bash
composer require gtstudio/module-ai-knowledge-base
php bin/magento module:enable Gtstudio_AiKnowledgeBase
php bin/magento setup:upgrade
```
Usage
Uploading Documents
Navigate to AI Studio → Agents & Tools → Knowledge Base.
Click Add New, fill in:
| Field | Description |
|---|---|
| Title | Human-readable label (auto-populated from PDF metadata on upload) |
| Upload PDF Document | Upload a PDF file — text and metadata are extracted automatically |
| Content | Extracted text (editable; used for retrieval) |
| Tags | Comma-separated keywords (auto-populated from PDF metadata) |
| Agents | Associate this document with one or more agents |
| Is Active | Only active entries are searchable by agents |
How Retrieval Works
When an agent that has knowledge base documents attached receives a question:
- The question is matched against document excerpts using keyword or semantic similarity
- Relevant excerpts are prepended to the agent's system prompt as context
- The agent responds with awareness of those excerpts
No full document text is sent to the LLM — only the most relevant excerpts, keeping token usage low.
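The steps above can be sketched in plain PHP. This is a minimal illustration of keyword-overlap retrieval, not the module's actual implementation: the function names `tokenize`, `selectExcerpts`, and `buildSystemPrompt`, the three-character token cutoff, and the prompt wording are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Lowercase a text and return its unique word tokens of 3+ characters.
 * ASCII-only on purpose to keep the sketch dependency-free.
 */
function tokenize(string $text): array
{
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $words = array_filter($words, fn (string $word): bool => strlen($word) >= 3);

    return array_values(array_unique($words));
}

/**
 * Score each chunk by how many question keywords it shares,
 * then return the text of the best-scoring chunks (hypothetical helper).
 */
function selectExcerpts(string $question, array $chunks, int $limit = 3): array
{
    $queryTerms = tokenize($question);
    $scored = [];

    foreach (array_values($chunks) as $index => $chunk) {
        $score = count(array_intersect($queryTerms, tokenize($chunk)));
        if ($score > 0) {
            $scored[] = ['score' => $score, 'index' => $index, 'text' => $chunk];
        }
    }

    // Highest score first; ties keep the original chunk order.
    usort(
        $scored,
        fn (array $a, array $b): int => [$b['score'], $a['index']] <=> [$a['score'], $b['index']]
    );

    return array_column(array_slice($scored, 0, $limit), 'text');
}

/**
 * Prepend the selected excerpts to the agent's system prompt.
 */
function buildSystemPrompt(string $basePrompt, array $excerpts): string
{
    if ($excerpts === []) {
        return $basePrompt;
    }

    return "Context from the knowledge base:\n- " . implode("\n- ", $excerpts) . "\n\n" . $basePrompt;
}
```

Only chunks that share at least one keyword with the question are included, which is what keeps the prompt short when a document is large.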
Extensibility
Supporting Additional File Formats
The text extraction pipeline uses a registry pattern. Register a custom extractor for a new MIME type:
```xml
<!-- etc/di.xml -->
<type name="Gtstudio\AiKnowledgeBase\Model\Extractor\ExtractorPool">
    <arguments>
        <argument name="extractors" xsi:type="array">
            <item name="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xsi:type="object">Vendor\Module\Model\Extractor\DocxExtractor</item>
        </argument>
    </arguments>
</type>
```
Implement Gtstudio\AiKnowledgeBase\Api\ExtractorInterface:
```php
interface ExtractorInterface
{
    /**
     * Extract plain text from the given file path.
     */
    public function extract(string $filePath): string;
}
```
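As an illustration of what an implementation might look like, here is a minimal sketch of an extractor for HTML files. `HtmlExtractor` is a hypothetical class, not part of the module, and the interface is re-declared locally only so the snippet runs outside Magento. In a real module the class would live under a namespace such as `Vendor\Module\Model\Extractor`, import `Gtstudio\AiKnowledgeBase\Api\ExtractorInterface`, and be registered in `di.xml` under the `text/html` MIME type.

```php
<?php
declare(strict_types=1);

// Local stand-in for Gtstudio\AiKnowledgeBase\Api\ExtractorInterface,
// declared here only so the sketch is runnable on its own.
interface ExtractorInterface
{
    public function extract(string $filePath): string;
}

// Hypothetical extractor: turns an HTML file into plain text.
final class HtmlExtractor implements ExtractorInterface
{
    public function extract(string $filePath): string
    {
        $html = @file_get_contents($filePath);
        if ($html === false) {
            throw new RuntimeException("Cannot read file: {$filePath}");
        }

        // Remove <script> and <style> elements together with their contents.
        $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1\s*>#is', ' ', $html) ?? $html;

        // Replace remaining tags with spaces so adjacent words do not merge.
        $text = preg_replace('/<[^>]+>/', ' ', $html) ?? $html;

        // Decode entities, normalise non-breaking spaces, collapse whitespace.
        $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
        $text = str_replace("\u{00A0}", ' ', $text);

        return trim(preg_replace('/\s+/', ' ', $text) ?? $text);
    }
}
```

The extractor returns a single whitespace-normalised string, which matches the interface contract of returning plain text for a file path.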
Custom Retrieval Strategy
Override the retrieval service to use a vector database, OpenSearch k-NN, or any other similarity search:
```xml
<preference for="Gtstudio\AiKnowledgeBase\Api\RetrievalServiceInterface" type="Vendor\Module\Model\VectorRetrievalService"/>
```
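The method signatures of `RetrievalServiceInterface` are not shown in this README, so the sketch below does not implement the interface. It only illustrates the ranking step that a vector-based replacement might perform: scoring stored chunk embeddings against a query embedding by cosine similarity. The function names and the shape of the input arrays are assumptions, and producing the embeddings themselves is out of scope.

```php
<?php
declare(strict_types=1);

/**
 * Cosine similarity between two equal-length numeric vectors.
 */
function cosineSimilarity(array $a, array $b): float
{
    $a = array_values($a);
    $b = array_values($b);
    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    // A zero vector has no direction; treat it as unrelated.
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Return the IDs of the chunks most similar to the query vector.
 * $chunkVectors maps a chunk ID to its embedding (hypothetical layout).
 */
function rankByEmbedding(array $queryVector, array $chunkVectors, int $limit = 3): array
{
    $scores = [];
    foreach ($chunkVectors as $chunkId => $vector) {
        $scores[$chunkId] = cosineSimilarity($queryVector, $vector);
    }

    arsort($scores); // highest similarity first

    return array_slice(array_keys($scores), 0, $limit);
}
```

A production service would typically delegate this ranking to the vector store or to OpenSearch k-NN rather than compute it in PHP; the loop here is only meant to make the similarity idea concrete.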
Chunking Strategy
Document chunking (splitting documents into excerpt-sized pieces) can be customised:
```xml
<type name="Gtstudio\AiKnowledgeBase\Model\Chunker\TextChunker">
    <arguments>
        <!-- Maximum characters per chunk -->
        <argument name="chunkSize" xsi:type="number">1500</argument>
        <!-- Overlap between consecutive chunks -->
        <argument name="overlap" xsi:type="number">200</argument>
    </arguments>
</type>
```
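To make the effect of `chunkSize` and `overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an assumption about how such a chunker could work, not the code of `TextChunker` itself, and `chunkText` is a hypothetical function name. For simplicity it counts bytes; multibyte text would need `mb_strlen` and `mb_substr` instead.

```php
<?php
declare(strict_types=1);

/**
 * Split text into chunks of at most $chunkSize bytes, where each chunk
 * repeats the last $overlap bytes of the previous one.
 */
function chunkText(string $text, int $chunkSize = 1500, int $overlap = 200): array
{
    if ($chunkSize <= 0 || $overlap < 0 || $overlap >= $chunkSize) {
        throw new InvalidArgumentException('Require chunkSize > 0 and 0 <= overlap < chunkSize.');
    }

    $chunks = [];
    $length = strlen($text);
    $step = $chunkSize - $overlap; // how far the window advances each time

    for ($start = 0; $start < $length; $start += $step) {
        $chunks[] = substr($text, $start, $chunkSize);

        // Stop once the current window already reaches the end of the text.
        if ($start + $chunkSize >= $length) {
            break;
        }
    }

    return $chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.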
Database Tables
| Table | Purpose |
|---|---|
| gtstudio_ai_knowledge_base | Document metadata (name, description, file path, agent association) |
| gtstudio_ai_knowledge_base_chunk | Extracted text chunks ready for retrieval |
ACL Resources
| Resource | Controls |
|---|---|
| Gtstudio_AiKnowledgeBase::management | Access to the Knowledge Base admin section |
Statistics
- Total downloads: 8
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 0
- Views: 4
- Dependents: 0
- Suggesters: 0
Other information
- License: BUSL-1.1
- Last updated: 2026-03-09
