Package description

Whisper Module for Platform - Browser audio recording with AssemblyAI transcription + speaker diarization + LLM summary

README

Audio recording in the browser → AssemblyAI transcription with speaker diarization → LLM summary. The audio file is not stored permanently; only the transcript, segments, and summary are persisted.

Features

Browser recorder via the MediaRecorder API (Opus, mono)
Speaker diarization: AssemblyAI labels the speaker (A, B, C, …) for each utterance
LLM summary: automatic title plus bullet-point summary via OpenAI
Queue-based: the upload returns immediately and a job processes the recording in the background
Organization linking: HasOrganizationContexts (morph_alias whisper_recording), usable through the core LLM tools

Requirements (host app)

1. Composer

"require": {
    "martin3r/platforms-whisper": "dev-main"
},
"repositories": [
    {
        "type": "vcs",
        "url": "git@github.com:martin3r-me/platforms-whisper.git"
    }
]

composer update martin3r/platforms-whisper
php artisan migrate

2. API Keys

# Transcription + diarization
ASSEMBLYAI_API_KEY=...

# LLM summary (reuses the platform's OpenAiService)
OPENAI_API_KEY=sk-...

3. Queue Worker

php artisan queue:work --timeout=1800 --tries=1

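Polling against AssemblyAI runs inside the job's handle() method, so the worker has to stay alive for the whole transcription. --timeout=1800 (30 minutes) is meant to cover even meetings that last several hours.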
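
How the worker timeout relates to the polling limits can be checked with simple arithmetic. The sketch below is written in TypeScript purely for illustration (the package itself is PHP) and uses the values from the Config Overrides section. It assumes the job makes two bounded HTTP calls (upload and submit) before it starts polling. That call count and the function name are assumptions, not taken from the package, and the estimate ignores the time the LLM summary takes.

```typescript
// Hypothetical helper: rough worst-case seconds one job spends talking to AssemblyAI.
// Assumption: two HTTP calls (upload, submit), each bounded by the request timeout,
// followed by a polling phase bounded by the maximum wait.
function worstCaseJobSeconds(
  requestTimeout: number,
  maxWait: number,
  callsBeforePolling = 2,
): number {
  return callsBeforePolling * requestTimeout + maxWait;
}

const workerTimeout = 1800; // --timeout=1800 from the queue:work command
const worstCase = worstCaseJobSeconds(120, 1500); // WHISPER_AAI_REQUEST_TIMEOUT, WHISPER_AAI_MAX_WAIT

// 2 * 120 + 1500 = 1740, which leaves 60 s under the 1800 s worker timeout
const headroom = workerTimeout - worstCase;
```

If WHISPER_AAI_MAX_WAIT is raised, the worker's --timeout probably needs to grow with it, otherwise the worker may be killed before the job can mark the recording as failed.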

4. PHP Limits

In php.ini (or .user.ini):

upload_max_filesize = 500M
post_max_size = 500M
max_execution_time = 300
memory_limit = 256M

Data model

Table whisper_recordings (core fields):

Field	Type	Notes
id / uuid	PK	UuidV7
team_id / created_by_user_id	FK	Team scope
title	string	LLM-generated (fallback: first sentence)
transcript	longText	Continuous-text transcript
summary	longText	LLM bullet points
segments	json	`[{speaker, start, end, text}, …]`
speakers_count	int	Number of detected speakers
language	string	ISO code, detected by AssemblyAI
duration_seconds	int
model	string	e.g. `assemblyai:universal`
provider_id	string	AssemblyAI transcript id
status	enum	pending / processing / completed / failed
error_message	text	Set when status is failed
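
To make the segments column more concrete, here is a minimal TypeScript sketch of its shape and of how consecutive segments could be merged into per-speaker blocks. It is an illustration, not the package's code (which is PHP): the function names and the sample data are hypothetical, and the unit of start and end is not stated in this README.

```typescript
// One entry of the `segments` JSON column, using the keys listed in the table.
// Assumption: start/end are numeric offsets; the unit is not given in the source.
interface Segment {
  speaker: string; // "A", "B", "C", ...
  start: number;
  end: number;
  text: string;
}

// Merge adjacent segments spoken by the same speaker into one block.
function groupIntoSpeakerBlocks(segments: Segment[]): Segment[] {
  const blocks: Segment[] = [];
  for (const seg of segments) {
    const last = blocks[blocks.length - 1];
    if (last !== undefined && last.speaker === seg.speaker) {
      last.end = seg.end;
      last.text = `${last.text} ${seg.text}`;
    } else {
      blocks.push({ ...seg }); // copy, so the input array is left untouched
    }
  }
  return blocks;
}

// Number of distinct speaker labels, i.e. the value speakers_count would hold.
function countSpeakers(segments: Segment[]): number {
  return new Set(segments.map((s) => s.speaker)).size;
}

// Made-up sample data for illustration only.
const sampleSegments: Segment[] = [
  { speaker: "A", start: 0, end: 4, text: "Hello everyone." },
  { speaker: "A", start: 4, end: 9, text: "Let's get started." },
  { speaker: "B", start: 9, end: 12, text: "Sounds good." },
];
const sampleBlocks = groupIntoSpeakerBlocks(sampleSegments);
```

Here sampleBlocks contains two entries: one block for speaker A spanning both of A's utterances, and one for speaker B.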

Workflow

The user clicks Record on /whisper
The browser records the microphone (Opus, mono)
Stop → the blob is POSTed to /whisper/upload via fetch()
The controller stores the blob in storage/app/whisper-tmp/{uuid}.webm, creates a recording with status=pending, dispatches TranscribeRecordingJob, and redirects the user to the show page
Job (worker):
- Status → processing
- AssemblyAiTranscriptionService::transcribe(): upload → submit (with speaker_labels=true) → poll until completed
- WhisperSummaryService::summarize(): the LLM produces the title and summary
- The recording receives transcript, segments, speakers_count, summary, and title; status → completed
- The temp file is deleted (finally block)
The show page polls every 3 s while the status is pending or processing, then shows the speaker blocks, the summary, and the continuous transcript
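
The job steps above can be sketched as a small status machine with a cleanup that always runs. This is a TypeScript illustration under stated assumptions, not the package's PHP job: the Recording shape is trimmed down, and the injected transcribe, summarize, and deleteTmpFile callbacks are hypothetical stand-ins for the real services.

```typescript
type RecordingStatus = "pending" | "processing" | "completed" | "failed";

// Trimmed-down stand-in for a whisper_recordings row.
interface Recording {
  status: RecordingStatus;
  transcript?: string;
  title?: string;
  summary?: string;
  errorMessage?: string;
}

// Hypothetical collaborators, injected so the flow can be read in isolation.
interface JobDeps {
  transcribe: (tmpPath: string) => { transcript: string };
  summarize: (transcript: string) => { title: string; summary: string };
  deleteTmpFile: (tmpPath: string) => void;
}

function runTranscriptionJob(rec: Recording, tmpPath: string, deps: JobDeps): void {
  rec.status = "processing";
  try {
    const { transcript } = deps.transcribe(tmpPath);
    const { title, summary } = deps.summarize(transcript);
    rec.transcript = transcript;
    rec.title = title;
    rec.summary = summary;
    rec.status = "completed";
  } catch (err) {
    // Any failure along the way ends in status failed with a message attached.
    rec.status = "failed";
    rec.errorMessage = err instanceof Error ? err.message : String(err);
  } finally {
    // Runs on success and on failure, so the audio file never outlives the job.
    deps.deleteTmpFile(tmpPath);
  }
}
```

Putting the delete into the finally branch is what keeps the "audio is not stored permanently" promise intact even when transcription or summarization fails.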

Error handling

No ASSEMBLYAI_API_KEY → the job throws an exception → status failed
AssemblyAI error (upload/submit/poll) → status failed, error_message contains the API response
Polling timeout (WHISPER_AAI_MAX_WAIT exceeded) → status failed
On failed, the temp file is still cleaned up
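
A polling loop with a hard deadline, as described above, might look like the following TypeScript sketch. It is deliberately blocking, mirroring a queue job that waits inside handle(), and is not the package's PHP implementation. getStatus and sleep are hypothetical injected callbacks. The status strings are AssemblyAI's transcript statuses, and the defaults come from WHISPER_AAI_POLL_INTERVAL and WHISPER_AAI_MAX_WAIT.

```typescript
type PollResult =
  | { ok: true; polls: number }
  | { ok: false; reason: "error" | "timeout"; polls: number };

// Poll until the transcript is completed, the API reports an error,
// or the next wait would exceed maxWait seconds.
function pollUntilDone(
  getStatus: () => "queued" | "processing" | "completed" | "error",
  sleep: (seconds: number) => void,
  pollInterval = 3,
  maxWait = 1500,
): PollResult {
  let waited = 0;
  let polls = 0;
  while (true) {
    const status = getStatus();
    polls += 1;
    if (status === "completed") return { ok: true, polls };
    if (status === "error") return { ok: false, reason: "error", polls };
    // Give up before sleeping past the deadline; the caller marks the recording as failed.
    if (waited + pollInterval > maxWait) return { ok: false, reason: "timeout", polls };
    sleep(pollInterval);
    waited += pollInterval;
  }
}
```

With the default values the loop sleeps at most 1500 / 3 = 500 times, so a transcript that never finishes is polled 501 times before the timeout is reported.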

LLM tools

whisper.overview.GET: module overview
whisper.recordings.GET: list of all recordings
whisper.recording.GET: single recording (including segments)
whisper.recordings.PUT: update metadata
whisper.recordings.DELETE: delete a recording
whisper.recordings.search.GET: full-text search
whisper.recording.transcript.GET: transcript, summary, and segments only (LLM-friendly)

Config Overrides

WHISPER_AAI_REQUEST_TIMEOUT=120     # HTTP timeout per AssemblyAI call
WHISPER_AAI_POLL_INTERVAL=3         # Polling interval in seconds
WHISPER_AAI_MAX_WAIT=1500           # Maximum polling duration in seconds
WHISPER_AAI_SPEAKER_LABELS=true     # Diarization on/off
WHISPER_AAI_SPEAKERS_EXPECTED=0     # 0 = automatic, otherwise the expected number of speakers
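
The following TypeScript sketch shows one way these variables could be read and turned into an AssemblyAI transcript request body. It is illustrative only and not the package's PHP config code. loadConfig and buildSubmitBody are hypothetical names, the listed values are assumed to be the defaults, and the boolean parsing is simplified. audio_url, speaker_labels, and speakers_expected are AssemblyAI request parameters.

```typescript
interface WhisperAaiConfig {
  requestTimeout: number;
  pollInterval: number;
  maxWait: number;
  speakerLabels: boolean;
  speakersExpected: number; // 0 = let AssemblyAI decide
}

// Read WHISPER_AAI_* from a plain key/value map.
// Assumption: the values shown in the README are also the defaults.
function loadConfig(env: Record<string, string | undefined>): WhisperAaiConfig {
  const num = (key: string, fallback: number): number => {
    const raw = env[key];
    if (raw === undefined || raw.trim() === "") return fallback;
    const parsed = Number(raw);
    return isNaN(parsed) ? fallback : parsed;
  };
  const flag = (key: string, fallback: boolean): boolean => {
    const raw = env[key];
    if (raw === undefined || raw.trim() === "") return fallback;
    const v = raw.trim().toLowerCase();
    return !(v === "false" || v === "0"); // simplified boolean parsing
  };
  return {
    requestTimeout: num("WHISPER_AAI_REQUEST_TIMEOUT", 120),
    pollInterval: num("WHISPER_AAI_POLL_INTERVAL", 3),
    maxWait: num("WHISPER_AAI_MAX_WAIT", 1500),
    speakerLabels: flag("WHISPER_AAI_SPEAKER_LABELS", true),
    speakersExpected: num("WHISPER_AAI_SPEAKERS_EXPECTED", 0),
  };
}

// Build the JSON body for the transcript request.
// speakers_expected is only sent when diarization is on and a count was configured.
function buildSubmitBody(audioUrl: string, cfg: WhisperAaiConfig): Record<string, unknown> {
  const body: Record<string, unknown> = {
    audio_url: audioUrl,
    speaker_labels: cfg.speakerLabels,
  };
  if (cfg.speakerLabels && cfg.speakersExpected > 0) {
    body["speakers_expected"] = cfg.speakersExpected;
  }
  return body;
}
```

Leaving speakers_expected out when the value is 0 matches the "0 = automatic" comment: the count is only passed along when someone has explicitly set it.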

Out of scope

Editable transcript
Real-time streaming
Speaker identification (mapping speakers to people or names): the module only provides the labels A, B, C, …

Package statistics

martin3r/platforms-whisper is a Composer package written in PHP. When this page was captured it had 23 downloads and 0 GitHub stars, and its most recent update was on 8 April 2026.
