Overview
OfflineDiarizerManager provides the full pyannote/Core ML exporter pipeline (powerset segmentation + VBx clustering) for the highest-accuracy offline diarization.
Requires macOS 14 / iOS 17 or later.
Quick Start
File-Based API
For large files, use memory-mapped streaming.
Pipeline Stages
- Segmentation — 10s chunks (160k samples each) run through the Core ML segmentation model, producing 589 frames of log probabilities per chunk
- Binarization — Log probabilities to soft VAD weights
- Weight Interpolation — scipy.ndimage.zoom-compatible half-pixel mapping
- Embedding Extraction — FBANK + embedding backend, L2-normalized 256-d embeddings
- VBx Clustering — AHC warm start + PLDA + iterative VBx refinement
- Timeline Reconstruction — Timestamps with minimum gap/duration constraints
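The last stage can be illustrated with a minimal Python sketch. It is not the library's implementation: the function names, the frame step, and the exact rule (merge segments separated by less than a minimum gap, then drop segments shorter than a minimum duration) are assumptions chosen to show one common way such constraints are applied.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def frames_to_segments(active: List[bool], frame_step: float) -> List[Segment]:
    """Turn a per-frame on/off mask for one speaker into timed segments."""
    segments: List[Segment] = []
    start = None
    for i, is_on in enumerate(active):
        if is_on and start is None:
            start = i  # speech begins at this frame
        elif not is_on and start is not None:
            segments.append((start * frame_step, i * frame_step))
            start = None
    if start is not None:  # mask ended while the speaker was still active
        segments.append((start * frame_step, len(active) * frame_step))
    return segments


def apply_constraints(
    segments: List[Segment], min_gap: float, min_duration: float
) -> List[Segment]:
    """Merge segments closer than min_gap, then drop those shorter than min_duration."""
    merged: List[Segment] = []
    for start, end in segments:
        if merged and start - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], end)  # bridge the short pause
        else:
            merged.append((start, end))
    return [(s, e) for s, e in merged if e - s >= min_duration]


# Hypothetical mask with a 0.5 s frame step (illustrative values only).
mask = [True, True, False, True, True, True,
        False, False, False, False, True, False]
timeline = apply_constraints(frames_to_segments(mask, 0.5),
                             min_gap=1.0, min_duration=1.0)
```

In this sketch the 0.5 s pause is bridged and the trailing 0.5 s blip is discarded, leaving a single segment from 0.0 s to 3.0 s.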
Configuration
OfflineDiarizerConfig groups knobs by pipeline stage:
- segmentation — Window length (10s), step ratio, min on/off durations
- embedding — Batch size, overlap handling
- clustering — VBx warm-start threshold, Fa/Fb priors
- vbx — Max iterations, convergence tolerance
- postProcessing — Minimum gap duration
- export — Optional embeddingsPath for JSON dump
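To make the effect of the step ratio concrete, the following Python sketch computes where sliding windows would start. It assumes the hop between windows equals the window length multiplied by the step ratio, and that a final partial window is kept (and would be padded). Both are assumptions for illustration; `chunk_starts` is a hypothetical helper, not part of the library.

```python
from typing import List


def chunk_starts(
    total_samples: int,
    window_samples: int = 160_000,  # 10s window, per the segmentation stage
    step_ratio: float = 0.2,        # assumed: hop = window * step_ratio
) -> List[int]:
    """Return the start sample of every sliding window over the audio."""
    hop = max(1, int(round(window_samples * step_ratio)))
    if total_samples <= window_samples:
        return [0]  # a single (possibly padded) window covers everything
    starts = list(range(0, total_samples - window_samples + 1, hop))
    if starts[-1] + window_samples < total_samples:
        # Assumed tail handling: one extra window that would need padding.
        starts.append(starts[-1] + hop)
    return starts


# 30 s of audio at 160k samples per 10 s window.
default_windows = chunk_starts(480_000, step_ratio=0.2)
dense_windows = chunk_starts(480_000, step_ratio=0.1)
```

Under these assumptions, halving the step ratio from 0.2 to 0.1 roughly doubles the number of windows to process (11 versus 21 for this 30 s example), which is the accuracy/speed trade-off the step ratio controls.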
Benchmarks
VoxConverse (232 clips, multi-speaker conversations). Segmentation uses 10s windows:
| Config | Audio Length | DER | JER | RTFx |
|---|---|---|---|---|
| Step ratio 0.2, min duration 1.0s (default) | 10s windows | 15.1% | 39.4% | 122x |
| Step ratio 0.1, min duration 0s (max accuracy) | 10s windows | 13.9% | 42.8% | 65x |
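The RTFx column can be read as wall-clock time with a small calculation. The sketch below assumes RTFx means audio duration divided by processing time (the usual inverse real-time factor); the source does not define the term, so treat the helper and its name as illustrative.

```python
def processing_seconds(audio_seconds: float, rtfx: float) -> float:
    """Estimated processing time, assuming RTFx = audio duration / processing time."""
    if rtfx <= 0:
        raise ValueError("rtfx must be positive")
    return audio_seconds / rtfx


one_hour = 3600.0
default_time = processing_seconds(one_hour, 122.0)   # default config
accurate_time = processing_seconds(one_hour, 65.0)   # max-accuracy config
```

Under that assumption, one hour of audio would take roughly 29.5 s with the default configuration and roughly 55.4 s with the max-accuracy configuration.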