Multi-segment TTS for long text: split → generate → concatenate
- prepare_tts_segments.py: splits text at sentence boundaries,
generates Python pre-computed embeds per segment
- Kotlin: detects multi-segment file format, processes each segment
independently (fresh KV cache), concatenates audio
- Long text tested: 3 segments, 335 tokens, 26.8s audio, RTF 1.67
File format: n_segments, then per segment: nPrefill, nTotal, embeds[]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>