kazeia

Kazeia Team 10fd10fd90	2026-04-14 16:22:15 +02:00

TTS: overlap CP↔BigVGAN — first audio 14.5s → 10.9s per segment

Streaming variant of the per-segment decode pipeline. As soon as SEQ_LEN codes are accumulated from the talker/CP loop, BigVGAN is dispatched on a background coroutine while the producer keeps generating the rest of the segment. The BigVGAN consumer feeds a streaming crossfader that emits stable audio as it arrives and holds back overlapSamples for the next chunk's blend.

Mirrors decodeChunked's semantics exactly, so the final audio is bit-identical modulo where fadeOut is applied (now on the final emission tail instead of the full buffer; the last 40ms still get faded).

Validated A/B on the same prompt 3 used in the recent benchmark:

    prompt: "Je me sens un peu triste aujourdhui…"
    seg 0 first audio: 14 485 ms → 10 936 ms (−3.5 s)
    end-to-end first audio (LLM trigger → audio): 16.2 s → 12.7 s
    Stream LLM total: 33 234 ms → 28 594 ms (−4.6 s)

Short segments (<SEQ_LEN codes) and the legacy non-streaming callers (generateSegmentAudioVC, decodeChunked, multi-segment pipelines, etc.) are untouched. The new path is gated behind USE_STREAMING_DECODE so it can be reverted by flipping a single const if a regression is found.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
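
The hold-back behaviour described in the commit message can be illustrated with a small sketch. This is not the project's implementation: the class name `StreamingCrossfader`, the method names `push` and `finish`, the linear blend weights, and the restriction `fade_out <= overlap` are all assumptions made for illustration, and it is written in Python only to keep it self-contained. It assumes each decoded chunk starts with samples that overlap the tail of the previous chunk.

```python
from typing import List


class StreamingCrossfader:
    """Joins overlapping audio chunks as they arrive (illustrative sketch)."""

    def __init__(self, overlap: int, fade_out: int = 0) -> None:
        # Assumption: the final fade fits inside the held-back tail.
        if overlap < 0 or fade_out < 0 or fade_out > overlap:
            raise ValueError("need 0 <= fade_out <= overlap")
        self.overlap = overlap
        self.fade_out = fade_out
        self._held: List[float] = []  # tail withheld for the next blend

    def push(self, chunk: List[float]) -> List[float]:
        """Blend the held tail into `chunk` and return the stable samples."""
        n = min(len(self._held), len(chunk))  # samples that can be blended
        split = len(self._held) - n
        out = self._held[:split]              # held samples with no partner
        for i in range(n):
            w = (i + 1) / (n + 1)             # linear ramp (assumed shape)
            out.append(self._held[split + i] * (1.0 - w) + chunk[i] * w)
        rest = chunk[n:]
        keep = min(self.overlap, len(rest))   # hold back the new tail
        out.extend(rest[: len(rest) - keep])
        self._held = rest[len(rest) - keep:]
        return out

    def finish(self) -> List[float]:
        """Flush the held tail, fading out its last `fade_out` samples."""
        tail, self._held = self._held, []
        n = min(self.fade_out, len(tail))
        start = len(tail) - n
        for i in range(n):
            tail[start + i] *= 1.0 - (i + 1) / n
        return tail


xf = StreamingCrossfader(overlap=4, fade_out=2)
audio = xf.push([1.0] * 10)    # 6 stable samples out, 4 held back
audio += xf.push([1.0] * 10)   # 4 blended + 2 stable out, 4 held back
audio += xf.finish()           # held tail, last 2 samples faded
print(len(audio))
```

With two 10-sample chunks and a 4-sample overlap, the joined stream has 10 + 10 - 4 = 16 samples, and only the final emission carries the fade, which is the property the commit message describes for the streaming path.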
executorch-custom	TTS tremor investigation: identify cross-arch numerical floor, gate diag flags	2026-04-13 00:15:14 +02:00
executorch-patches	LLM: enable hybrid-mode export via num_sharding=1 — TTFT 2.9s → 113ms	2026-04-14 15:08:31 +02:00
kazeia-android	TTS: overlap CP↔BigVGAN — first audio 14.5s → 10.9s per segment	2026-04-14 16:22:15 +02:00
scripts	TTS: conditional tail-trim + export script accepts voice path arg	2026-04-13 11:32:33 +02:00
.gitignore	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
AI_HUB_QUALCOMM.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
ARCHITECTURE_PIPELINE.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
AVATAR_3D_RAPPORT.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
BENCHMARK_RAPPORT.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
BENCHMARK_ROOT_VS_NONROOT.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
DEPLOY_EXECUTORCH_NPU.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
DOCUMENTATION_KAZEIA.txt	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
GUIDE_ROOT_ONEPLUS_PAD3.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
KAZEIA-CLAUDE.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
RAPPORT_TTS.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
TTS_CALIBRATION_GUIDE.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
TTS_GPU_GUIDE.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
TTS_HEXAGON_NPU_GUIDE.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
TTS_RAPPORT_COMPLET.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
TTS_REPORT.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
kazeia-architecture.md	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
kazeia-no-root-report.md	docs: add before/after performance comparison to no-root report	2026-04-14 11:37:15 +02:00