Streaming variant of the per-segment decode pipeline. As soon as SEQ_LEN
codes are accumulated from the talker/CP loop, BigVGAN is dispatched on
a background coroutine while the producer keeps generating the rest of
the segment. The BigVGAN consumer feeds a streaming crossfader that
emits stable audio as it arrives and holds back the last overlapSamples
samples to blend with the start of the next chunk.
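
The early-dispatch idea above can be sketched roughly as follows. This is
an illustration only, not the code in this commit: the language
(TypeScript), the function name streamDecode, and the decode/onAudio
callbacks are all assumptions, and window overlap is left out for brevity.

```typescript
// Hypothetical sketch: streamDecode, decode and onAudio are illustrative names.
// A window is handed to the decoder as soon as seqLen codes have accumulated,
// while the loop keeps pulling codes from the producer.
async function streamDecode(
  codes: AsyncIterable<number>,
  seqLen: number,
  decode: (window: number[]) => Promise<Float32Array>,
  onAudio: (pcm: Float32Array) => void,
): Promise<void> {
  let pending: number[] = [];
  // Chain of in-flight decodes; chaining keeps the audio in order.
  let consumer: Promise<void> = Promise.resolve();

  const dispatch = (window: number[]): void => {
    consumer = consumer.then(async () => onAudio(await decode(window)));
  };

  for await (const code of codes) {
    pending.push(code);
    if (pending.length === seqLen) {
      dispatch(pending); // decode starts without blocking the producer loop
      pending = [];
    }
  }
  if (pending.length > 0) {
    dispatch(pending); // short remainder at the end of the segment
  }
  await consumer; // wait until every dispatched window has been emitted
}
```

The promise chain stands in for the background coroutine: each decode is
queued behind the previous one, so audio is emitted in order even though
the producer never waits for a decode to finish. A real implementation
would also need to handle decode errors raised while the producer is
still running.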
Mirrors decodeChunked's semantics, so the final audio is bit-identical
except for where fadeOut is applied: it now runs on the tail of the final
emission instead of on the full buffer, so the last 40 ms are still faded.
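
A minimal sketch of such a streaming crossfader, for illustration only.
The class name, the linear blend curve, and the linear fade in flush()
are assumptions, not taken from this commit. It also assumes every chunk
is at least overlapSamples long and that its first overlapSamples samples
cover the same time span as the tail held back from the previous chunk.

```typescript
// Hypothetical sketch: StreamingCrossfader is an illustrative name.
class StreamingCrossfader {
  private overlapSamples: number;
  private held: Float32Array = new Float32Array(0);

  constructor(overlapSamples: number) {
    this.overlapSamples = overlapSamples;
  }

  // Blends the held tail into the head of `chunk`, returns the samples that
  // are now stable, and holds back the last overlapSamples for the next call.
  push(chunk: Float32Array): Float32Array {
    if (chunk.length < this.overlapSamples) {
      throw new Error("chunk shorter than overlapSamples");
    }
    const body = Float32Array.from(chunk);
    const n = this.held.length; // 0 on the first chunk, overlapSamples after
    for (let i = 0; i < n; i++) {
      const t = (i + 1) / (n + 1); // linear ramp from old audio to new audio
      body[i] = this.held[i] * (1 - t) + chunk[i] * t;
    }
    const stableLen = body.length - this.overlapSamples;
    this.held = body.slice(stableLen);
    return body.slice(0, stableLen);
  }

  // Emits the held tail at end of segment, fading out the last fadeSamples
  // (clamped to the tail length) so the fade lands on the final emission.
  flush(fadeSamples: number): Float32Array {
    const tail = this.held;
    this.held = new Float32Array(0);
    const n = Math.min(fadeSamples, tail.length);
    for (let i = 0; i < n; i++) {
      tail[tail.length - n + i] *= 1 - (i + 1) / n;
    }
    return tail;
  }
}
```

Because the fade is applied in flush(), only the last emitted tail is
touched. This sketch can only fade within the held tail, so it covers a
40 ms fade only when overlapSamples spans at least 40 ms.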
Validated A/B on the same prompt (prompt 3) used in the recent benchmark:
prompt: "Je me sens un peu triste aujourdhui…"
seg 0 first audio: 14 485 ms → 10 936 ms (−3.5 s)
end-to-end first audio (LLM trigger → audio): 16.2 s → 12.7 s (−3.5 s)
Stream LLM total: 33 234 ms → 28 594 ms (−4.6 s)
Short segments (<SEQ_LEN codes) and the legacy non-streaming callers
(generateSegmentAudioVC, decodeChunked, multi-segment pipelines, etc.)
are untouched. The new path is gated behind USE_STREAMING_DECODE so it
can be reverted by flipping a single const if a regression is found.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>