Replaces the fixed maxGen + length-based boost with a fully dynamic
end-of-utterance detector that watches the model's own EOS logit rank.
End result on the Baer 3-segment monologue, validated by user as
"FORMIDABLE" / "impeccable" with both Damien and Zelda voices:
- All 3 segments terminate via EOS (no maxGen cap hit)
- No "page beg beg" filler tail
- No abrupt cuts between segments
- Audio durations 5-8 s per segment, within ~10 % of the Python output
How it works (runHexGenWithPrefill, in tts/Qwen3TtsEngine.kt):
1. At every decode step, compute the rank of CODEC_EOS in the
repetition-penalised logits. Mid-utterance the rank sits at
150-700 (model is committed to producing speech). Approaching
the natural end, the rank dips toward top-50.
2. Arm the boost only when EOS rank stays below eosRankTrigger=60
for THREE consecutive steps. The 3-step requirement filters out
transient single-step dips that occur during low-energy phonemes
mid-sentence (without it, short sentences would terminate after
~3 s). Arming is also gated by eosBoostMinStep (50 % of expected
speech length) so we never arm in the very first frames.
3. Once armed, the boost grows monotonically: each subsequent
step adds boostStepsActive * eosBoostScale to the EOS logit. The
accumulated boost lifts EOS above top-1 within 1-3 steps, the
argmax check fires, and the loop breaks. Scale=4 gives the model
a small natural decay before termination; scale=5 terminated
cleanly but clipped slightly, and scale=3 was too weak to outpace
the growing top-1 logit.
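
The three steps above can be sketched roughly as follows. This is an
illustration, not the code in runHexGenWithPrefill: eosRank, EosBoost,
boostFor and requiredStreak are hypothetical names, and the sketch
assumes the boost applied at a given step is boostStepsActive *
eosBoostScale (a linear ramp starting on the step after arming) and
that rank 0 means EOS is the argmax.

```kotlin
/** Rank of eosId in the logits; 0 means EOS is currently the argmax. */
fun eosRank(logits: FloatArray, eosId: Int): Int {
    val eosLogit = logits[eosId]
    return logits.count { it > eosLogit }
}

/** Hypothetical helper tracking the arming streak and the boost ramp. */
class EosBoost(
    private val eosRankTrigger: Int = 60,
    private val eosBoostScale: Float = 4f,
    private val eosBoostMinStep: Int = 0,
    private val requiredStreak: Int = 3 // assumed name for the 3-step rule
) {
    private var lowRankStreak = 0
    private var boostStepsActive = 0
    var armed: Boolean = false
        private set

    /** Returns the value to add to the EOS logit at this decode step. */
    fun boostFor(step: Int, rank: Int): Float {
        if (armed) {
            // Once armed, the ramp never resets: 1*scale, 2*scale, ...
            boostStepsActive += 1
            return boostStepsActive * eosBoostScale
        }
        // A single step at or above the trigger resets the streak.
        lowRankStreak = if (rank < eosRankTrigger) lowRankStreak + 1 else 0
        if (lowRankStreak >= requiredStreak && step >= eosBoostMinStep) {
            armed = true
        }
        return 0f
    }
}
```

In the decode loop, the returned value would be added to the
repetition-penalised EOS logit before the argmax check.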
Other tweaks bundled in this commit because they all contribute to
the clean output:
* Inter-segment gap 120 → 250 ms — gives the listener a perceived
sentence boundary instead of a hard concatenation.
* fadeOut(audio, 40) on every segment — cosine roll-off over the
last 40 ms so the EOS-clipped tail decays naturally instead of
sample-clipping.
* top_k 50 → 200 in the fallback sample call — wider pool to keep
EOS reachable when the boost narrowly fails to make EOS the argmax.
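
For illustration, the fade and the inter-segment gap could look like
the sketch below. It is not the committed code: the sampleRate
parameter and its 24 kHz default are assumptions, joinWithGap is a
hypothetical helper, and the raised-cosine window is one plausible
reading of "cosine roll-off".

```kotlin
import kotlin.math.PI
import kotlin.math.cos

/** Applies a raised-cosine fade to the last fadeMs of audio, in place. */
fun fadeOut(audio: FloatArray, fadeMs: Int, sampleRate: Int = 24_000): FloatArray {
    val n = minOf(audio.size, sampleRate * fadeMs / 1000)
    val start = audio.size - n
    for (i in 0 until n) {
        // Gain falls from just under 1.0 to exactly 0.0 at the last sample.
        val gain = 0.5 * (1.0 + cos(PI * (i + 1) / n))
        audio[start + i] = (audio[start + i] * gain).toFloat()
    }
    return audio
}

/** Concatenates segments with gapMs of silence between neighbours. */
fun joinWithGap(
    segments: List<FloatArray>,
    gapMs: Int = 250,
    sampleRate: Int = 24_000
): FloatArray {
    val gap = FloatArray(sampleRate * gapMs / 1000) // zero-filled silence
    var out = FloatArray(0)
    segments.forEachIndexed { i, seg ->
        if (i > 0) out += gap
        out += seg
    }
    return out
}
```

Under these assumptions, a 40 ms fade covers 960 samples and a 250 ms
gap inserts 6000 samples of silence.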
Voice swap is a 45 KB file push (damien_voice_prefix.bin and
damien_voice_suffix.bin). Successfully tested today with Elodie
(female, norm 10.12) and Zelda (norm 9.39) using Damien (norm 10.36)
as the baseline — same Kotlin code, no rebuild needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>