- ExecuTorchLlmEngine: the system prompt now forces French replies of 1-2
short sentences and includes /no_think so the full token budget goes to
the answer (Qwen3 was spending 120+ tokens inside <think>). eval_mode 0
matches our kv-mode export.
- Qwen3TtsEngine.generateSegmentAudioVC: when the Hexagon talker socket
isn't open, fall back to runInterleavedPteFromEmbeds so the Stage 3
streaming session still produces audio. Without this fallback, the session
opened, accepted sentences, and silently emitted empty PCM.
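
For reviewers, a minimal sketch of the fallback shape described in the
second bullet. Only generateSegmentAudioVC and runInterleavedPteFromEmbeds
are names from this change. The class name, the socket probe, the
Hexagon-path callback, and the FloatArray/ShortArray signatures are
assumptions made for illustration, not the real engine API.

```kotlin
// Hypothetical sketch: only generateSegmentAudioVC and
// runInterleavedPteFromEmbeds are names taken from this commit.
class TalkerFallbackSketch(
    private val isHexagonSocketOpen: () -> Boolean,                      // assumed probe
    private val runHexagonTalker: (FloatArray) -> ShortArray,            // assumed primary path
    private val runInterleavedPteFromEmbeds: (FloatArray) -> ShortArray, // fallback path
) {
    // Returns PCM for one segment, choosing the backend on every call so a
    // closed socket no longer results in an empty buffer.
    fun generateSegmentAudioVC(embeds: FloatArray): ShortArray =
        if (isHexagonSocketOpen()) runHexagonTalker(embeds)
        else runInterleavedPteFromEmbeds(embeds)
}

fun main() {
    val engine = TalkerFallbackSketch(
        isHexagonSocketOpen = { false },          // simulate the closed socket
        runHexagonTalker = { ShortArray(0) },     // would have produced nothing
        runInterleavedPteFromEmbeds = { embeds -> ShortArray(embeds.size) },
    )
    // With the socket closed, the fallback path supplies the samples.
    println(engine.generateSegmentAudioVC(FloatArray(4)).size)
}
```

Probing the socket per call rather than once per session is a choice of
this sketch; the actual code may cache the state.
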

Documents the QNN SDK version-skew pitfall in ExecuTorchLlmEngine.kt
ahead of the upcoming migration to a unified v2.42 toolchain.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>