Even with /no_think in the system prompt Qwen3 still emits an empty
<think>…</think> wrapper before the real answer. Without filtering, the
SentenceStreamer treats '<think>' as a sentence boundary and feeds three
tokens of XML into the TTS, producing audible parasites at the start of
each reply.
The new in-callback filter buffers a small lookahead (just enough to span
"</think>"), suppresses everything between the open and close tags, and
flushes the surrounding prose to onToken in order. With the lookahead, tags
that arrive split across decoded pieces ("<thi"+"nk>") still match.
Validated end-to-end: prompt 'Bonjour, comment vas-tu ?' now streams
sentence-by-sentence to the TTS — first segment "Bonjour !" reaches the
talker at 4.6 s, no <think> sneak-through.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>