kazeia/kazeia-android/app
Kazeia Team f17131aefb UI: reveal Kazeia reply in sync with TTS audio (per-sentence, per-word)
Matches the 'conversation' feel the user asked for. Previously the
full LLM response appeared in the chat as soon as generation
finished, then audio played 5–10 s later — text and sound felt
decoupled. Now:

- The KAZEIA bubble is created empty and only starts filling when
  the first TTS segment actually starts playing through the speaker
  (we already split the response by sentence for the chained-
  MediaPlayer pipeline; that split drives the reveal too).
- Inside each sentence, words are appended one by one, with one new
  word every (audio duration / word count), so a sentence whose audio
  runs longer reveals more slowly and the text follows the speech
  pacing. The first word of each sentence appears immediately so
  audio and text stay aligned at the start.
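
A minimal sketch of the per-word cadence described above, assuming the
duration is in milliseconds and words are split on whitespace. The names
`RevealStep` and `revealSchedule` are hypothetical and not taken from the
Kazeia code; the real reveal runs in a coroutine, whereas this only
computes when each word would appear:

```kotlin
// Hypothetical helper: one entry per word, giving the offset from the start
// of the segment's audio and the text that should be visible at that point.
data class RevealStep(val atMs: Long, val visibleText: String)

fun revealSchedule(sentence: String, durationMs: Long): List<RevealStep> {
    // Assumption: whitespace-separated tokens count as words.
    val words = sentence.trim().split(Regex("\\s+")).filter { it.isNotEmpty() }
    if (words.isEmpty()) return emptyList()

    // Interval between words = audio duration / word count.
    val stepMs = durationMs.coerceAtLeast(0L) / words.size

    // Word 0 is scheduled at offset 0, so it appears as soon as audio starts.
    return words.indices.map { i ->
        RevealStep(
            atMs = i * stepMs,
            visibleText = words.take(i + 1).joinToString(" ")
        )
    }
}
```

For a five-word sentence with 2000 ms of audio this yields one step every
400 ms, starting at 0 ms. A reveal coroutine could walk the list, wait for
the gap between consecutive `atMs` values, and pass each `visibleText` to
the bubble update.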

Implementation:
- Qwen3TtsEngine: added `onSegmentPlaying(sentence, durationMs)`
  listener, invoked from the chained-MediaPlayer worker the moment
  each segment's MediaPlayer.start() lands. Sentence + duration are
  carried end-to-end via a new SegmentReady data class.
- KazeiaPipeline.speakText: forwards an optional listener down to
  the TTS engine, same signature.
- KazeiaService: new updateMessageText(id, text) helper. In
  processLlmResponse, the bubble is added empty before speakText and
  grown by a reveal coroutine per sentence; after speakText returns
  we snap to the full text as a safety net.
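
The wiring above can be sketched roughly as follows. Only `SegmentReady`,
`onSegmentPlaying`, `speakText`, `updateMessageText` and
`processLlmResponse` are names from this commit; their parameter types,
the `Stub*` classes and `MessageStore` are assumptions made for
illustration. The stub engine calls the listener synchronously where the
real one would call it right after `MediaPlayer.start()`, and the
per-word reveal is left out so the sketch stays at sentence granularity:

```kotlin
// Assumed shape: the commit only says sentence + duration are carried together.
data class SegmentReady(val sentence: String, val durationMs: Long)

// Assumed listener type matching onSegmentPlaying(sentence, durationMs).
typealias OnSegmentPlaying = (sentence: String, durationMs: Long) -> Unit

// Hypothetical stand-in for the TTS engine: no audio, just the callback.
class StubTtsEngine {
    fun speak(segments: List<SegmentReady>, onSegmentPlaying: OnSegmentPlaying? = null) {
        for (segment in segments) {
            onSegmentPlaying?.invoke(segment.sentence, segment.durationMs)
        }
    }
}

// Hypothetical stand-in for the pipeline: forwards the optional listener unchanged.
class StubPipeline(private val tts: StubTtsEngine) {
    fun speakText(segments: List<SegmentReady>, onSegmentPlaying: OnSegmentPlaying? = null) =
        tts.speak(segments, onSegmentPlaying)
}

// Hypothetical in-memory replacement for the chat message list.
class MessageStore {
    private val messages = mutableMapOf<Int, String>()
    fun add(id: Int, text: String) { messages[id] = text }
    fun updateMessageText(id: Int, text: String) { messages[id] = text }
    fun textOf(id: Int): String = messages[id].orEmpty()
}

fun processLlmResponse(
    store: MessageStore,
    pipeline: StubPipeline,
    id: Int,
    segments: List<SegmentReady>
) {
    val fullText = segments.joinToString(" ") { it.sentence }

    // The bubble is added empty before speaking starts.
    store.add(id, "")

    val revealed = StringBuilder()
    pipeline.speakText(segments) { sentence, _ ->
        // Grow the bubble as each segment starts playing.
        if (revealed.isNotEmpty()) revealed.append(' ')
        revealed.append(sentence)
        store.updateMessageText(id, revealed.toString())
    }

    // Safety net: snap to the full text once speakText has returned.
    store.updateMessageText(id, fullText)
}
```

Passing the listener as an optional parameter keeps callers that have no
bubble to update unaffected, since they can simply omit it.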

No change to the stream_llm debug intent path — it still uses the
old enqueueSentence flow directly and doesn't need the reveal (no
UI bubble there).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 22:58:18 +02:00
src/main UI: reveal Kazeia reply in sync with TTS audio (per-sentence, per-word) 2026-04-14 22:58:18 +02:00
build.gradle.kts TTS tremor investigation: identify cross-arch numerical floor, gate diag flags 2026-04-13 00:15:14 +02:00
proguard-rules.pro Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch 2026-04-09 08:42:11 +02:00