UI: large central orb w/ spectrum-inside + per-voice palette

Complete redesign of AudioVisualizerView based on feedback: the orb
is now the app's visual face, takes the top ~60% of the chat area,
and has clearly distinct behaviour in each state.

- **Idle**: slow 5 s breathing (scale 0.88 → 1.00 via cos easing),
  pure round shape, soft halo in phase. No high-frequency motion.

- **Listening**: organic blob outline built from 8 Fourier modes
  whose amplitude scales with live mic RMS; a thin shimmering arc
  rotates around the orb while mic energy is present; continuous
  micro-ripples pulse outward. Looks clearly 'alive and attentive'
  vs Idle's static breathing.

- **Speaking**: the orb becomes a contained spectrometer. A pre-
  computed log-spaced spectrogram (12 bands, 120 Hz–4 kHz,
  Hann-windowed FFT, one column per 50 ms of audio) is rendered as
  vertical rounded-rectangle bars CLIPPED to the sphere outline so
  they really look like the sphere itself speaking. Bar heights
  interpolate between spectrogram frames and exponentially smooth
  toward the target for fluid 60 fps motion. Outer halo pulses with
  the RMS envelope; ripples release on envelope peaks.

- **Per-voice color**. Eight-entry palette (Damien lavender,
  Elodie rose, Jerome aqua, Richard amber, Amir emerald, Didier
  indigo, Sid peach, Zelda periwinkle). Halo, accent, bars, ring,
  and ripples are all derived from a single voiceColor so switching
  the voice spinner tweens the entire scene to the new identity
  over a few frames. The color is stored on KazeiaService (for
  persistence across process/view rebinds) and also pushed directly
  to the view for instant feedback at selection time.
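
The palette tween described in the last item can be sketched as a per-channel blend of packed ARGB ints. This is only an illustration: the view's actual `lerpColor` helper is not shown in this excerpt, so the body below, the snap-by-one rule, and the `framesToConverge` helper are assumptions rather than the committed code.

```kotlin
import kotlin.math.roundToInt

/**
 * Hypothetical sketch of a per-channel ARGB blend (not the committed helper).
 * Moves each channel a fraction [t] of the way from [from] toward [to].
 */
fun lerpColor(from: Int, to: Int, t: Float): Int {
    require(t > 0f && t <= 1f) { "t must be in (0, 1]" }
    fun channel(shift: Int): Int {
        val a = (from ushr shift) and 0xFF
        val b = (to ushr shift) and 0xFF
        val step = ((b - a) * t).roundToInt()
        // When the rounded step is zero, nudge by one unit so the tween
        // cannot stall a few units short of the target.
        val nudge = if (b > a) 1 else if (b < a) -1 else 0
        val v = if (step == 0) a + nudge else a + step
        return (v and 0xFF) shl shift
    }
    return channel(24) or channel(16) or channel(8) or channel(0)
}

/** Counts how many calls (frames) the tween needs to reach [to] exactly. */
fun framesToConverge(from: Int, to: Int, t: Float): Int {
    var current = from
    var frames = 0
    while (current != to) {
        current = lerpColor(current, to, t)
        frames++
    }
    return frames
}
```

With a fixed factor such as the 0.12 used in `doFrame`, a plain integer blend can stop short of the target once the per-frame step rounds to zero; the one-unit nudge above is one way to guarantee that the orb lands on the exact voice color.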

Sidecar pipeline changes:
- Qwen3TtsEngine now computes per-segment spectrogram alongside the
  RMS envelope (new computeSpectrogram + an in-place radix-2 FFT).
  FFT_SIZE = 1024, hop = 50 ms, 12 log-spaced bands.  SegmentReady
  carries both arrays; onSegmentPlaying is (sentence, durationMs,
  rmsEnvelope, spectrogram).
- KazeiaPipeline.speakText forwards the new callback shape.
- KazeiaService.VisualizerSignal.Speaking now carries the
  spectrogram; KazeiaService also exposes the new voiceColor
  StateFlow.
- ChatActivity passes both to the view and collects voiceColor.
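
As a worked check of the sidecar parameters above, the log-spaced band edges can be mapped to FFT bin indices. The sketch mirrors the arithmetic in computeSpectrogram; the standalone function name `bandEdgeBins` and its default arguments are illustrative, and the 24 kHz sample rate is taken from the FFT_SIZE comment in the diff.

```kotlin
import kotlin.math.pow

/**
 * Log-spaced band edges expressed as FFT bin indices.
 * Illustrative helper; defaults follow the values quoted in this commit.
 */
fun bandEdgeBins(
    sampleRate: Int = 24_000, // assumed from the "24 kHz" comment
    fftSize: Int = 1024,
    bands: Int = 12,
    fMin: Double = 120.0,
    fMax: Double = 4000.0
): IntArray {
    val binHz = sampleRate.toDouble() / fftSize // 23.4375 Hz per bin
    return IntArray(bands + 1) { i ->
        // Geometric spacing: each edge is a constant ratio above the previous one.
        val f = fMin * (fMax / fMin).pow(i.toDouble() / bands)
        (f / binHz).toInt().coerceIn(1, fftSize / 2 - 1)
    }
}

/** Number of FFT bins averaged into each band. */
fun binsPerBand(edges: IntArray): IntArray =
    IntArray(edges.size - 1) { b -> maxOf(edges[b + 1] - edges[b], 1) }
```

With these parameters the edges run from bin 5 to bin 170, and the lowest band covers a single FFT bin, so its bar follows one bin's magnitude while the top band averages several dozen bins.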

Layout: vertical chain between audioViz (weight 3) and rvMessages
(weight 2) so the orb owns ~60% of the chat panel and the chat list
takes the remainder. Removed the fixed 140 dp constraint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Kazeia Team 2026-04-14 23:33:38 +02:00
parent 8939c680b2
commit 06dcd76dcb
6 changed files with 593 additions and 160 deletions

@@ -150,7 +150,12 @@ class KazeiaPipeline {
// processLlmResponse to defer the KAZEIA chat bubble appearance
// until sound is audible, pace word-by-word reveal inside the
// bubble, and drive the AudioVisualizerView orb.
onSegmentPlaying: ((sentence: String, durationMs: Long, rmsEnvelope: FloatArray) -> Unit)? = null
onSegmentPlaying: ((
sentence: String,
durationMs: Long,
rmsEnvelope: FloatArray,
spectrogram: Array<FloatArray>
) -> Unit)? = null
) {
val ttsEngine = tts ?: return
_pipelineState.value = PipelineState.Speaking

@@ -92,11 +92,25 @@ class KazeiaService : Service() {
sealed class VisualizerSignal {
object Idle : VisualizerSignal()
data class Listening(val micRms: Float) : VisualizerSignal()
data class Speaking(val rmsEnvelope: FloatArray, val durationMs: Long) : VisualizerSignal()
data class Speaking(
val rmsEnvelope: FloatArray,
val spectrogram: Array<FloatArray>,
val durationMs: Long
) : VisualizerSignal()
}
private val _visualizerSignal = MutableStateFlow<VisualizerSignal>(VisualizerSignal.Idle)
val visualizerSignal: StateFlow<VisualizerSignal> = _visualizerSignal
// Kazeia's orb color is bound to the selected voice so the user
// visually associates a palette with the speaker they picked. UI
// sets this whenever the voice spinner changes; the orb view
// listens via the StateFlow and tweens the current → target color.
private val _voiceColor = MutableStateFlow(0xFFBCA4E8.toInt()) // lavender = Damien default
val voiceColor: StateFlow<Int> = _voiceColor
/** Called by the UI whenever the voice selector changes. */
fun setVoiceColor(color: Int) { _voiceColor.value = color }
private val _debugMode = MutableStateFlow(false)
val debugMode: StateFlow<Boolean> = _debugMode
@@ -1238,12 +1252,14 @@ class KazeiaService : Service() {
var revealedSoFar = ""
val revealJobs = mutableListOf<kotlinx.coroutines.Job>()
try {
pipeline.speakText(responseText) { sentence, durationMs, envelope ->
// Push the envelope to the visualizer at the same
// moment the MediaPlayer starts playing so the orb
// reacts to this segment's actual energy.
pipeline.speakText(responseText) { sentence, durationMs, envelope, spectrogram ->
// Push the envelope + spectrogram to the
// visualizer at the same moment the MediaPlayer
// starts playing so the orb reacts to this
// segment's actual energy and the in-sphere
// spectrum bars match the audio content.
_visualizerSignal.value =
VisualizerSignal.Speaking(envelope, durationMs)
VisualizerSignal.Speaking(envelope, spectrogram, durationMs)
// Start a coroutine that appends one word at a time
// over the segment's audio duration. Words are
// separated on whitespace; punctuation rides with

@@ -113,6 +113,14 @@ class Qwen3TtsEngine(
// = 1200 samples/window — small enough for a 60 fps visualizer to
// track formants, large enough to run at negligible CPU cost.
const val ENVELOPE_WINDOW_MS = 50
// FFT size for the spectrum-in-sphere sidecar. 1024 samples at
// 24 kHz = 43 ms — slightly narrower than the hop so each frame
// gives a clean snapshot centered on its hop boundary.
private const val FFT_SIZE = 1024
// Number of log-spaced bands 120 Hz–4 kHz rendered as vertical
// bars inside the sphere during Speaking. 12 feels like a real
// spectrometer without cluttering at smaller sphere sizes.
const val SPECTRUM_BANDS = 12
}
private var ortEnv: OrtEnvironment? = null
@@ -3383,15 +3391,17 @@ class Qwen3TtsEngine(
/**
* Fires the moment a synthesized segment starts playing through the
* speaker. [sentence] is the original text submitted to
* [enqueueSentence], [durationMs] is the WAV duration so the caller
* can drive a progressive-reveal UI timer matched to speech pacing,
* and [rmsEnvelope] is a per-[ENVELOPE_WINDOW_MS] normalized RMS
* sidecar the UI can use to drive an audio-reactive visualizer
* without needing access to the live PCM stream from MediaPlayer.
* Set before calling [startStreamingSession]; cleared on session end.
* speaker. Carries the sentence text, audio duration, per-window RMS
* envelope (for orb amplitude) and per-window log-spaced band
* spectrogram (for the spectrum-in-sphere visualizer). Both arrays
* share the same time axis: one entry per [ENVELOPE_WINDOW_MS].
*/
var onSegmentPlaying: ((sentence: String, durationMs: Long, rmsEnvelope: FloatArray) -> Unit)? = null
var onSegmentPlaying: ((
sentence: String,
durationMs: Long,
rmsEnvelope: FloatArray,
spectrogram: Array<FloatArray>
) -> Unit)? = null
private fun startStreamingSessionMp() {
if (sessionMpQueue != null) return
@@ -3422,8 +3432,9 @@ class Qwen3TtsEngine(
saveWav(wavPath, audio)
val durationMs = audio.size * 1000L / SR
val envelope = computeRmsEnvelope(audio)
nlog("MP seg $segIdx synthesized (${System.currentTimeMillis() - tSynth}ms, ${durationMs}ms audio, ${envelope.size} env windows), queued for playback")
wavChan.send(SegmentReady(segIdx, wavPath, sentence, durationMs, envelope))
val spectrogram = computeSpectrogram(audio)
nlog("MP seg $segIdx synthesized (${System.currentTimeMillis() - tSynth}ms, ${durationMs}ms audio, ${envelope.size} env × ${SPECTRUM_BANDS} bands), queued for playback")
wavChan.send(SegmentReady(segIdx, wavPath, sentence, durationMs, envelope, spectrogram))
} catch (e: Exception) {
nlog("MP synth error: ${e.message}")
}
@@ -3484,7 +3495,7 @@ class Qwen3TtsEngine(
current = prepareMp(first.wavPath, first.segIdx)
current!!.setOnCompletionListener { it.release() }
current!!.start()
try { onSegmentPlaying?.invoke(first.sentence, first.durationMs, first.rmsEnvelope) } catch (_: Exception) {}
try { onSegmentPlaying?.invoke(first.sentence, first.durationMs, first.rmsEnvelope, first.spectrogram) } catch (_: Exception) {}
nlog("MP seg ${first.segIdx} started (chained, ${first.durationMs}ms)")
while (true) {
@@ -3506,7 +3517,7 @@ class Qwen3TtsEngine(
// `next` player was chained via setNextMediaPlayer and has
// auto-started at this point; notify the UI so it can start
// revealing the sentence in sync with the audio.
try { onSegmentPlaying?.invoke(currentInfo!!.sentence, currentInfo!!.durationMs, currentInfo!!.rmsEnvelope) } catch (_: Exception) {}
try { onSegmentPlaying?.invoke(currentInfo!!.sentence, currentInfo!!.durationMs, currentInfo!!.rmsEnvelope, currentInfo!!.spectrogram) } catch (_: Exception) {}
next = null
nextInfo = null
}
@@ -3535,7 +3546,8 @@ class Qwen3TtsEngine(
val wavPath: String,
val sentence: String,
val durationMs: Long,
val rmsEnvelope: FloatArray
val rmsEnvelope: FloatArray,
val spectrogram: Array<FloatArray>
)
/** Compute a per-ENVELOPE_WINDOW_MS normalized RMS envelope from a
@@ -3564,6 +3576,103 @@ class Qwen3TtsEngine(
return env
}
/** Compute a per-window log-spaced band spectrogram used by the
* spectrum-in-sphere visualizer. Time axis aligned with the RMS
* envelope (one column per ENVELOPE_WINDOW_MS). FFT size is 1024
* samples (~43 ms at 24 kHz), windowed with Hann and centered on
* each hop. [SPECTRUM_BANDS] log-spaced bands from 120 Hz to
* 4 kHz covers the vocal formant range without wasting visual
* space on silent sub-100 Hz or fricative >4 kHz content. */
private fun computeSpectrogram(audio: ShortArray): Array<FloatArray> {
if (audio.isEmpty()) return emptyArray()
val fftSize = FFT_SIZE
val hopSamples = SR * ENVELOPE_WINDOW_MS / 1000
val nFrames = (audio.size + hopSamples - 1) / hopSamples
// Pre-compute band edges as FFT bin indices.
val binHzRes = SR.toDouble() / fftSize
val fMin = 120.0; val fMax = 4000.0
val bandEdges = IntArray(SPECTRUM_BANDS + 1) { i ->
val f = fMin * Math.pow(fMax / fMin, i.toDouble() / SPECTRUM_BANDS)
(f / binHzRes).toInt().coerceIn(1, fftSize / 2 - 1)
}
// Hann window — reduces spectral leakage, gives cleaner bars.
val hann = FloatArray(fftSize) { i ->
(0.5 - 0.5 * Math.cos(2.0 * Math.PI * i / (fftSize - 1))).toFloat()
}
val re = FloatArray(fftSize)
val im = FloatArray(fftSize)
val result = Array(nFrames) { FloatArray(SPECTRUM_BANDS) }
for (f in 0 until nFrames) {
// Center the window on the hop midpoint.
val center = f * hopSamples + hopSamples / 2
val start = center - fftSize / 2
for (i in 0 until fftSize) {
val idx = start + i
val sample = if (idx in audio.indices) audio[idx].toFloat() / 32768f else 0f
re[i] = sample * hann[i]
im[i] = 0f
}
fftInPlace(re, im)
for (b in 0 until SPECTRUM_BANDS) {
val bStart = bandEdges[b]
val bEnd = bandEdges[b + 1].coerceAtLeast(bStart + 1)
var sum = 0.0
for (k in bStart until bEnd) {
val reK = re[k].toDouble(); val imK = im[k].toDouble()
sum += reK * reK + imK * imK
}
val mag = Math.sqrt(sum / (bEnd - bStart))
// Log-compress + normalize. Speech energy per band rarely
// exceeds ~0.1 before log; the constants below bring the
// typical range to [0.2, 0.95] for visible bar motion.
result[f][b] = (Math.log10(1.0 + mag * 80) / Math.log10(7.0))
.toFloat().coerceIn(0f, 1f)
}
}
return result
}
/** In-place radix-2 Cooley–Tukey FFT. Size must be a power of 2. */
private fun fftInPlace(re: FloatArray, im: FloatArray) {
val n = re.size
// Bit-reversal permutation.
var j = 0
for (i in 1 until n) {
var bit = n shr 1
while (j and bit != 0) { j = j xor bit; bit = bit shr 1 }
j = j or bit
if (i < j) {
val tr = re[i]; re[i] = re[j]; re[j] = tr
val ti = im[i]; im[i] = im[j]; im[j] = ti
}
}
// Butterflies.
var size = 2
while (size <= n) {
val half = size / 2
val step = n / size
val angleBase = -2.0 * Math.PI / size
var m = 0
while (m < n) {
var k = 0
for (i in m until m + half) {
val angle = (angleBase * k).toFloat()
val c = kotlin.math.cos(angle)
val s = kotlin.math.sin(angle)
val tRe = re[i + half] * c - im[i + half] * s
val tIm = re[i + half] * s + im[i + half] * c
re[i + half] = re[i] - tRe
im[i + half] = im[i] - tIm
re[i] = re[i] + tRe
im[i] = im[i] + tIm
k += step
}
m += size
}
size *= 2
}
}
private suspend fun waitForPlaybackCompletion(
mp: android.media.MediaPlayer, segIdx: Int
) {

@@ -4,41 +4,47 @@ import android.content.Context
import android.graphics.Canvas
import android.graphics.Color
import android.graphics.Paint
import android.graphics.Path
import android.graphics.RadialGradient
import android.graphics.Shader
import android.util.AttributeSet
import android.view.Choreographer
import android.view.View
import kotlin.math.PI
import kotlin.math.cos
import kotlin.math.max
import kotlin.math.min
import kotlin.math.sin
import kotlin.math.sqrt
/**
* Épuré audio-reactive orb visualizer for the TTS + STT feedback loop.
* Large, central orb visualizer: Kazeia's visual "face". Three
* distinct states, each tuned to feel different at a glance:
*
* Three states driven by [setIdle], [setListening], [startSpeaking]:
* - **Idle (calm)**: the orb quietly breathes, a smooth scale
* oscillation 0.88 → 1.0 over a 5 s cycle, with a soft halo that
* pulses in phase. No high-frequency motion. Suggests "waiting,
* listening, not anxious".
*
* - **Idle**: fixed orb with a slow respiratory pulsation (~4 s cycle)
* and a faint halo, matching the "chatbot is awake, waiting" vibe.
* Minimal GPU work: a single draw per frame with easing precomputed.
* - **Listening (attentive)**: the orb settles slightly larger, a
* warmer bright ring appears around it, and its outline deforms
* organically with the live mic RMS (blob-like wobble, 8 Fourier
* modes, gain-mapped from the RMS). Micro-ripples emit
* continuously while speech is present. Feels alive and engaged,
* clearly different from Idle's static breathing.
*
* - **Listening**: the orb grows and its halo brightens with the live
* mic RMS passed into [setListening]. Concentric micro-waves ripple
* outward to confirm the app is hearing the user, before STT has any
* result. Useful feedback during the ~1 s silence gap before Whisper
* fires.
* - **Speaking (active)**: the orb is rendered **as a contained
* spectrometer**. Inside the sphere boundary, SPECTRUM_BANDS
* vertical bars rise from a horizontal baseline according to a
* pre-computed band-energy sidecar. The sphere outline pulses
* with the overall RMS envelope. The bars are clipped to the
* sphere so it really looks like "the sphere itself is speaking",
* not an overlaid spectrogram. Strong amplitude peaks release
* outward ripple waves on the halo.
*
* - **Speaking**: amplitude and halo track the pre-computed TTS RMS
* envelope (one float per 50 ms) passed into [startSpeaking]. The view
* walks through the envelope using its own internal timer synced to
* [durationMs], so it doesn't need MediaPlayer.getCurrentPosition.
* Outward ripples fire on each envelope peak above the current floor.
*
* All animation runs on [Choreographer.FrameCallback]. At Idle, the
* frame callback self-throttles to ~20 fps (still smooth for a 4 s
* breathing cycle) to keep CPU cost near zero. During Listening and
* Speaking it runs at display refresh (60/90/120 fps).
* The whole palette (core, halo, ring, bars, ripples) is re-derived
* from a single [voiceColor] setter so each speaker gets a distinct
* visual identity.
*/
class AudioVisualizerView @JvmOverloads constructor(
context: Context,
@@ -46,24 +52,20 @@ class AudioVisualizerView @JvmOverloads constructor(
defStyleAttr: Int = 0
) : View(context, attrs, defStyleAttr), Choreographer.FrameCallback {
// --- Configuration ---
// Colors picked for a calm, non-clinical feel. Soft lavender/blue
// core with a slightly warmer outer halo; all in the same hue family
// so transitions between states stay visually continuous.
private val coreColor = Color.parseColor("#BCA4E8") // soft lavender
private val haloColor = Color.parseColor("#8B6EC9") // deeper violet
private val rippleColor = Color.parseColor("#A48FDD") // between the two
companion object {
/** Must match Qwen3TtsEngine.SPECTRUM_BANDS. Asserted at startSpeaking. */
private const val SPECTRUM_BANDS = 12
/** Listening-mode outline deformation modes (even = smooth blobs). */
private const val BLOB_MODES = 8
}
// Amplitude gain so TTS signal ([0,1]) maps to perceptible size.
// Observed: normalized TTS RMS rarely exceeds ~0.5, so we stretch.
private val amplitudeGain = 1.8f
// --- State machine ---
// ---------- State ----------
private sealed class State {
object Idle : State()
data class Listening(var micRms: Float) : State()
data class Listening(var micRms: Float, var phaseSeed: Float) : State()
data class Speaking(
val envelope: FloatArray,
val spectrogram: Array<FloatArray>,
val durationMs: Long,
val startedAtMs: Long
) : State()
@@ -71,20 +73,43 @@ class AudioVisualizerView @JvmOverloads constructor(
@Volatile private var state: State = State.Idle
// --- Animation state (mutated on UI thread from doFrame) ---
private var frameStartNs = 0L
private var lastFrameNs = 0L
private var smoothedAmp = 0f // exponential smoothing on amplitude
private val ripples = ArrayList<Ripple>()
private var lastEnvelopeIdx = -1
// ---------- Palette (derived from voiceColor) ----------
private var targetCore = 0xFFBCA4E8.toInt() // default: lavender
private var currentCore = targetCore
private var currentHalo = deriveHalo(currentCore)
private var currentAccent = deriveAccent(currentCore)
// Paints are allocated once; colors/alphas tweaked per frame.
fun setVoiceColor(color: Int) {
targetCore = color or 0xFF000000.toInt() // force opaque
scheduleFrame()
}
// ---------- Animation state ----------
private var frameStartNs = 0L
private var smoothedAmp = 0f // 0..1 orb-size pulsation (all states)
private var smoothedBars = FloatArray(SPECTRUM_BANDS)
private var listeningRingPhase = 0f // rotating shimmer on listening ring
private val ripples = ArrayList<Ripple>()
private var lastSpectroIdx = -1
// ---------- Paints ----------
private val corePaint = Paint(Paint.ANTI_ALIAS_FLAG).apply { style = Paint.Style.FILL }
private val haloPaint = Paint(Paint.ANTI_ALIAS_FLAG).apply { style = Paint.Style.FILL }
private val ringPaint = Paint(Paint.ANTI_ALIAS_FLAG).apply {
style = Paint.Style.STROKE
}
private val ripplePaint = Paint(Paint.ANTI_ALIAS_FLAG).apply {
style = Paint.Style.STROKE
strokeWidth = 4f
strokeWidth = 3f
}
private val barPaint = Paint(Paint.ANTI_ALIAS_FLAG).apply {
style = Paint.Style.FILL_AND_STROKE
}
private val blobOutlinePaint = Paint(Paint.ANTI_ALIAS_FLAG).apply {
style = Paint.Style.STROKE
}
private val blobPath = Path()
private val spherePath = Path()
init {
setLayerType(LAYER_TYPE_HARDWARE, null)
@@ -93,25 +118,38 @@ class AudioVisualizerView @JvmOverloads constructor(
// ==================== Public API ====================
fun setIdle() {
state = State.Idle
if (state !is State.Idle) { state = State.Idle; lastSpectroIdx = -1 }
scheduleFrame()
}
fun setListening(micRms: Float) {
val clamped = micRms.coerceIn(0f, 1f)
val s = state
if (s is State.Listening) s.micRms = micRms.coerceIn(0f, 1f)
else state = State.Listening(micRms.coerceIn(0f, 1f))
if (s is State.Listening) {
s.micRms = clamped
} else {
state = State.Listening(clamped, (System.nanoTime() and 0xFFFF) / 65535f)
}
scheduleFrame()
}
fun startSpeaking(envelope: FloatArray, durationMs: Long) {
if (envelope.isEmpty() || durationMs <= 0) { setIdle(); return }
state = State.Speaking(envelope, durationMs, System.currentTimeMillis())
lastEnvelopeIdx = -1
fun startSpeaking(
envelope: FloatArray,
spectrogram: Array<FloatArray>,
durationMs: Long
) {
if (envelope.isEmpty() || spectrogram.isEmpty() || durationMs <= 0) {
setIdle(); return
}
state = State.Speaking(envelope, spectrogram, durationMs, System.currentTimeMillis())
lastSpectroIdx = -1
// Soft reset bar heights so the spectrum grows from zero rather
// than snapping to the idle smoothing residue.
for (i in smoothedBars.indices) smoothedBars[i] = 0f
scheduleFrame()
}
// ==================== View lifecycle ====================
// ==================== Lifecycle / scheduling ====================
override fun onAttachedToWindow() {
super.onAttachedToWindow()
@@ -134,27 +172,30 @@ class AudioVisualizerView @JvmOverloads constructor(
override fun doFrame(frameTimeNanos: Long) {
frameScheduled = false
lastFrameNs = frameTimeNanos
// Ease the palette toward the target (voice change tween).
currentCore = lerpColor(currentCore, targetCore, 0.12f)
currentHalo = deriveHalo(currentCore)
currentAccent = deriveAccent(currentCore)
val s = state
when (s) {
is State.Idle -> {
// Self-throttled loop at ~20 fps for the breathing pulse.
Choreographer.getInstance().postFrameCallbackDelayed(this, 50)
// Self-throttled at ~25 fps (40 ms delay): enough for a 5 s breathing
// cycle to look continuous, keeps CPU cost near zero.
Choreographer.getInstance().postFrameCallbackDelayed(this, 40)
frameScheduled = true
}
is State.Listening -> {
listeningRingPhase += 0.015f
Choreographer.getInstance().postFrameCallback(this)
frameScheduled = true
}
is State.Speaking -> {
val elapsed = System.currentTimeMillis() - s.startedAtMs
if (elapsed >= s.durationMs + 300) {
// Auto-fallback to Idle if no explicit transition.
// The +300 ms grace lets the final envelope decay
// finish visibly before we snap back.
state = State.Idle
Choreographer.getInstance().postFrameCallbackDelayed(this, 50)
lastSpectroIdx = -1
Choreographer.getInstance().postFrameCallbackDelayed(this, 40)
frameScheduled = true
} else {
Choreographer.getInstance().postFrameCallback(this)
@@ -170,96 +211,321 @@ class AudioVisualizerView @JvmOverloads constructor(
override fun onDraw(canvas: Canvas) {
super.onDraw(canvas)
val w = width.toFloat(); val h = height.toFloat()
if (w <= 0 || h <= 0) return
if (w <= 0f || h <= 0f) return
val cx = w / 2f; val cy = h / 2f
val maxR = min(w, h) * 0.42f
// Compute target amplitude in [0, 1] for the current state.
// 78% of min axis: large enough to feel central, 11% margin
// keeps ripples/ring from clipping.
val maxR = min(w, h) * 0.39f
val now = System.currentTimeMillis()
val target: Float = when (val s = state) {
is State.Idle -> {
// 4 s breathing cycle via a soft sine; amplitude 0 → 0.12.
val t = (now - frameStartNs / 1_000_000) % 4000L / 4000f
0.06f + 0.06f * (0.5f + 0.5f * sin((t * 2f * Math.PI).toFloat()))
}
is State.Listening -> {
// Base breathing + live mic contribution.
val t = (now - frameStartNs / 1_000_000) % 4000L / 4000f
val breath = 0.08f + 0.04f * (0.5f + 0.5f * sin((t * 2f * Math.PI).toFloat()))
breath + 0.55f * s.micRms
}
is State.Speaking -> {
val idxF = (now - s.startedAtMs).toFloat() *
s.envelope.size / s.durationMs.toFloat()
val idx = idxF.toInt().coerceIn(0, s.envelope.size - 1)
val frac = (idxF - idx).coerceIn(0f, 1f)
val a = s.envelope[idx]
val b = s.envelope[min(idx + 1, s.envelope.size - 1)]
val env = a + (b - a) * frac
// Emit a ripple whenever we cross a local peak above a
// floor, at most once per envelope step.
if (idx != lastEnvelopeIdx && env > 0.35f) {
val prev = if (idx > 0) s.envelope[idx - 1] else 0f
val next = if (idx < s.envelope.size - 1) s.envelope[idx + 1] else 0f
when (val s = state) {
is State.Idle -> drawIdle(canvas, cx, cy, maxR, now)
is State.Listening -> drawListening(canvas, cx, cy, maxR, now, s)
is State.Speaking -> drawSpeaking(canvas, cx, cy, maxR, now, s)
}
}
// ---------- Idle ----------
private fun drawIdle(canvas: Canvas, cx: Float, cy: Float, maxR: Float, now: Long) {
// 5 s breathing cycle, amplitude 0.88 → 1.00.
val t = ((now - frameStartNs / 1_000_000) % 5000L) / 5000f
val breath = 0.5f - 0.5f * cos((t * 2.0 * PI).toFloat()) // 0..1
val scale = 0.88f + 0.12f * breath
val radius = maxR * scale
smoothedAmp += ((breath * 0.5f) - smoothedAmp) * 0.1f
// Halo (soft, breathing in phase).
drawHalo(canvas, cx, cy, maxR * 1.15f * scale, alphaBase = 60, alphaGain = 70)
// Core — pure round, no deformation.
drawCore(canvas, cx, cy, radius, shimmer = 0f)
// Subtle inner highlight — feels alive without movement.
val hl = Paint(Paint.ANTI_ALIAS_FLAG).apply {
style = Paint.Style.FILL
shader = RadialGradient(
cx - radius * 0.25f, cy - radius * 0.25f, radius * 0.9f,
Color.argb(60, 255, 255, 255),
Color.argb(0, 255, 255, 255),
Shader.TileMode.CLAMP
)
}
canvas.drawCircle(cx, cy, radius, hl)
}
// ---------- Listening ----------
private fun drawListening(
canvas: Canvas, cx: Float, cy: Float, maxR: Float, now: Long, s: State.Listening
) {
// Base size slightly larger than Idle so the transition reads.
val baseScale = 0.93f + 0.08f * s.micRms
val radius = maxR * baseScale
smoothedAmp += (s.micRms - smoothedAmp) * 0.25f
// Halo — brighter than Idle, responds to mic.
drawHalo(canvas, cx, cy, maxR * 1.22f * baseScale,
alphaBase = 90, alphaGain = (130 * s.micRms).toInt().coerceIn(0, 160))
// Deformed outline (blob): Fourier modes over the circle.
buildBlobPath(blobPath, cx, cy, radius, s.micRms, s.phaseSeed, now)
// Filled core with a radial gradient inside the blob path.
corePaint.shader = RadialGradient(
cx - radius * 0.15f, cy - radius * 0.25f, radius * 1.1f,
currentCore, deriveCoreEdge(currentCore),
Shader.TileMode.CLAMP
)
canvas.save()
canvas.clipPath(blobPath)
canvas.drawCircle(cx, cy, radius * 1.3f, corePaint)
canvas.restore()
// Outline of the blob, slightly thicker as RMS rises.
blobOutlinePaint.strokeWidth = 2f + 2f * s.micRms
blobOutlinePaint.color = withAlpha(currentAccent, 180)
canvas.drawPath(blobPath, blobOutlinePaint)
// Rotating shimmer ring — a thin arc segment chasing around.
drawListeningRing(canvas, cx, cy, radius * 1.08f, s.micRms)
// Continuous micro-ripples while listening.
val rmsMicroFloor = 0.12f
if (s.micRms > rmsMicroFloor && ((now / 90) % 3 == 0L)) {
ripples.add(Ripple(bornAtMs = now, peak = s.micRms))
}
drawRipples(canvas, cx, cy, maxR, now, listeningMode = true)
}
private fun drawListeningRing(
canvas: Canvas, cx: Float, cy: Float, radius: Float, rms: Float
) {
// Thin shimmer arc rotating around the orb, width/alpha scaling
// with mic RMS so silence shows almost nothing.
if (rms < 0.04f) return
ringPaint.strokeWidth = 2.5f + 3f * rms
val sweep = 60f + 80f * rms
val start = (listeningRingPhase * 360f) % 360f
ringPaint.color = withAlpha(currentAccent, (140 + 110 * rms).toInt().coerceIn(0, 250))
val r = radius
canvas.drawArc(cx - r, cy - r, cx + r, cy + r, start, sweep, false, ringPaint)
// Subtle tail: a second, dimmer, shorter arc slightly offset.
ringPaint.color = withAlpha(currentAccent, (60 + 60 * rms).toInt().coerceIn(0, 160))
canvas.drawArc(cx - r, cy - r, cx + r, cy + r, start + sweep + 8f, sweep * 0.5f, false, ringPaint)
}
// ---------- Speaking ----------
private fun drawSpeaking(
canvas: Canvas, cx: Float, cy: Float, maxR: Float, now: Long, s: State.Speaking
) {
// Envelope → orb size pulsation, spectrogram → bars inside.
val elapsed = now - s.startedAtMs
val envIdxF = elapsed.toFloat() * s.envelope.size / s.durationMs
val envIdx = envIdxF.toInt().coerceIn(0, s.envelope.size - 1)
val envFrac = (envIdxF - envIdx).coerceIn(0f, 1f)
val env = lerp(
s.envelope[envIdx],
s.envelope[min(envIdx + 1, s.envelope.size - 1)],
envFrac
)
smoothedAmp += (env - smoothedAmp) * 0.30f
val scale = 0.92f + 0.16f * smoothedAmp
val radius = maxR * scale
// Halo pulses with amp; emit ripples on peaks.
drawHalo(canvas, cx, cy, maxR * 1.25f * scale,
alphaBase = 80, alphaGain = (140 * smoothedAmp).toInt().coerceIn(0, 200))
if (envIdx != lastSpectroIdx && env > 0.45f) {
val prev = if (envIdx > 0) s.envelope[envIdx - 1] else 0f
val next = if (envIdx < s.envelope.size - 1) s.envelope[envIdx + 1] else 0f
if (env >= prev && env >= next) {
ripples.add(Ripple(bornAtMs = now, peak = env))
}
lastEnvelopeIdx = idx
}
(env * amplitudeGain).coerceIn(0f, 1f)
lastSpectroIdx = envIdx
}
drawRipples(canvas, cx, cy, maxR, now, listeningMode = false)
// Sphere body — pure circle here, serves as the container for
// the spectrum bars.
spherePath.rewind()
spherePath.addCircle(cx, cy, radius, Path.Direction.CW)
corePaint.shader = RadialGradient(
cx - radius * 0.25f, cy - radius * 0.30f, radius * 1.2f,
currentCore, deriveCoreEdge(currentCore),
Shader.TileMode.CLAMP
)
canvas.drawPath(spherePath, corePaint)
// Spectrum bars, clipped to the sphere so they appear *inside*.
canvas.save()
canvas.clipPath(spherePath)
drawSpectrumBars(canvas, cx, cy, radius, s, elapsed)
canvas.restore()
// Outline ring on top so the sphere edge stays crisp after bar
// clipping.
blobOutlinePaint.strokeWidth = 2f + 3f * smoothedAmp
blobOutlinePaint.color = withAlpha(currentAccent, 220)
canvas.drawCircle(cx, cy, radius, blobOutlinePaint)
}
// Exponential smoothing so frame-to-frame changes feel organic.
smoothedAmp += (target - smoothedAmp) * 0.25f
private fun drawSpectrumBars(
canvas: Canvas, cx: Float, cy: Float, radius: Float,
s: State.Speaking, elapsed: Long
) {
val nBands = SPECTRUM_BANDS
val timeIdxF = elapsed.toFloat() * s.spectrogram.size / s.durationMs
val timeIdx = timeIdxF.toInt().coerceIn(0, s.spectrogram.size - 1)
val timeFrac = (timeIdxF - timeIdx).coerceIn(0f, 1f)
// --- Halo (radial gradient, grows with amplitude) ---
val haloR = maxR * (0.85f + 0.35f * smoothedAmp)
val haloAlpha = (80 + 100 * smoothedAmp).toInt().coerceIn(0, 200)
// Smoothly interpolate between adjacent spectrogram columns,
// and exponentially smooth toward the target to keep bars
// fluid even at 60 fps with 20 fps spectrogram data.
for (b in 0 until nBands) {
val a = s.spectrogram[timeIdx][b]
val c = s.spectrogram[min(timeIdx + 1, s.spectrogram.size - 1)][b]
val target = lerp(a, c, timeFrac)
smoothedBars[b] += (target - smoothedBars[b]) * 0.35f
}
// Bars fill the bottom ~75% of the sphere diameter. Each bar is
// a rounded rectangle rising from a horizontal baseline at
// ~60% of the sphere height (slightly below center — feels more
// natural like a real EQ).
val spanW = radius * 1.55f
val gap = spanW / nBands * 0.25f
val barW = (spanW - gap * (nBands - 1)) / nBands
val leftX = cx - spanW / 2f
val baseline = cy + radius * 0.60f
val maxBarH = radius * 1.20f
val cornerR = barW * 0.45f
for (b in 0 until nBands) {
val v = smoothedBars[b].coerceIn(0f, 1f)
// Mirror the bands around the center so low bass is in the
// middle, highs on the edges — visually centred.
val displayIdx = if (b % 2 == 0) nBands / 2 + b / 2 else nBands / 2 - 1 - b / 2
val x = leftX + displayIdx * (barW + gap)
val barH = maxBarH * v
// Color gradient: brighter toward the top.
barPaint.color = withAlpha(brighten(currentAccent, 0.3f + 0.4f * v),
(180 + 70 * v).toInt().coerceIn(0, 255))
canvas.drawRoundRect(
x, baseline - barH,
x + barW, baseline,
cornerR, cornerR, barPaint
)
}
// Soft horizontal baseline (thin line) so silent bars still
// hint at a spectrometer rather than an empty circle.
barPaint.color = withAlpha(currentAccent, 90)
canvas.drawRect(leftX, baseline - 1.2f, leftX + spanW, baseline + 1.2f, barPaint)
}
// ---------- Helpers: halo / ripples / blob ----------
private fun drawHalo(
canvas: Canvas, cx: Float, cy: Float, r: Float,
alphaBase: Int, alphaGain: Int
) {
val a = (alphaBase + alphaGain).coerceIn(0, 255)
haloPaint.shader = RadialGradient(
cx, cy, haloR,
intArrayOf(
Color.argb(haloAlpha, Color.red(haloColor), Color.green(haloColor), Color.blue(haloColor)),
Color.argb(0, Color.red(haloColor), Color.green(haloColor), Color.blue(haloColor))
),
cx, cy, r,
intArrayOf(withAlpha(currentHalo, a), withAlpha(currentHalo, 0)),
floatArrayOf(0f, 1f),
Shader.TileMode.CLAMP
)
canvas.drawCircle(cx, cy, haloR, haloPaint)
canvas.drawCircle(cx, cy, r, haloPaint)
}
// --- Ripples ---
if (ripples.isNotEmpty()) {
private fun drawCore(canvas: Canvas, cx: Float, cy: Float, radius: Float, shimmer: Float) {
corePaint.shader = RadialGradient(
cx - radius * 0.2f, cy - radius * 0.3f, radius * 1.15f,
currentCore, deriveCoreEdge(currentCore),
Shader.TileMode.CLAMP
)
canvas.drawCircle(cx, cy, radius, corePaint)
}
private fun drawRipples(
canvas: Canvas, cx: Float, cy: Float, maxR: Float, now: Long, listeningMode: Boolean
) {
if (ripples.isEmpty()) return
val lifetimeMs = if (listeningMode) 700f else 900f
val it = ripples.iterator()
while (it.hasNext()) {
val r = it.next()
val age = (now - r.bornAtMs) / 900f // 900 ms lifetime
val age = (now - r.bornAtMs) / lifetimeMs
if (age >= 1f) { it.remove(); continue }
val radius = maxR * (0.55f + 0.6f * age)
val alpha = ((1f - age) * 140f * r.peak).toInt().coerceIn(0, 200)
ripplePaint.color = Color.argb(
alpha,
Color.red(rippleColor),
Color.green(rippleColor),
Color.blue(rippleColor)
)
ripplePaint.strokeWidth = max(1.5f, (1f - age) * 5f)
val radius = maxR * (0.58f + 0.62f * age)
val alpha = ((1f - age) * 150f * r.peak).toInt().coerceIn(0, 200)
ripplePaint.color = withAlpha(currentAccent, alpha)
ripplePaint.strokeWidth = max(1.2f, (1f - age) * 4f)
canvas.drawCircle(cx, cy, radius, ripplePaint)
}
}
// --- Core orb ---
val coreR = maxR * (0.45f + 0.25f * smoothedAmp)
corePaint.shader = RadialGradient(
cx, cy, coreR,
intArrayOf(
Color.argb(255, Color.red(coreColor), Color.green(coreColor), Color.blue(coreColor)),
Color.argb(180, Color.red(haloColor), Color.green(haloColor), Color.blue(haloColor))
),
floatArrayOf(0f, 1f),
Shader.TileMode.CLAMP
)
canvas.drawCircle(cx, cy, coreR, corePaint)
/**
* Build an organic blob path by displacing a circle with a sum of
* low-frequency sine modes. Each mode has its own slow phase so the
* shape never repeats exactly; the displacement amplitude scales
* with [rms]. 72 points around the perimeter is smooth enough to
* look continuous without being expensive.
*/
private fun buildBlobPath(
path: Path, cx: Float, cy: Float, radius: Float,
rms: Float, phaseSeed: Float, now: Long
) {
path.rewind()
val steps = 72
val tSec = now / 1000f
val amp = radius * (0.02f + 0.08f * rms)
for (i in 0..steps) {
val theta = (i % steps).toFloat() / steps * 2f * PI.toFloat()
var d = 0f
for (m in 1..BLOB_MODES) {
val phase = phaseSeed * 6.28f + tSec * (0.3f + 0.05f * m)
d += (amp / m) * sin((m * theta + phase).toDouble()).toFloat()
}
val r = radius + d
val x = cx + r * cos(theta.toDouble()).toFloat()
val y = cy + r * sin(theta.toDouble()).toFloat()
if (i == 0) path.moveTo(x, y) else path.lineTo(x, y)
}
path.close()
}
// ---------- Color helpers ----------
private fun deriveHalo(core: Int): Int = darken(core, 0.18f)
private fun deriveAccent(core: Int): Int = brighten(core, 0.12f)
private fun deriveCoreEdge(core: Int): Int = darken(core, 0.12f)
private fun brighten(c: Int, frac: Float): Int {
val r = (Color.red(c) + (255 - Color.red(c)) * frac).toInt().coerceIn(0, 255)
val g = (Color.green(c) + (255 - Color.green(c)) * frac).toInt().coerceIn(0, 255)
val b = (Color.blue(c) + (255 - Color.blue(c)) * frac).toInt().coerceIn(0, 255)
return Color.argb(Color.alpha(c), r, g, b)
}
private fun darken(c: Int, frac: Float): Int {
val r = (Color.red(c) * (1 - frac)).toInt().coerceIn(0, 255)
val g = (Color.green(c) * (1 - frac)).toInt().coerceIn(0, 255)
val b = (Color.blue(c) * (1 - frac)).toInt().coerceIn(0, 255)
return Color.argb(Color.alpha(c), r, g, b)
}
private fun withAlpha(c: Int, alpha: Int): Int {
return Color.argb(alpha.coerceIn(0, 255), Color.red(c), Color.green(c), Color.blue(c))
}
private fun lerp(a: Float, b: Float, t: Float): Float = a + (b - a) * t
private fun lerpColor(from: Int, to: Int, t: Float): Int {
val r = lerp(Color.red(from).toFloat(), Color.red(to).toFloat(), t).toInt().coerceIn(0, 255)
val g = lerp(Color.green(from).toFloat(), Color.green(to).toFloat(), t).toInt().coerceIn(0, 255)
val b = lerp(Color.blue(from).toFloat(), Color.blue(to).toFloat(), t).toInt().coerceIn(0, 255)
return Color.argb(255, r, g, b)
}
private class Ripple(val bornAtMs: Long, val peak: Float)
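
The band-to-slot mapping in the bar loop can be checked in isolation. The sketch below lifts the `displayIdx` expression into a standalone function (`mirroredSlot` is a hypothetical name, not part of the view) and confirms that, for the 12 bands used here, the lowest bands land in the centre slots and every slot is used exactly once.

```kotlin
// Hypothetical helper: same expression as displayIdx in the bar loop.
// Even bands fan out to the right of centre, odd bands to the left.
fun mirroredSlot(band: Int, nBands: Int): Int =
    if (band % 2 == 0) nBands / 2 + band / 2 else nBands / 2 - 1 - band / 2

fun main() {
    val nBands = 12
    val slots = (0 until nBands).map { mirroredSlot(it, nBands) }
    println(slots) // prints [6, 5, 7, 4, 8, 3, 9, 2, 10, 1, 11, 0]

    // The mapping is a permutation of 0..11, so no two bars overlap.
    check(slots.sorted() == (0 until nBands).toList())
}
```

Band 0 (lowest frequency) sits just right of centre and band 11 (highest) ends up at the far left edge, which matches the "bass in the middle, highs on the edges" intent of the comment.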


@@ -187,6 +187,21 @@ class ChatActivity : AppCompatActivity() {
"Amir", "Didier", "Sid", "Zelda"
)
/** One color per speaker; the derived palette (core + halo + bars)
* is generated inside AudioVisualizerView. Chosen to be calm,
* perceptually distinct, and consistent in saturation so switching
* voices changes *hue* rather than *mood*. */
private val voiceColors = listOf(
0xFFBCA4E8.toInt(), // Damien — lavender
0xFFE8A4CC.toInt(), // Elodie — rose
0xFF82D5D0.toInt(), // Jerome — aqua
0xFFE8BFA4.toInt(), // Richard — amber sand
0xFF95D5A6.toInt(), // Amir — emerald
0xFF8FA2D4.toInt(), // Didier — indigo
0xFFE8B89A.toInt(), // Sid — peach
0xFFA4BEE8.toInt() // Zelda — periwinkle
)
private fun setupResourceMonitoring() {
val graphCpu = findViewById<MiniGraphView>(R.id.graphCpu)
val graphGpu = findViewById<MiniGraphView>(R.id.graphGpu)
@@ -254,6 +269,12 @@ class ChatActivity : AppCompatActivity() {
override fun onItemSelected(parent: AdapterView<*>?, view: android.view.View?, pos: Int, id: Long) {
val voicePath = "${com.kazeia.KazeiaApplication.MODELS_DIR}/../voix/${voiceFiles[pos]}"
kazeiaService?.setVoice(voicePath)
// Push the matching color to the service so the orb
// view picks it up; the view tweens from the previous
// color so voice changes don't snap visually.
val color = voiceColors[pos.coerceIn(voiceColors.indices)]
kazeiaService?.setVoiceColor(color)
binding.audioViz.setVoiceColor(color)
appendLog("Voix: ${voiceNames[pos]}")
}
override fun onNothingSelected(parent: AdapterView<*>?) {}
@@ -346,13 +367,23 @@ class ChatActivity : AppCompatActivity() {
}
is com.kazeia.service.KazeiaService.VisualizerSignal.Speaking -> {
if (sig.rmsEnvelope !== lastSpeakingEnv) {
binding.audioViz.startSpeaking(sig.rmsEnvelope, sig.durationMs)
binding.audioViz.startSpeaking(
sig.rmsEnvelope, sig.spectrogram, sig.durationMs
)
lastSpeakingEnv = sig.rmsEnvelope
}
}
}
}
}
launch {
// Keep the view's voice color synchronised with the
// service — covers the initial state when the view
// attaches before the spinner's first callback fires.
service.voiceColor.collect { color ->
binding.audioViz.setVoiceColor(color)
}
}
}
}
}
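
The tween mentioned in the comments above relies on `lerpColor`. A minimal sketch of that interpolation without the Android `Color` class follows. The channel helpers and the 8-frame linear schedule are assumptions for illustration only, since the view's actual tween timing is not shown in this diff.

```kotlin
// Plain-Kotlin stand-ins for android.graphics.Color.red/green/blue/argb
// (assumed equivalents, operating on packed ARGB ints).
fun red(c: Int): Int = (c shr 16) and 0xFF
fun green(c: Int): Int = (c shr 8) and 0xFF
fun blue(c: Int): Int = c and 0xFF
fun argb(a: Int, r: Int, g: Int, b: Int): Int =
    (a shl 24) or (r shl 16) or (g shl 8) or b

fun lerp(a: Float, b: Float, t: Float): Float = a + (b - a) * t

// Same shape as the view's lerpColor: RGB channels are blended and
// the result is always fully opaque.
fun lerpColor(from: Int, to: Int, t: Float): Int {
    val r = lerp(red(from).toFloat(), red(to).toFloat(), t).toInt().coerceIn(0, 255)
    val g = lerp(green(from).toFloat(), green(to).toFloat(), t).toInt().coerceIn(0, 255)
    val b = lerp(blue(from).toFloat(), blue(to).toFloat(), t).toInt().coerceIn(0, 255)
    return argb(255, r, g, b)
}

fun main() {
    val lavender = 0xFFBCA4E8.toInt() // Damien
    val rose = 0xFFE8A4CC.toInt()     // Elodie
    val frames = 8                    // assumed tween length, not from the source
    for (f in 0..frames) {
        val c = lerpColor(lavender, rose, f / frames.toFloat())
        println("frame $f: #%08X".format(c))
    }
}
```

Because the result is forced to alpha 255, any transparency has to be applied afterwards with `withAlpha`, which is how the halo and ripple paints use the derived colors.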


@@ -100,17 +100,22 @@
</LinearLayout>
<!-- Audio-reactive orb visualizer: Kazeia's visual presence.
Shows a breathing baseline at Idle, grows with mic RMS while
Listening, and reacts to the TTS envelope while Speaking. -->
<!-- Central orb visualizer: Kazeia's visual "face". Takes about
the top 60% of the chat area (vertical weight 3 vs. 2 for the
message list) so it reads as the primary UI element; the
message list sits below it and shows the word-by-word reveal
of the current reply. Color is driven by the selected voice
(Damien=lavender, Elodie=rose, …). -->
<com.kazeia.ui.AudioVisualizerView
android:id="@+id/audioViz"
android:layout_width="0dp"
android:layout_height="140dp"
android:layout_height="0dp"
android:background="@color/kazeia_background"
app:layout_constraintTop_toBottomOf="@id/voiceBar"
app:layout_constraintBottom_toTopOf="@id/rvMessages"
app:layout_constraintStart_toStartOf="parent"
app:layout_constraintEnd_toEndOf="parent" />
app:layout_constraintEnd_toEndOf="parent"
app:layout_constraintVertical_chainStyle="spread"
app:layout_constraintVertical_weight="3" />
<!-- Chat messages -->
<androidx.recyclerview.widget.RecyclerView
@@ -122,7 +127,8 @@
app:layout_constraintTop_toBottomOf="@id/audioViz"
app:layout_constraintBottom_toTopOf="@id/inputBar"
app:layout_constraintStart_toStartOf="parent"
app:layout_constraintEnd_toEndOf="parent" />
app:layout_constraintEnd_toEndOf="parent"
app:layout_constraintVertical_weight="2" />
<!-- Input bar -->
<LinearLayout