kazeia

Kazeia Team db281002d9 scripts: export per-voice prefix/suffix embeddings

New tool + generated artefacts so the on-device voice spinner can now hot-swap between all 8 voices. Previously only Damien's prefix/suffix were present in the model dir, and the tablet fell back to him regardless of selection.

scripts/export_voice_prefix_suffix.py runs Qwen3TTS's voice-clone path under a forward hook, captures the first prefill call's 1024-dim talker input embeddings, aborts the rest of the (very slow on CPU) decode via a sentinel exception, and slices out the first 9 vectors as <name>_voice_prefix.bin and the last 2 as <name>_voice_suffix.bin.

Validated against the shipped damien_voice_prefix.bin: using damien_15s_24k.wav as the reference audio, max|diff| = 0, so the extraction matches the original tooling bit-for-bit.

Generated and adb-pushed to /data/local/tmp/kazeia/models/qwen3-tts-npu/:
  amir / didier / elodie / jerome / richard / sid / zelda
  (+ re-generated damien from the canonical 15s_24k reference)

Qwen3TtsEngine.setVoice (already wired) reads <voice>_voice_prefix.bin / <voice>_voice_suffix.bin by basename, so voice changes now take effect from the next synthesized segment with no app restart.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-15 00:09:23 +02:00
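
The capture-and-abort flow described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the contents of export_voice_prefix_suffix.py: FakeTalker, StopAfterPrefill, capture_prefill, export_voice and the other helper names are hypothetical stand-ins for the real Qwen3TTS model and its forward hook, and the flat float32 layout of the .bin files is an assumption (the commit does not state the dtype). Only the 1024-dim width, the 9/2 split and the file naming come from the message above.

```python
import array
import os
import tempfile

DIM = 1024        # embedding width stated in the commit message
PREFIX_LEN = 9    # first 9 vectors become the prefix
SUFFIX_LEN = 2    # last 2 vectors become the suffix


class StopAfterPrefill(Exception):
    """Sentinel raised by the hook to abort the remaining (slow) decode."""


class FakeTalker:
    """Hypothetical stand-in for the talker; runs hooks before each forward."""

    def __init__(self):
        self.hooks = []
        self.calls = 0

    def forward(self, embeds):
        for hook in self.hooks:
            hook(embeds)
        self.calls += 1  # only reached when no hook aborted the call


def capture_prefill(talker, run):
    """Call run() and return the embeddings passed to the first talker call."""
    captured = []

    def hook(embeds):
        captured.append([list(v) for v in embeds])
        raise StopAfterPrefill  # nothing after the prefill is needed

    talker.hooks.append(hook)
    try:
        run()
    except StopAfterPrefill:
        pass
    finally:
        talker.hooks.remove(hook)
    if not captured:
        raise RuntimeError("talker was never called")
    return captured[0]


def write_bin(path, vectors):
    """Write vectors as flat float32 values (assumed on-disk layout)."""
    flat = array.array("f", (x for v in vectors for x in v))
    with open(path, "wb") as f:
        flat.tofile(f)


def read_bin(path):
    """Read a flat float32 file back into an array."""
    flat = array.array("f")
    with open(path, "rb") as f:
        flat.frombytes(f.read())
    return flat


def max_abs_diff(a, b):
    """Largest element-wise |a - b|; 0.0 means a bit-for-bit match."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def export_voice(name, embeds, out_dir):
    """Slice the captured prefill and write <name>_voice_{prefix,suffix}.bin."""
    prefix_path = os.path.join(out_dir, f"{name}_voice_prefix.bin")
    suffix_path = os.path.join(out_dir, f"{name}_voice_suffix.bin")
    write_bin(prefix_path, embeds[:PREFIX_LEN])
    write_bin(suffix_path, embeds[-SUFFIX_LEN:])
    return prefix_path, suffix_path


# Demo with synthetic data: a 20-vector prefill followed by decode steps.
talker = FakeTalker()
sequence = [[float(i)] * DIM for i in range(20)]


def fake_generate():
    talker.forward(sequence)           # prefill call, captured by the hook
    for _ in range(100):               # decode steps that should never run
        talker.forward(sequence[-1:])


embeds = capture_prefill(talker, fake_generate)
out_dir = tempfile.mkdtemp()
prefix_path, suffix_path = export_voice("demo", embeds, out_dir)
```

Raising from inside the hook keeps the model code untouched: the prefill inputs are copied out before any decode step runs, and the finally clause removes the hook even if the run fails for another reason. A check in the spirit of the damien validation would compare read_bin() of a freshly written file against a shipped one with max_abs_diff().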
cp_et_runner.cpp	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
export_cp_pte.py	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
export_talker_pte.py	Restore KV=100 + fix as-is embeds + multi-segment support	2026-04-09 22:26:20 +02:00
export_tts_text_embeddings.py	TTS: conditional tail-trim + export script accepts voice path arg	2026-04-13 11:32:33 +02:00
export_voice_prefix_suffix.py	scripts: export per-voice prefix/suffix embeddings	2026-04-15 00:09:23 +02:00
prepare_tts_embeds.py	Add prepare_tts_embeds.py for any text + codec_sum fix	2026-04-09 14:05:42 +02:00
prepare_tts_native.py	TTS tremor investigation: identify cross-arch numerical floor, gate diag flags	2026-04-13 00:15:14 +02:00
prepare_tts_segments.py	TTS Stage 1 streaming: play each segment the moment it's decoded	2026-04-13 08:43:30 +02:00
prepare_tts_voiceclone.py	TTS tremor investigation: identify cross-arch numerical floor, gate diag flags	2026-04-13 00:15:14 +02:00
qc_schema_serialize_patched.py	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00
test_cp_et_quality.py	Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch	2026-04-09 08:42:11 +02:00