kazeia

Kazeia Team a688edc9ec		2026-04-09 12:47:30 +02:00
Reduce talker KV_LEN 100→64: saves 148ms (RTF 1.31)

KV window of 64 sufficient for ~70 token generation (10 prefill + 58 gen).
36% less KV memcpy per talker step (28L × 2 × 64×8×128 vs 100×8×128).
Generation: 3795ms → 3647ms, total: 6438ms → 6093ms

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Module.java	Shared Module C++ pipeline: RTF 1.6 with perfect quality	2026-04-09 12:05:58 +02:00
cp_et_runner.cpp	Native C++ pipeline: RTF 1.4 (was 3.6 in Java)	2026-04-09 10:09:32 +02:00
cp_et_test_client.cpp	Native C++ pipeline: RTF 1.4 (was 3.6 in Java)	2026-04-09 10:09:32 +02:00
jni_layer_tts.cpp	Reduce talker KV_LEN 100→64: saves 148ms (RTF 1.31)	2026-04-09 12:47:30 +02:00
tts_pipeline_jni.cpp	Disable C++ pipeline (QNN non-deterministic), keep Java RTF 1.8	2026-04-09 11:42:49 +02:00
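
The figures in the latest commit message (36% less KV memcpy, 148ms saved) can be checked with a few lines of arithmetic. The sketch below is not taken from the repository: `kv_elems_per_step` and `percent_saved` are hypothetical helpers, and the reading of "28L × 2 × KV_LEN×8×128" as layers × (K and V) × window length × KV heads × head dimension is an assumption. It counts elements rather than bytes, because the commit does not state the element size.

```cpp
#include <cstdint>

// Hypothetical helper (not from the repository): elements copied per talker
// step for a KV window of `kv_len` positions. Assumes the commit's
// "28L x 2 x KV_LEN x 8 x 128" means layers x (K and V) x window x heads x head_dim.
constexpr std::int64_t kv_elems_per_step(std::int64_t layers, std::int64_t kv_len,
                                         std::int64_t kv_heads, std::int64_t head_dim) {
    return layers * 2 * kv_len * kv_heads * head_dim;
}

// Hypothetical helper: reduction in copied elements as an integer percentage
// (integer division, so fractional percentages are truncated).
constexpr std::int64_t percent_saved(std::int64_t before, std::int64_t after) {
    return (before - after) * 100 / before;
}

// Values quoted in the commit message: KV_LEN 100 -> 64.
constexpr std::int64_t kOldElems = kv_elems_per_step(28, 100, 8, 128);
constexpr std::int64_t kNewElems = kv_elems_per_step(28, 64, 8, 128);

static_assert(kOldElems == 5734400, "elements per step at KV_LEN = 100");
static_assert(kNewElems == 3670016, "elements per step at KV_LEN = 64");
static_assert(percent_saved(kOldElems, kNewElems) == 36, "matches the 36% in the commit");
static_assert(3795 - 3647 == 148, "generation time saved, in ms");
```

Since every factor other than the window length is unchanged, the saving reduces to 1 − 64/100 = 36%, independent of the layer count, head count, or element size.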