kazeia/executorch-custom
Kazeia Team 14f7e5b05f Optimize CP+talker: eliminate prepare_input_tensors per step
Cache input tensor pointers after first prepare_input_tensors call,
then memcpy directly into them for all subsequent steps.

Eliminates ~14000 mallocs per pipeline run (986 CP + 58 talker calls).
Generation: 4640ms → 4007ms (-633ms), total RTF: 1.6 → 1.51

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 12:16:38 +02:00
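
Sketch of the caching idea described in the commit message above. This is a minimal, self-contained illustration and not the code in jni_layer_tts.cpp: `MockMethod`, `prepare_inputs`, and `CachedInputs` are hypothetical stand-ins invented here, and the mock only imitates a helper that allocates fresh input buffers on each call. It assumes the input sizes stay fixed across steps, so the pointers obtained on the first step remain valid for every later step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a loaded model method (NOT the ExecuTorch API).
struct MockMethod {
    std::vector<std::size_t> input_sizes;  // byte size of each input
    std::vector<void*> bound_inputs;       // buffers currently bound as inputs
    int alloc_count = 0;                   // number of mallocs performed

    MockMethod() = default;
    MockMethod(const MockMethod&) = delete;
    MockMethod& operator=(const MockMethod&) = delete;
    ~MockMethod() { release(); }

    // Imitates a per-call "prepare inputs" helper: one malloc per input.
    void prepare_inputs() {
        release();
        for (std::size_t n : input_sizes) {
            void* p = std::malloc(n);
            assert(p != nullptr);
            bound_inputs.push_back(p);
            ++alloc_count;
        }
    }

    void release() {
        for (void* p : bound_inputs) std::free(p);
        bound_inputs.clear();
    }
};

// Prepares inputs once, caches the buffer pointers, then only copies into them.
class CachedInputs {
public:
    explicit CachedInputs(MockMethod& m) : method_(m) {}

    void set_inputs(const std::vector<const void*>& src) {
        if (ptrs_.empty()) {
            // First step: pay the allocation cost once and remember the pointers.
            method_.prepare_inputs();
            ptrs_ = method_.bound_inputs;
        }
        assert(src.size() == ptrs_.size());
        // Every step: overwrite the existing buffers in place, no allocation.
        for (std::size_t i = 0; i < ptrs_.size(); ++i) {
            std::memcpy(ptrs_[i], src[i], method_.input_sizes[i]);
        }
    }

private:
    MockMethod& method_;
    std::vector<void*> ptrs_;  // cached input buffer pointers
};
```

In this sketch, calling `set_inputs` in a loop performs the mallocs only on the first iteration. The trade-off is that the cached pointers must be discarded if the method is reloaded or its input shapes change.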
File                    Last commit                                                         Date
Module.java             Shared Module C++ pipeline: RTF 1.6 with perfect quality            2026-04-09 12:05:58 +02:00
cp_et_runner.cpp        Native C++ pipeline: RTF 1.4 (was 3.6 in Java)                      2026-04-09 10:09:32 +02:00
cp_et_test_client.cpp   Native C++ pipeline: RTF 1.4 (was 3.6 in Java)                      2026-04-09 10:09:32 +02:00
jni_layer_tts.cpp       Optimize CP+talker: eliminate prepare_input_tensors per step        2026-04-09 12:16:38 +02:00
tts_pipeline_jni.cpp    Disable C++ pipeline (QNN non-deterministic), keep Java RTF 1.8     2026-04-09 11:42:49 +02:00