Commit Graph

2 Commits

Author SHA1 Message Date
Kazeia Team 8bfe6c7445 Add NEON SIMD heads argmax for CP — 2.3× speedup
CP head dot products (15 × 2048×1024) optimized with ARM NEON
vfmaq_f32 (4 accumulators, 16 floats/iteration).

CP/frame: 131ms → 58ms, total pipeline: 22.7s → 14.7s (RTF 3.2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 08:55:20 +02:00
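
Note on commit 8bfe6c7445: the message describes dot products accumulated with vfmaq_f32 across 4 accumulators, consuming 16 floats per iteration, followed by an argmax over each head. The repository code is not shown here, so the following is only a minimal sketch of that pattern. The function names (dot_f32, head_argmax), the row-major weight layout, and the reading of "2048×1024" as 2048 rows of 1024 floats are all assumptions. A scalar loop handles the tail and doubles as a fallback on non-AArch64 targets.

```c
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dot product of two float vectors of length n. */
static float dot_f32(const float *a, const float *b, size_t n) {
    size_t i = 0;
    float sum = 0.0f;
#if defined(__aarch64__) && defined(__ARM_NEON)
    /* Four independent accumulators hide FMA latency; each loop
       iteration consumes 16 floats (4 vectors of 4 lanes). */
    float32x4_t acc0 = vdupq_n_f32(0.0f);
    float32x4_t acc1 = vdupq_n_f32(0.0f);
    float32x4_t acc2 = vdupq_n_f32(0.0f);
    float32x4_t acc3 = vdupq_n_f32(0.0f);
    for (; i + 16 <= n; i += 16) {
        acc0 = vfmaq_f32(acc0, vld1q_f32(a + i),      vld1q_f32(b + i));
        acc1 = vfmaq_f32(acc1, vld1q_f32(a + i + 4),  vld1q_f32(b + i + 4));
        acc2 = vfmaq_f32(acc2, vld1q_f32(a + i + 8),  vld1q_f32(b + i + 8));
        acc3 = vfmaq_f32(acc3, vld1q_f32(a + i + 12), vld1q_f32(b + i + 12));
    }
    /* Combine the accumulators, then sum the 4 lanes horizontally. */
    sum = vaddvq_f32(vaddq_f32(vaddq_f32(acc0, acc1), vaddq_f32(acc2, acc3)));
#endif
    /* Scalar tail (and full fallback when NEON is unavailable). */
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Hypothetical helper: index of the row of w (rows x cols, row-major,
   an assumed layout) with the largest dot product against h.
   Requires rows >= 1. */
static size_t head_argmax(const float *w, const float *h,
                          size_t rows, size_t cols) {
    size_t best = 0;
    float best_score = dot_f32(w, h, cols);
    for (size_t r = 1; r < rows; ++r) {
        float score = dot_f32(w + r * cols, h, cols);
        if (score > best_score) {
            best_score = score;
            best = r;
        }
    }
    return best;
}
```

Under these assumptions, one head would be evaluated as head_argmax(w, h, 2048, 1024), repeated for each of the 15 heads. Note that the NEON path sums in a different order than a plain scalar loop, so scores can differ in the last bits.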
Kazeia Team 389ffa7c61 Initial commit: Kazeia TTS pipeline on NPU via ExecuTorch
Full Qwen3-TTS-0.6B pipeline running on Snapdragon 8 Elite NPU:
  - Talker (28L) and Code Predictor (5L) as .pte on QNN HTP fp16
  - JNI integration, no root required
  - Validated audio quality: RTF 3.9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 08:42:11 +02:00