LEHJA — status

Live voice translation + voice clone · deploying on 2× RTX A6000 with the winning translator · Updated 2026-08-03 17:35:01 UTC · auto-refresh 20s

🔄 Deploying — Qwen3.6-35B-A3B (bench winner) + trained voices

🔄 1. Set up A6000 (torch, CosyVoice, whisper, ollama) · 15%

stage: installing deps

⬜ 2. Download Qwen3.6-35B-A3B (the MT winner, ~23GB) · 0%

downloading

⬜ 3. Transfer trained Urdu + Arabic voice models · 0%

transferring

⬜ 4. Launch pipeline service (STT GPU 0 · MT GPU 1) · 0%

waiting for deps + model

⬜ 5. Re-point lehja.app/test + quality test · 0%

pending

Split: whisper (STT) + CosyVoice (cloned-voice TTS) on GPU 0, Qwen3.6-35B-A3B (MT) on GPU 1 → both run in parallel with no VRAM contention. Each A6000 = 48GB.
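One common way to enforce a split like this is to restrict each process to a single card through the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below is a minimal illustration under that assumption, not the actual LEHJA launcher; the role names and the `GPU_FOR_ROLE` mapping are hypothetical and only mirror the split described above.

```python
import os

# Hypothetical role-to-GPU mapping mirroring the split above:
# STT and TTS share GPU 0, MT gets GPU 1 to itself.
GPU_FOR_ROLE = {"stt": 0, "tts": 0, "mt": 1}


def env_for_role(role, base_env=None):
    """Return a copy of the environment that exposes only the role's GPU.

    CUDA applications started with this environment see a single device,
    which they address as device 0 regardless of its physical index.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(GPU_FOR_ROLE[role])
    return env
```

A launcher could then pass the result to `subprocess.Popen(cmd, env=env_for_role("mt"))`, so the MT process cannot allocate memory on the card used by STT and TTS.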

✅ Already done

✅ Urdu voice model — trained, validated PASS (CER 0.065)
✅ Arabic Quranic voice v2 — trained, owner-validated 95% (canonical-Quran-aligned, full tashkeel)
✅ MT bench — Qwen3.6-35B-A3B WINNER (now deploying it) · Qwen3-14B baseline · gemma3:4b too weak
✅ First deploy on shared 24GB 4090 — worked but VRAM-capped MT quality (the reason we moved to A6000)
✅ H100 training box — done + destroyed (models backed up on VPS + transferred here)
⊘ QTM audio re-transcription — cancelled by owner (DB untouched)
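For reference, the CER reported for the Urdu voice is conventionally the character-level edit distance between a reference transcript and a hypothesis, divided by the reference length. The sketch below assumes that standard definition; how the validation here actually produced its transcripts, and whether any text normalization was applied first, is not stated on this page.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in code points."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a CER of 0.065 corresponds to roughly 6.5 character edits per 100 reference characters. Python strings iterate by code point, so combining marks such as Arabic tashkeel count as separate characters unless the text is normalized beforehand.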

🗺 Why the A6000 (vs prior boxes)

2× A6000 = 48GB/GPU → fits the 35B winner on one card with room to spare; 650 GB/s (5× the GB10); 2 GPUs → real 2–4 user concurrency. The shared 24GB 4090 couldn't fit good MT (FORAN took half). The H100 was faster, but it is gone and cost 3× as much.
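The fit argument can be checked with simple arithmetic. The sketch below uses the figures from this page (48GB per A6000, a ~23GB model, a 24GB 4090 with half taken by FORAN) and assumes, as a rough approximation, that the download size equals the VRAM needed for the weights; KV cache and activations need additional memory on top. The helper name `headroom_gb` is hypothetical.

```python
def headroom_gb(card_gb, model_gb, reserved_gb=0):
    """VRAM left after loading the model weights, in GB.

    A negative result means the weights alone do not fit.
    """
    return card_gb - reserved_gb - model_gb


# Dedicated A6000: the whole card is available to the MT model.
a6000_headroom = headroom_gb(48, 23)

# Shared 4090: half of the 24GB was already taken.
shared_4090_headroom = headroom_gb(24, 23, reserved_gb=12)
```

With these inputs the A6000 keeps 25GB free for cache and batching, while the shared 4090 comes out 11GB short, which matches the note that the first deploy had to run a weaker MT model.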
