LEHJA - build dashboard
Live voice translation + voice clone · deploying on 2× RTX A6000 with the winning translator · Updated 2026-06-14 16:49:01 UTC · auto-refresh 20s
OVERALL DEPLOYMENT 100%
Serving box: 2× RTX A6000 (96 GB) · GPU0: 6445/49140 MiB · GPU1: 29765/49140 MiB · ollama(35B) up
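The per-GPU figures in the status line can be turned into utilization and free-memory numbers. This is a minimal sketch that assumes the `GPUn: used/total MiB` field layout shown above; the function names `parse_gpu_usage` and `utilization` are illustrative and not part of the dashboard.

```python
import re

# Status fields copied from the serving-box line above.
STATUS = "GPU0: 6445/49140 MiB · GPU1: 29765/49140 MiB"


def parse_gpu_usage(status: str) -> dict[int, tuple[int, int]]:
    """Return {gpu_index: (used_mib, total_mib)} for each 'GPUn: used/total MiB' field."""
    pattern = re.compile(r"GPU(\d+):\s*(\d+)/(\d+)\s*MiB")
    return {int(i): (int(u), int(t)) for i, u, t in pattern.findall(status)}


def utilization(used_mib: int, total_mib: int) -> float:
    """Fraction of the card's memory currently in use."""
    return used_mib / total_mib


for idx, (used, total) in sorted(parse_gpu_usage(STATUS).items()):
    print(f"GPU{idx}: {utilization(used, total):.1%} used, {total - used} MiB free")
# GPU0: 13.1% used, 42695 MiB free
# GPU1: 60.6% used, 19375 MiB free
```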
Deploying: Qwen3.6-35B-A3B (bench winner) + trained voices
1. Set up A6000 (torch, CosyVoice, whisper, ollama): 100%
stage: SETUP_A6000_DONE
2. Download Qwen3.6-35B-A3B (the MT winner, ~23GB): 100%
complete
3. Transfer trained Urdu + Arabic voice models: 100%
CosyVoice2-0.5B-ar, CosyVoice2-0.5B-ur
4. Launch pipeline service (STT GPU0 · MT GPU1): 100%
up on :8770
5. Re-point lehja.app/test + quality test: 100%
LIVE - test it
Split: whisper (STT) + CosyVoice (cloned-voice TTS) on GPU 0, Qwen3.6-35B-A3B (MT) on GPU 1, so the stages run in parallel with no VRAM contention. Each A6000 = 48GB.
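One common way to enforce a split like this is to start each service with `CUDA_VISIBLE_DEVICES` set to its card, so a process only sees the GPU it is meant to use. The sketch below assumes that approach; the dashboard does not say how the pinning is actually done, and the service names in `GPU_ASSIGNMENT` are hypothetical labels.

```python
import os

# Hypothetical service labels mapped to the GPU indices described in the split above.
GPU_ASSIGNMENT = {
    "whisper-stt": 0,    # speech-to-text
    "cosyvoice-tts": 0,  # cloned-voice text-to-speech
    "qwen-mt": 1,        # machine translation (Qwen3.6-35B-A3B)
}


def env_for(service: str, base_env=None) -> dict:
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES pinned for `service`."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(GPU_ASSIGNMENT[service])
    return env


def services_on(gpu: int) -> list[str]:
    """List the services assigned to a given GPU index, in sorted order."""
    return sorted(name for name, idx in GPU_ASSIGNMENT.items() if idx == gpu)


print(services_on(0))  # ['cosyvoice-tts', 'whisper-stt']
print(services_on(1))  # ['qwen-mt']
```

The returned dictionary can be passed as `env=` to `subprocess.Popen` when launching each service, which keeps the assignment in one place instead of scattering device indices through the launch scripts.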
Already done
Urdu voice model: trained, validated PASS (CER 0.065)
Arabic Quranic voice v2: trained, owner-validated 95% (canonical-Quran-aligned, full tashkeel)
MT bench: Qwen3.6-35B-A3B WINNER (now deploying it) · Qwen3-14B baseline · gemma3:4b too weak
First deploy on shared 24GB 4090: worked but VRAM-capped MT quality (the reason we moved to A6000)
H100 training box: done + destroyed (models backed up on VPS + transferred here)
QTM audio re-transcription: cancelled by owner (DB untouched)
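The CER quoted for the Urdu voice model is a character error rate: the character-level edit distance between a reference text and a hypothesis, divided by the reference length. The dashboard does not say how the validation was run, so the following is only a sketch of the standard definition, with illustrative function names.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


print(cer("kitten", "sitting"))  # 0.5
```

Under this definition a CER of 0.065 corresponds to roughly 6 to 7 character errors per 100 reference characters.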
Why the A6000 (vs prior boxes)
2× A6000 = 48GB/GPU: fits the 35B winner on one card with room to spare; 650 GB/s (5× the GB10); 2 GPUs allow real 2-4 user concurrency. The shared 24GB 4090 couldn't fit good MT (FORAN took half). The H100 was faster but is gone, and cost 3× the price.
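The fit argument can be checked with rough arithmetic. This sketch treats the ~23GB download size as a lower bound on the VRAM needed for the weights (KV cache and runtime overhead come on top) and assumes the 4090 exposes about 24 GiB with half of it already taken; both figures are approximations, and the helper names are illustrative.

```python
MIB = 1024 ** 2  # bytes per MiB


def gb_to_mib(gb: float) -> float:
    """Convert decimal gigabytes (10^9 bytes) to MiB."""
    return gb * 1e9 / MIB


def headroom_mib(card_total_mib: float, weights_gb: float, other_used_mib: float = 0.0) -> float:
    """MiB left on a card after loading the weights next to any other tenant."""
    return card_total_mib - other_used_mib - gb_to_mib(weights_gb)


WEIGHTS_GB = 23  # approximate download size from step 2

# One A6000, using the 49140 MiB total reported in the status line.
a6000 = headroom_mib(49140, WEIGHTS_GB)

# Shared 4090: assumed 24 GiB total with half already in use by another workload.
rtx4090 = headroom_mib(24 * 1024, WEIGHTS_GB, other_used_mib=12 * 1024)

print(f"A6000 headroom: {a6000:.0f} MiB")
print(f"Shared 4090 headroom: {rtx4090:.0f} MiB")
```

A positive result means the weights fit with memory left over for cache and activations; a negative result means the model does not fit at that size and would need a smaller or more heavily quantized variant.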