Resemble AI has open-sourced its voice generation model, DramaBox, on Hugging Face. DramaBox is billed as the first voice engine designed for director-level control: users can write stage directions, such as a sigh or a whisper, alongside the dialogue. The aim is to move AI-generated speech from flat, robotic output toward emotionally expressive performances, reducing reliance on human voice actors and extensive post-production. DramaBox supports zero-shot voice cloning, needing only 10 seconds of reference audio to mimic a target voice. Users can also set a character's age, accent, and emotion through natural language prompts, and the model outputs studio-quality 48 kHz stereo audio. To deter misuse, all generated audio carries an imperceptible watermark designed to survive compression and editing. The model is built on Lightricks’ LTX-2.3 audio foundation and combines a Diffusion Transformer with Gemma 3 12B for text processing.