Linux CUDA first, WSL2 and cloud GPU as fallback routes

How to Install ZONOS2 Locally

Start with the official Linux x86_64 and NVIDIA CUDA path, then use the command generator to choose the least painful setup for your machine.

8B total parameters
900M active parameters
6M+ training audio hours
Tier 1: English, Mandarin, Japanese

Command generator

Choose Your ZONOS2 Install Route

Pick Linux CUDA for the cleanest route. Use WSL2 only if you already know how to debug NVIDIA passthrough. Use a cloud GPU when getting results quickly matters more than running locally, for example for multilingual ZONOS2 TTS, Japanese voice cloning, or Mandarin Chinese speech tests.

git clone https://github.com/Zyphra/ZONOS2.git
cd ZONOS2
uv sync
uv run python -m minisgl --model-path Zyphra/ZONOS2 --tts-default-voices-dir ./default_voices/


System requirements

Can My GPU Run ZONOS2?

Local inference is aimed at Linux x86_64 with NVIDIA CUDA. Use this quick checker to choose local, WSL2, or cloud GPU.
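
The route decision described above can be sketched in a few lines of Python. This is an illustration only: the function name, its parameters, and the route labels are hypothetical, and min_vram_gb is deliberately left for you to supply because this page does not state a VRAM requirement.

```python
from typing import Optional


def choose_route(
    os_name: str,
    arch: str,
    has_nvidia_gpu: bool,
    vram_gb: Optional[float],
    min_vram_gb: float,
) -> str:
    """Return "local", "wsl2", or "cloud" for a machine.

    min_vram_gb is a caller-supplied assumption; the page gives no number.
    """
    # Without a usable NVIDIA GPU there is no local CUDA route at all.
    if not has_nvidia_gpu or vram_gb is None or vram_gb < min_vram_gb:
        return "cloud"

    os_key = os_name.strip().lower()
    arch_key = arch.strip().lower()

    # The official path: Linux x86_64 with NVIDIA CUDA.
    if os_key == "linux" and arch_key in ("x86_64", "amd64"):
        return "local"

    # Windows can reach the Linux path through WSL2 (advanced route).
    if os_key == "windows":
        return "wsl2"

    # Anything else (macOS, Linux on ARM, ...) falls back to cloud GPU.
    return "cloud"
```

On a real machine, platform.system() and platform.machine() from the standard library can supply the first two arguments.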


Copyable setup

How to Install ZONOS2 Locally

The shortest official path is Linux plus NVIDIA CUDA. Windows users should consider WSL2 only if they are comfortable debugging GPU passthrough.

Linux CUDA

git clone https://github.com/Zyphra/ZONOS2.git
cd ZONOS2
uv sync
uv run python -m minisgl --model-path Zyphra/ZONOS2 --tts-default-voices-dir ./default_voices/

Generate Speech

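# Request speech from the local server and save the raw PCM stream.
curl -X POST http://localhost:1919/tts/generate \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello from ZONOS2","stream":true}' \
  --output output.pcm
# Wrap the raw PCM (32-bit float little-endian, 44.1 kHz, mono) in a WAV container.
ffmpeg -f f32le -ar 44100 -ac 1 -i output.pcm output.wav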
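
If you would rather script these two steps, here is a rough Python sketch that uses only the standard library. It mirrors the curl payload and the ffmpeg flags shown above (32-bit float little-endian, 44100 Hz, mono) and makes no other assumption about the server. The function names are hypothetical, and the output is 16-bit WAV, which is a choice made here for broad player support rather than anything the page specifies.

```python
import io
import json
import struct
import urllib.request
import wave


def build_tts_request(
    text: str, base_url: str = "http://localhost:1919"
) -> urllib.request.Request:
    """Build the same POST request as the curl example (not sent here)."""
    body = json.dumps({"text": text, "stream": True}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/tts/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def f32le_to_wav_bytes(pcm: bytes, sample_rate: int = 44100) -> bytes:
    """Convert mono float32 little-endian PCM into 16-bit WAV bytes."""
    count = len(pcm) // 4  # 4 bytes per float32 sample; drop any partial tail
    samples = struct.unpack(f"<{count}f", pcm[: count * 4])
    # Clip to [-1.0, 1.0], then scale to the signed 16-bit range.
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono, as in "-ac 1"
        wav.setsampwidth(2)   # 16-bit output
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{count}h", *ints))
    return buffer.getvalue()
```

With the server running, urllib.request.urlopen(build_tts_request("Hello from ZONOS2")).read() should return the PCM bytes, assuming the endpoint behaves as the curl example suggests. Pass those bytes to f32le_to_wav_bytes and write the result to output.wav.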

WSL2 Route

wsl --install
# Install an NVIDIA driver with WSL CUDA support on Windows.
# Inside Ubuntu on WSL2:
nvidia-smi
uv --version
# Then follow the Linux CUDA commands.
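
Before moving on to the Linux CUDA commands, it can help to confirm that both tools are actually on PATH inside WSL2. A minimal sketch, assuming Python is available in the Ubuntu environment; the function name is hypothetical, and it only checks that the commands exist, not that the GPU is visible.

```python
import shutil


def missing_tools(tools=("nvidia-smi", "uv")) -> list[str]:
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]
```

An empty list means both commands were found. You still need to run nvidia-smi yourself to confirm that the driver passthrough works.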

Multilingual ZONOS2 workflow

ZONOS2 Multilingual TTS for English, Japanese, and Mandarin Chinese

This page covers ZONOS2 multilingual TTS and voice cloning for English narration, Japanese, and Mandarin Chinese. Keep tests short, compare language output side by side, and verify consent before using any cloned voice.

EN

English ZONOS2 narration

Use ZONOS2 TTS for product explainers, YouTube voiceovers, podcasts, API demos, and developer documentation where clear English pacing matters.

JA

Japanese ZONOS2 voice cloning

Use ZONOS2 Japanese TTS for anime-style dialogue tests, game character lines, VTuber scripts, localization drafts, and language-learning examples.

ZH

Mandarin Chinese ZONOS2 TTS

Use ZONOS2 Mandarin Chinese speech for bilingual demos, creator narration, app onboarding, education content, and Chinese voice cloning experiments.

API

Multilingual API planning

Store language, reference voice, prompt text, consent status, and output settings together so every ZONOS2 multilingual generation is traceable.
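
One way to keep those fields together is a single record per generation, written as one JSON line. The sketch below is illustrative: the class name, field names, and example values are hypothetical and are not part of any ZONOS2 API.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class GenerationRecord:
    """Everything needed to trace one multilingual generation."""

    language: str            # e.g. "en", "ja", "zh"
    reference_voice: str     # path or ID of the reference audio
    prompt_text: str         # the text that was synthesized
    consent_verified: bool   # whether consent for the voice was confirmed
    output_settings: dict = field(default_factory=dict)


def to_json_line(record: GenerationRecord) -> str:
    """Serialize a record as one JSON line, keeping non-ASCII text readable."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)
```

Appending each line to a log file gives you a simple audit trail that can be filtered by language or consent status later.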

FAQ

ZONOS2 FAQ

What platform should I use first?

Use Linux x86_64 with an NVIDIA GPU and a matching CUDA toolkit when possible.

Can Windows users run it?

Windows users should treat WSL2 as an advanced route. A cloud GPU is often faster to debug.

What port does the local server use?

The documented local server starts on localhost port 1919 by default.
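
To check whether anything is listening on that port before sending requests, a small TCP probe is usually enough. This is a sketch with a hypothetical function name. A successful connection only shows that some process accepts connections on the port, not that it is the ZONOS2 server.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 1919, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```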