🐴

Happy Horse 1.0

Open-Source AI Video Generation Model

A 15-billion-parameter unified Transformer that jointly produces video and synchronized audio from text or image prompts, with cinematic 1080p quality and seven-language lip-sync.

15B
Parameters
40
Transformer Layers
38s
Generation Speed
7
Lip-Sync Languages

Core Capabilities

Unified Transformer

40-layer self-attention network with 4 modality-specific layers.

Joint Video + Audio

Generates synchronized dialogue and ambient sound.

8-Step Distillation

Reduces denoising to just 8 steps.

Multilingual Lip-Sync

Including English, Mandarin, Japanese, Korean, German, and French.

1080p Output

5-8 second clips at 1080p.

Open & Self-Hostable

Commercial-use permission included.
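The 8-step distillation above replaces a long denoising schedule with a handful of large steps. A minimal sketch of such a few-step sampler, using a generic Euler step over a linear noise schedule (the denoiser, schedule, and latent shape are illustrative placeholders, not Happy Horse's actual API):

```python
NUM_STEPS = 8
# Linear noise-level (sigma) schedule from 1.0 down to 0.0.
sigmas = [1.0 - i / NUM_STEPS for i in range(NUM_STEPS + 1)]

def sample(denoise, x):
    """Few-step sampler: at each step the model predicts the clean latent,
    and x is moved toward that prediction in proportion to the drop in
    noise level (a plain Euler step on the probability-flow trajectory)."""
    for hi, lo in zip(sigmas, sigmas[1:]):
        x0_pred = denoise(x, hi)                 # model's clean-latent estimate
        x = x + (lo - hi) / hi * (x - x0_pred)   # Euler step toward the estimate
    return x
```

With only 8 steps, the distilled model must make each prediction accurate enough that these large jumps land near the data manifold, which is what the distillation training optimizes for.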

Benchmarks & Performance

Ranked #1 globally on the Artificial Analysis Video Arena with Elo 1333.

Model            Visual  Alignment  Physical  WER (%)
Happy Horse 1.0  4.80    4.18       4.52      14.60
OVI 1.1          4.73    4.10       4.41      40.45
LTX 2.3          4.76    4.12       4.56      19.23

Win rate: 80.0% vs OVI 1.1 · 60.9% vs LTX 2.3
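Arena Elo ratings map to expected head-to-head win rates through the standard Elo logistic formula. A quick sketch (the competitors' ratings are not given here, so the gap is inferred from the win rate, not taken from the leaderboard):

```python
import math

def elo_expected_score(r_a, r_b):
    """Expected win probability of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_gap_for_win_rate(p):
    """Rating gap that yields win probability p for the higher-rated player."""
    return 400.0 * math.log10(p / (1.0 - p))

# An 80.0% head-to-head win rate corresponds to a gap of roughly 241 Elo points.
gap = elo_gap_for_win_rate(0.80)
```

Under this model, the reported 80.0% win rate against OVI 1.1 implies a rating gap of about 241 points, while the 60.9% rate against LTX 2.3 implies a much smaller gap of about 77 points.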

Deploy Happy Horse 1.0

Runs on NVIDIA H100 or A100 (≥48GB VRAM recommended).

# Clone & install
git clone https://github.com/happy-horse/happyhorse-1.git
cd happyhorse-1
pip install -r requirements.txt

# Download weights
bash download_weights.sh

# Generate
python demo_generate.py --prompt "a robot dancing on the moon" --duration 5
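For generating several clips in a row, the demo script can be driven from Python. This wrapper assumes only the --prompt and --duration flags shown in the quickstart above; any other flags the script accepts are not covered here:

```python
import subprocess

def generate_cmd(prompt, duration=5):
    """Build the demo_generate.py invocation, using only the flags
    demonstrated in the quickstart (--prompt and --duration)."""
    return [
        "python", "demo_generate.py",
        "--prompt", prompt,
        "--duration", str(duration),
    ]

def generate_batch(prompts, duration=5):
    """Run the demo script once per prompt, stopping on the first failure."""
    for prompt in prompts:
        subprocess.run(generate_cmd(prompt, duration), check=True)
```

Example: `generate_batch(["a robot dancing on the moon", "rain on a neon-lit street"])` renders each prompt as a separate 5-second clip, one GPU job at a time.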