EmbodiedCity/Worldscape-MoE.code

Worldscape-MoE

A Unified Mixture-of-Experts Architecture for Scalable Multi-Control Video Generation World Modeling

Jianjie Fang, Yongyan Xu, Ziyou Wang, Yuchao Huang, Zhaolu Wang, Rongze Tang, Mingyuan Jia, Baining Zhao, Weichen Zhang, Xin Zhang, Haisheng Su, Yu Shang, Chen Gao, Wei Wu, Xinlei Chen, Yong Li

Project Page | Demo Video | Model

Paper, arXiv, and model weights will be released soon.

Overview

Worldscape-MoE is a unified world-model training framework for multi-control video generation. It introduces a Mixture-of-Experts design into Diffusion Transformers to learn from heterogeneous supervisory controls, including camera poses, robotic arms, and hand joints, within a single extensible world model.

By combining shared experts for cross-control world knowledge with modality-specific experts for control specialization, Worldscape-MoE aims to scale embodied and interactive world modeling beyond single-control supervision.
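The split between shared and control-specific experts can be illustrated with a toy layer. This is only a minimal sketch under stated assumptions, not the released implementation: the overview does not describe the routing rule, so the example assumes hard routing by control type, plain linear experts, and a residual sum of the two paths. The class `SharedPlusSpecificMoE`, the helper functions, and the control-type strings are all hypothetical names chosen for illustration.

```python
import random

# Control types named in the overview; the string labels are hypothetical.
CONTROLS = ("camera_pose", "robot_arm", "hand_joints")


def make_linear(dim, rng):
    """Return a dim x dim weight matrix with small random entries (a stand-in expert)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]


def apply_linear(weights, token):
    """Matrix-vector product: one expert applied to one token vector."""
    return [sum(w * x for w, x in zip(row, token)) for row in weights]


class SharedPlusSpecificMoE:
    """Toy layer: every token goes through one shared expert and exactly one
    control-specific expert selected by the sample's control type.
    Hard routing is an assumption of this sketch, not a documented design."""

    def __init__(self, dim, controls=CONTROLS, seed=0):
        rng = random.Random(seed)
        self.shared = make_linear(dim, rng)
        self.specific = {name: make_linear(dim, rng) for name in controls}

    def forward(self, tokens, control):
        if control not in self.specific:
            raise KeyError(f"unknown control type: {control}")
        expert = self.specific[control]
        out = []
        for token in tokens:
            shared_out = apply_linear(self.shared, token)
            specific_out = apply_linear(expert, token)
            # Residual sum of the input, the shared path, and the specialised path.
            out.append([x + s + c for x, s, c in zip(token, shared_out, specific_out)])
        return out


layer = SharedPlusSpecificMoE(dim=4)
tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
camera_out = layer.forward(tokens, "camera_pose")  # shared + camera expert
arm_out = layer.forward(tokens, "robot_arm")       # shared + robot-arm expert
```

In this sketch the shared expert sees tokens from every control type, while each specific expert only sees tokens from its own control type, which is one simple way to realise the shared-versus-specialised split described above. A real Diffusion Transformer block would use learned feed-forward experts and may route differently.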

Demo

Worldscape-MoE demo preview

Watch the full demo on YouTube

Citation

If you find this project useful, please consider citing:

@misc{fang2026worldscapemoe,
  title        = {Worldscape-MoE: A Unified Mixture-of-Experts Architecture for Scalable Multi-Control Video Generation World Modeling},
  author       = {Fang, Jianjie and Xu, Yongyan and Wang, Ziyou and Huang, Yuchao and Wang, Zhaolu and Tang, Rongze and Jia, Mingyuan and Zhao, Baining and Zhang, Weichen and Zhang, Xin and Su, Haisheng and Shang, Yu and Gao, Chen and Wu, Wei and Chen, Xinlei and Li, Yong},
  year         = {2026},
  note         = {Project page: https://embodiedcity.github.io/Worldscape-MoE/}
}

About

The first MoE-based framework for unified and scalable world modeling.
