WorldComposer

From Seeing to Simulating: Generative High-Fidelity Simulation with Digital Cousins for Generalizable Robot Learning and Evaluation

Jasper Lu1*, Zhenhao Shen1*, Yuanfei Wang1*, Shugao Liu1, Shengqiang Xu1, Shawn Xie1, Kyle Xu1, Feng Jiang1, Jade Yang1, Chen Xie2, Ruihai Wu1
1Peking University    2Lightwheel
*Equal contribution. †Corresponding author.
Under Review

Video Presentation

Overview

WorldComposer Teaser

WorldComposer: A novel approach for generative high-fidelity simulation with digital cousins for generalizable robot learning and evaluation.

Abstract

Learning robust robot policies in real-world environments requires diverse data augmentation, yet scaling real-world data collection is costly because it requires acquiring physical assets and reconfiguring environments. Bringing real-world scenes into simulation has therefore become a practical form of augmentation for efficient learning and evaluation. We present a generative framework that establishes a real-to-sim mapping from real-world panoramas to high-fidelity simulation scenes and further synthesizes diverse cousin scenes via semantic and geometric editing. Combined with high-quality physics engines and realistic assets, the generated scenes support interactive manipulation tasks. Additionally, we incorporate multi-room stitching to construct consistent large-scale environments for long-horizon navigation across complex layouts. Experiments demonstrate a strong sim-to-real correlation, validating our platform's fidelity, and show that extensively scaling up data generation leads to significantly better generalization to unseen scene and object variations, demonstrating the effectiveness of Digital Cousins for generalizable robot learning and evaluation.
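As a rough illustration of what a sim-to-real correlation check can look like, the sketch below computes a Pearson correlation coefficient between per-policy success rates measured in simulation and in the real world. This page does not state which correlation metric the paper uses, so the choice of Pearson's r is an assumption, and the function name `pearson` and all numbers are hypothetical placeholders, not results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder success rates, one entry per policy checkpoint.
# These are NOT numbers from the paper.
sim_success = [0.2, 0.4, 0.6, 0.8]
real_success = [0.1, 0.3, 0.5, 0.7]

r = pearson(sim_success, real_success)
print(f"Pearson r between sim and real success rates: {r:.3f}")
```

A value of r close to 1 would indicate that policies ranking higher in simulation also tend to rank higher in the real world, which is the property a simulation-based evaluation platform needs.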

Pipeline

WorldComposer Pipeline

Task Visualization

Simulation Task

Real-world Task

Results

Correlation Analysis

Scaling Analysis

Real-World Evaluation Results

BibTeX

@misc{lu2026seeingsimulatinggenerativehighfidelity,
      title={From Seeing to Simulating: Generative High-Fidelity Simulation with Digital Cousins for Generalizable Robot Learning and Evaluation}, 
      author={Jasper Lu and Zhenhao Shen and Yuanfei Wang and Shugao Liu and Shengqiang Xu and Shawn Xie and Jingkai Xu and Feng Jiang and Jade Yang and Chen Xie and Ruihai Wu},
      year={2026},
      eprint={2604.15805},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2604.15805}, 
}