Hao Wu

Research Intern @ Tencent | Department of Computer Science, University of Science and Technology of China | Tsinghua University

I am currently a Research Intern at Tencent. I graduated from the Department of Computer Science at the University of Science and Technology of China (USTC). During my master's studies, I was also a joint training student in the large model training group of the Machine Learning Platform Department at Tencent.

My research lies at the intersection of robot video and world models, multimodal large language models, and agent systems. More broadly, I am interested in building intelligent systems that can understand, predict, and reason about the physical world across modalities.

My recent and future focus includes three closely related directions: agentic reasoning, multimodal large models, and video world models, including robot video generation and embodied world models. My work has appeared in venues such as ICLR, NeurIPS, ICML, KDD, AAAI, ICCV, ACM MM, TKDE, and TPAMI, with nearly 30 CCF-A publications.

Current Research Interests

Video World Models Studying robot video generation, reward-aligned training, and long-horizon embodied prediction.

Agent Systems Building agentic systems for real-world reasoning with retrieval, planning, and tool use.

Multimodal LLMs Exploring multimodal understanding, reasoning, alignment, and safety-aware learning.

News

2026.05
As corresponding author, our paper "Evolving Multimodal Models for Physical Dynamics: A Multi-objective Neuroevolution Approach" was accepted by IEEE Transactions on Evolutionary Computation (TEVC), a CAS Tier-1 and JCR Q1 journal. Congrats to all collaborators!

2026.05
Joined Tencent IEG, Lightspeed Studios as a Qingyun Research Intern, working on video generation world models.

2026.05
As first author, two papers were accepted to the KDD 2026 Research Track. Congrats to all collaborators!

2026.05
As first author, a paper on large-scale spatio-temporal pre-training and post-training was accepted to ICML 2026. Congrats to all collaborators!

2026.04
A paper on inference-time safety alignment was accepted to ACL 2026.

2026.01
Two papers were accepted to TPAMI. Congrats to all collaborators!

2025.12
Two papers were accepted to AAAI and ICLR. Congrats to all collaborators!

2025.06
As corresponding author, one paper was accepted to ICCV. Congrats to all collaborators!

2025.05
As co-first author, one paper was accepted to ICML. Congrats to all collaborators!

2025.03
As first author, one paper was accepted to KDD. Congrats to all collaborators!

2024.12
As corresponding author, one paper was accepted to ICLR. Congrats to all collaborators!

2024.07
As first author or co-first author, three papers were accepted to NeurIPS. Congrats to all collaborators!

2024.05
As first author, one paper was accepted to KDD. Congrats to all collaborators!

2024.05
As first author, one paper was accepted to ACM MM. Congrats to all collaborators!

2024.05
As first author, one paper was accepted to ICML. Congrats to all collaborators!

2023.12
As first author, two papers were accepted to AAAI. Congrats to all collaborators!

2023.07
As co-first author, one paper was accepted to NeurIPS. Congrats to all collaborators!

Selected Awards & Honors

2025

Outstanding Graduate

USTC × Tencent Joint Training Program

2022

National Scholarship (Top 1% in China)

University of Science and Technology of China

2022

First-Class Scholarship

University of Science and Technology of China

Experience

Research Intern, Tencent IEG
May 2026 - Present

Lightspeed Studios

Researching video generation world models, with a focus on long-horizon video generation, controllable dynamics, and physically grounded simulation for interactive scenarios.

Research Intern, Tencent CSIG
Aug. 2025 - May 2026

Tencent Jarvis Lab

Continued research on multimodal foundation models, agent systems, and world models in industrial-scale settings.

Research Intern, Tencent Hunyuan
Aug. 2023 - Jul. 2025

Machine Learning Platform Department

Worked on large models, world models, and multimodal generative modeling in Tencent Hunyuan.

Online Research Intern, UCLA
May 2023 - May 2024

Advisor: Xiao Luo

Conducted remote research on multimodal learning, dynamics modeling, and related machine learning problems.

Research Intern, HKUST (Guangzhou)
Mar. 2023 - Aug. 2023

Advisors: Yuxuan Liang and Kun Wang

Worked on machine learning and multimodal modeling in the CityMind research environment.

Selected Publications
Full list on Google Scholar

RoboAlign-R1: Distilled Multimodal Reward Alignment for Robot Video World Models

Hao Wu, Yuqi Li, Yuan Gao, Fan Xu, Fan Zhang, Kun Wang, Penghao Zhao, Qiufeng Wang, Yizhou Zhao, Weiyan Wang, Yingli Tian, Xian Wu, Xiaomeng Huang. First Author

Under Review


Safety-Aware Rollouts with Self-Reflection and Structured Rewards

Huahui Yi, Kun Wang, Haolong Hu, Moayad Aloqaily, Liang Lin, Junhao Dong, Qiankun Li, Xing Fan, Hao Wu, Yang Liu, and Qingsong Wen. Corresponding Author

TPAMI, Under Review


Frequency-Aligned Knowledge Distillation for Lightweight Spatiotemporal Forecasting

Yuqi Li, Chuanguang Yang, Hansheng Zeng, Zeyu Dong, Zhulin An, Yongjun Xu, Yingli Tian, and Hao Wu. Corresponding Author

ICCV 2025


Service

Reviewer: ICLR, KDD, NeurIPS, ICCV, AAAI, TKDE, ICML, and ACM MM

Research Areas: Multimodal large models, video world models, and agent systems