I am a fifth-year PhD student in the Department of Computer Science and Technology at Tsinghua University, advised by Prof. Jie Tang at the Knowledge Engineering Group.
My research interest lies in various aspects of pretrained language models, especially how to improve their general intelligence [ACL'21, ACL'21, NeurIPS'24, ICLR'25]. More broadly, I am interested in applying machine learning algorithms to real-world systems, including information retrieval [ECML/PKDD'18, TKDE, SIGIR'19], recommender systems [KDD'19], and knowledge graphs [TKDE].
In my spare time, I like to read science fiction novels and history books.
PhD in Computer Science, 2020 - Now
Tsinghua University
B.E. in Computer Science, 2016 - 2020
Tsinghua University
[23/01/25] Our paper Scaling Speech-Text Pre-training with Synthetic Interleaved Data is accepted at ICLR 2025.
[25/10/24] We release GLM-4-Voice, an end-to-end speech chat model that supports both English and Chinese.
[26/09/24] Our paper Understanding Emergent Abilities of Language Models from the Loss Perspective is accepted at NeurIPS 2024.
[05/06/24] We release GLM-4-9B, which outperforms Llama-3-8B and supports a context length of up to 128K tokens.
[27/10/23] We release ChatGLM3-6B, the most powerful base model under 10 billion parameters, with support for tool use, a code interpreter, and agent tasks.