Scaling Speech-Text Pre-training with Synthetic Interleaved Data

Speech language models (SpeechLMs) accept speech input and produce speech output, allowing for more natural human-computer interaction …

Aohan Zeng, Zhengxiao Du, Mingdao Liu, Lei Zhang, Shengmin Jiang, Yuxiao Dong, Jie Tang

Understanding Emergent Abilities of Language Models from the Loss Perspective

Recent studies have put into question the belief that emergent abilities in language models are exclusive to large models. This …

Zhengxiao Du, Aohan Zeng, Yuxiao Dong, Jie Tang

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

There have been various types of pretraining architectures including autoencoding models (e.g., BERT), autoregressive models (e.g., …

Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang

P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory …

Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Policy-Gradient Training of Fair and Unbiased Ranking Functions

While implicit feedback (e.g., clicks, dwell times, etc.) is an abundant and attractive source of data for learning to rank, it can …

Himank Yadav, Zhengxiao Du, Thorsten Joachims

Sequential Scenario-Specific Meta Learner for Online Recommendation

Cold-start problems are long-standing challenges for practical recommendations. Most existing recommendation algorithms rely on …

Zhengxiao Du, Xiaowei Wang, Hongxia Yang, Jingren Zhou, Jie Tang

POLAR: Attention-Based CNN for One-Shot Personalized Article Recommendation

In this paper, we propose POLAR, an attention-based CNN combined with one-shot learning for personalized article recommendation. Given …

Zhengxiao Du, Jie Tang, Yuhui Ding