Hi there! I am a founding engineer of RadixArk, a CS PhD student on leave, a proud member of lmsys.org, and a modest learner of RL systems. I live near Los Angeles with my girlfriend and two cats.
I feel honored to be part of the SGLang team and to coordinate our RL Group to facilitate large-scale RL. We always welcome self-motivated engineers and researchers to join us, and we try our best to refer them to new opportunities.
I spent one year at UCLA, where I was fortunate to be advised by Prof. Quanquan Gu, Dr. Ying Sheng, and Dr. Lianmin Zheng.
I obtained my Bachelor's degree from Tsinghua University. During that time, I was supervised by Prof. Graham Neubig and Prof. Tongshuang Wu and mentored by Vijay Viswanathan at CMU. Their research guidance and taste have been invaluable to me.
Post-training Pipelines
Machine Learning Systems
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning
Prompt2Model: Generating Deployable Models from Natural Language Instructions
Steel-string Acoustic Guitar 民谣吉他
Swimming, Running and Hiking
Cooking Szechwan Cuisine 川菜