I am Shuzheng Si (司书正 in Chinese), currently a first-year CS Ph.D. student at Tsinghua University. I am lucky to be advised by Prof. Maosong Sun and am affiliated with the TsinghuaNLP Lab. Previously, I completed my master's degree at Peking University, where I was very fortunate to be supervised by Prof. Baobao Chang at the Institute of Computational Linguistics. I spent my sweet undergraduate days at the School of Software (rank: 1/307), Yunnan University, which is a very beautiful university.
My current research interests lie in Natural Language Processing and Large Language Models, with a focus on Data-centric Methods such as Data Selection, Data Synthesis, and Learning from Noisy Data. My long-term research goal is to open the black box of data influence in LLMs and to improve the performance of LLMs using high-quality (organized, selected, or synthesized) data. You can find my up-to-date publication list on Google Scholar.
Feel free to drop me an email if you are interested in connecting.