I am currently a final-year PhD student working with Prof. Huan Sun in the Department of Computer Science and Engineering at The Ohio State University (OSU).
I have broad research interests in Natural Language Processing (NLP).
Specifically, my research aims to build safe, responsible, and reliable large language models that:
- ensure faithfulness to factual world knowledge and truth
- generalize well to various unseen environments
- safeguard user data privacy
I also have extensive experience building LLMs for different applications (e.g., Question Answering).
I interned at Microsoft Research (Redmond) in summer 2022 and at Tencent AI Lab (Bellevue) in 2021.
I'm looking for full-time positions (I will graduate in Summer 2023)!
Feel free to drop me an email if you have openings!
- [May 2023] Check out our new preprint on Automatic Evaluation of Attribution by Large Language Models.
- [May 2023] I'm honored to receive two research awards: 2023 CSE Graduate Research
2023 College of Engineering Exemplary Graduate Student Researcher
- [May 2023] Our paper on Synthetic Text Generation with Differential Privacy was accepted to the ACL 2023 main conference.
- [June 2022] Our OSU TacoBot team earned the third-place honor ($50K) in the first Alexa Prize TaskBot Challenge!
10 teams were selected worldwide out of 125 initiated applications to participate in the challenge in May 2021, and 5 teams were selected for the finals in April 2022. We are the only US team in the top 3.
Check out our report here.
- [Mar 2022] - Two recent papers about question answering were accepted to the ACL 2022 main conference: "Synthetic Question Value Estimation for Domain Adaptation of Question Answering" and "C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References".
- [Mar 2022] - I will join Microsoft Research in summer 2022 to explore NLP+Privacy.
- [Dec 2021] - Our paper "CliniQG4QA: Generating Diverse
Questions for Domain Adaptation of Clinical Question Answering" has received the IEEE BIBM 2021 Best Paper Award!
- [Aug 2021] - Our short paper "COUGH: A Challenge Dataset and Models for COVID-19 FAQ
has been accepted to EMNLP 2021 main conference!
- [May 2021] - Our team has been selected for the Alexa Prize TaskBot Challenge as one of 10 teams out of 125 applications initiated from 15
We will build a smart dialogue system to help users finish Cooking and DIY tasks.
- [May 2021] - Our long paper "Differential Privacy for Text Analytics via Natural Text
" has been accepted to ACL-IJCNLP 2021, Findings! We propose a privacy-preserving NLP
(which consists of DP-based text sanitization mechanisms, sanitization-aware language model
pretraining and finetuning)
- [Sept 2020] - Our paper "PHICON: Improving Generalization of Clinical Text De-identification
Models via Data Augmentation" has been accepted to EMNLP'20 Clinical NLP Workshop!
- [July 2020] - Attended ACL 2020 and presented our Clinical Reading Comprehension work. Check out the slides and video.
- [April 2020] - Our paper "Clinical Reading Comprehension: A Thorough Analysis of the emrQA
Dataset" has been accepted to ACL 2020!
We conduct a comprehensive study on the Clinical Reading Comprehension task based on the recently released emrQA dataset!
Last Updated: 08/2021