Bio
I am currently a final-year PhD student working with Prof. Huan Sun in the Department of Computer Science and Engineering at The Ohio State University (OSU).
I have broad research interests in Natural Language Processing (NLP).
Specifically, my research aims to build safe, responsible and reliable large language models (LLMs), which
- ensure faithfulness to factual world knowledge and truth
- generalize well to various unseen environments
- safeguard user data privacy
I also have extensive experience building LLMs for different applications (e.g., Question Answering).
I interned at Microsoft Research (Redmond) in summer 2022 and at Tencent AI Lab (Bellevue) in summer 2021.
I'm looking for full-time positions (I will graduate in Summer 2023)!
Feel free to drop me an email if you have openings!
What's New
- [May 2023] Check out our new preprint on Automatic
Evaluation of Attribution by Large Language Models
- [May 2023] I'm honored to receive two research awards: 2023 CSE Graduate Research
Award and
2023 College of Engineering Exemplary Graduate Student Researcher
- [May 2023] Our paper on Synthetic Text Generation with Differential Privacy was accepted to the ACL 2023 main conference
- [June 2022] Our OSU TacoBot team earned third place ($50K) in the first Alexa Prize TaskBot Challenge!
In May 2021, 10 teams were selected worldwide out of 125 initiated applications to participate in the challenge, and 5 teams advanced to the finals in April 2022. We are the only US team among the top 3 performers!
Check out our report here.
- [Mar 2022] - Two recent papers on question answering were accepted to the ACL 2022 main conference: "Synthetic Question Value Estimation for Domain Adaptation of Question Answering" and "C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References"
- [Mar 2022] - I will join Microsoft Research to explore NLP+Privacy for my 2022 summer
internship!
- [Dec 2021] - Our paper "CliniQG4QA: Generating Diverse
Questions for Domain Adaptation of Clinical Question Answering" has received the IEEE BIBM 2021 Best Paper Award!
- [Aug 2021] - Our short paper "COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval" has been accepted to the EMNLP 2021 main conference!
- [May 2021] - Our team has been selected for the Alexa Prize TaskBot Challenge as one of 10 teams out of 125 applications initiated from 15 countries!
We will build a smart dialogue system to help users complete Cooking and DIY tasks.
- [May 2021] - Our long paper "Differential Privacy for Text Analytics via Natural Text Sanitization" has been accepted to Findings of ACL-IJCNLP 2021! We propose a privacy-preserving NLP pipeline consisting of DP-based text sanitization mechanisms and sanitization-aware language model pretraining and finetuning.
- [Sept 2020] - Our paper "PHICON: Improving Generalization of Clinical Text De-identification
Models via Data Augmentation" has been accepted to EMNLP'20 Clinical NLP Workshop!
- [July 2020] - Attended ACL 2020 and presented our Clinical Reading Comprehension work. Check out our slides and video
- [April 2020] - Our paper "Clinical Reading Comprehension: A Thorough Analysis of the emrQA
Dataset" has been accepted to ACL 2020!
We conduct a comprehensive study of the Clinical Reading Comprehension task based on the recently released emrQA dataset!
Last Updated: 08/2021