Bio
I am currently a final-year PhD student working with Prof. Huan Sun in the Department of
Computer Science and Engineering at The Ohio State University (OSU). My research interests
lie in Natural Language Processing (NLP), with an emphasis on Question Answering and
Privacy-preserving NLP.
I interned at Microsoft Research (Redmond) in summer 2022, studying synthetic text generation with
privacy guarantees.
In summer 2021, I was a research intern at Tencent AI Lab (Seattle), where I worked on open-domain
QA pre-training.
I'm looking for full-time positions (I will graduate in 2023)!
Feel free to drop me an email if you have openings! CV (Sept. 2022)
What's New
- [Oct 2022] - Check out our new preprint on Synthetic Text Generation with Differential Privacy.
- [June 2022] - Our OSU TacoBot team earned the third-place honor ($50K) in the first Alexa Prize TaskBot Challenge!
Ten teams worldwide were selected from 125 applications to participate in the challenge in May 2021,
and 5 teams advanced to the finals in April 2022. We are the only US team among the top three!
Check out our report here.
- [Mar 2022] - Two recent papers on question answering were accepted to the ACL 2022 main
conference: "Synthetic Question Value Estimation for
Domain Adaptation of Question Answering" and "C-MORE: Pretraining to Answer Open-Domain
Questions by Consulting Millions of References".
- [Mar 2022] - I will join Microsoft Research to explore NLP+Privacy for my 2022 summer
internship!
- [Dec 2021] - Our paper "CliniQG4QA: Generating Diverse
Questions for Domain Adaptation of Clinical Question Answering" has received the IEEE BIBM 2021 Best Paper Award!
- [Aug 2021] - Our short paper "COUGH: A Challenge Dataset and Models for COVID-19 FAQ
Retrieval" has been accepted to the EMNLP 2021 main conference!
- [May 2021] - Our team has been selected for the Alexa Prize TaskBot Challenge as one of
10 teams out of 125 applications from 15 countries!
We will build a smart dialogue system to help users complete Cooking and DIY tasks.
- [May 2021] - Our long paper "Differential Privacy for Text Analytics via Natural Text
Sanitization" has been accepted to Findings of ACL-IJCNLP 2021! We propose a privacy-preserving
NLP pipeline that consists of DP-based text sanitization mechanisms and sanitization-aware
language model pretraining and fine-tuning.
- [Sept 2020] - Our paper "PHICON: Improving Generalization of Clinical Text De-identification
Models via Data Augmentation" has been accepted to EMNLP'20 Clinical NLP Workshop!
- [July 2020] - Attended ACL 2020 and presented our Clinical Reading Comprehension work. Check
out our slides and video.
- [April 2020] - Our paper "Clinical Reading Comprehension: A Thorough Analysis of the emrQA
Dataset" has been accepted to ACL 2020!
We conduct a comprehensive study on the Clinical Reading Comprehension task based on the
recently-released emrQA dataset!
Last Updated: 08/2021