Xiang (Tommy) Yue (岳 翔)

PhD student, The Ohio State University, OH, U.S.

Email: yue.149 AT osu DOT edu


Bio

I am a final-year PhD student working with Prof. Huan Sun in the Department of Computer Science and Engineering at The Ohio State University (OSU). My research interests lie in Natural Language Processing (NLP), with an emphasis on Question Answering and Privacy-preserving NLP. In summer 2022, I interned at Microsoft Research (Redmond), studying synthetic text generation with privacy guarantees. In summer 2021, I was a research intern at Tencent AI Lab (Seattle), where I worked on open-domain QA pre-training.

I'm looking for full-time positions (I will graduate in 2023)! Feel free to drop me an email if you have openings! CV (Sept. 2022)


What's New

  • [Oct 2022] - Check out our new preprint on Synthetic Text Generation with Differential Privacy.
  • [June 2022] - Our OSU TacoBot team earned third place ($50K) in the first Alexa Prize TaskBot Challenge! Ten teams worldwide were selected from 125 applications to participate in the challenge in May 2021, and 5 teams advanced to the finals in April 2022. We are the only US team among the top three! Check out our report here.
  • [Mar 2022] - Two recent papers on question answering were accepted to the ACL 2022 main conference: "Synthetic Question Value Estimation for Domain Adaptation of Question Answering" and "C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References".
  • [Mar 2022] - I will join Microsoft Research for a summer 2022 internship to explore NLP+Privacy!
  • [Dec 2021] - Our paper "CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering" has received the IEEE BIBM 2021 Best Paper Award!
  • [Aug 2021] - Our short paper "COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval" has been accepted to the EMNLP 2021 main conference!
  • [May 2021] - Our team has been selected for the Alexa Prize TaskBot Challenge as one of 10 teams out of 125 applications from 15 countries! We will build a smart dialogue system to help users complete cooking and DIY tasks.
  • [May 2021] - Our long paper "Differential Privacy for Text Analytics via Natural Text Sanitization" has been accepted to Findings of ACL-IJCNLP 2021! We propose a privacy-preserving NLP pipeline consisting of DP-based text sanitization mechanisms and sanitization-aware language model pretraining and fine-tuning.
  • [Sept 2020] - Our paper "PHICON: Improving Generalization of Clinical Text De-identification Models via Data Augmentation" has been accepted to the EMNLP 2020 Clinical NLP Workshop!
  • [July 2020] - I attended ACL 2020 and presented our Clinical Reading Comprehension work. Check out our slides and video.
  • [April 2020] - Our paper "Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset" has been accepted to ACL 2020! We conduct a comprehensive study of the Clinical Reading Comprehension task based on the recently released emrQA dataset.

Last Updated: 08/2021