NLP (Question Answering, Clinical NLP, Privacy-preserving NLP)
*: Equal Contributions
-
Synthetic Question Value Estimation for Domain Adaptation of Question Answering
Xiang Yue, Ziyu Yao, Huan Sun
60th Annual Meeting of the Association for Computational Linguistics (ACL 2022 Main Conference)
[arXiv version] [Code]
-
C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References
Xiang Yue, Xiaoman Pan, Wenlin Yao, Dian Yu, Dong Yu, Jianshu Chen
60th Annual Meeting of the Association for Computational Linguistics (ACL 2022 Main Conference)
[arXiv version] [Code] [Dataset]
-
CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering
Xiang Yue*, Frederick Zhang*, Ziyu Yao, Simon Lin, Huan Sun
IEEE International Conference on Bioinformatics and Biomedicine 2021 (BIBM 2021)
(Best Paper Award) [arXiv version] [Code]
Poster Version in Machine Learning for Health Workshop at NeurIPS 2020
-
COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval
Frederick Zhang, Heming Sun, Xiang Yue, Simon Lin and Huan Sun
The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)
[Dataset]
-
Differential Privacy for Text Analytics via Natural Text Sanitization
Xiang Yue*, Minxin Du*, Tianhao Wang, Yaliang Li, Huan Sun and Sherman S. M. Chow
The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021, Findings, Long Paper)
-
Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset
Xiang Yue, Bernal Jimenez Gutierrez and Huan Sun
The 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020)
[arXiv version] [Code] [Slides & Video]
-
Clinical Phrase Mining with Language Models
Kaushik Mani*, Xiang Yue*, Bernal Jimenez Gutierrez, Yungui Huang, Simon Lin, and Huan Sun
IEEE International Conference on Bioinformatics and Biomedicine 2020 (BIBM 2020)
[arXiv extended version] [Code]
-
PHICON: Improving Generalization of Clinical Text De-identification Models via Data Augmentation
Xiang Yue and Shuang Zhou
The 3rd Clinical Natural Language Processing Workshop at EMNLP 2020
[arXiv version] [Code]
-
Practical Annotation Strategies for Question Answering Datasets
Bernhard Kratzwald, Xiang Yue, Huan Sun and Stefan Feuerriegel
arXiv Preprint
Data Mining (Graph Embedding, Graph Mining, Bioinformatics)
-
Graph Embedding on Biomedical Networks: Methods, Applications, and Evaluations
Xiang Yue, Zhen Wang, Jingong Huang, Srinivasan Parthasarathy, Soheil Moosavinasab, Yungui Huang, Simon M. Lin, Wen Zhang, Ping Zhang and Huan Sun
Bioinformatics (Vol 36 Issue 4, 15 Feb 2020, Page 1241-1251) (Impact Factor: 4.531)
(ESI Highly Cited Paper: top 1% cited paper of its academic field)
[arXiv version] [Code & Datasets]
-
SurfCon: Synonym Discovery on Privacy-Aware Clinical Data
Zhen Wang, Xiang Yue, Soheil Moosavinasab, Yungui Huang, Simon Lin and Huan Sun
The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2019 (KDD 2019, research track, acceptance rate: ~110/~1200=9.2%, oral)
[Code] [Slides]