My Research

The goal of my research is to unlock the knowledge buried in unstructured text data and make it more accessible, interpretable, and reusable. Toward this goal, I propose a novel and generic data model, named multi-faceted taxonomy, which organizes concepts of different facets into hierarchical structures. My research explores the power of multi-faceted taxonomy in three areas of investigation:

  • Construction: To identify important concepts and their taxonomic relations from text corpora, I propose a series of set expansion and topic/word hierarchy construction methods.
  • Enrichment: To keep existing taxonomies up-to-date in real-world applications, I study multiple tasks including synonym (set) discovery, taxonomy expansion, and taxonomy completion.
  • Application: To distill knowledge from multi-faceted taxonomies for downstream applications, I develop methods for weakly-supervised text classification and unsupervised literature search.

Education

University of Illinois at Urbana-Champaign (UIUC)

Ph.D. Candidate in Computer Science • Aug. 2016 — Aug. 2021 (expected)

  • Overall GPA: 4.0/4.0
  • Member of the Data Mining Group. Supervised by Professor Jiawei Han.
  • Received the Brian Totty Graduate Fellowship from the Computer Science Department.
  • Received the Yunni & Maxine Pao Memorial Fellowship.

Shanghai Jiao Tong University (SJTU)

B.S.E. in Computer Science and Technology • Sep. 2012 — Jun. 2016

  • Overall GPA: 3.92/4.0 (91.71/100)     Major GPA: 3.98/4.0 (93.78/100)     Rank: 1/78
  • Member of the IEEE honor class, an elite program at SJTU that aims to nurture scientists in computer science, electrical and electronic technology, and information science, based on MIT’s educational model.

Yale University

International Exchange Student • Jun. 2014 — Aug. 2014

  • One of 10 top students fully funded by notable alumnus Neil Shen.
  • Studied in the Intensive English Program at the English Language Institute.
  • Earned a “Certificate of Excellence” at Yale University.

Industry Experience

Google Research

Research Intern • May 2020 — Dec. 2020

  • Developed a new text encoder pre-training strategy using multiple task-specific attention heads.

Microsoft Research

Research Intern • May 2019 — Aug. 2019

  • Developed a self-supervised taxonomy expansion method with position-aware graph neural networks.

LinkedIn

External Researcher • Sep. 2018 — Oct. 2019

  • Proposed a hypernymy discovery method to exploit the context granularity in a text-rich heterogeneous information network.

Google

Research SWE Intern • May 2017 — Aug. 2017

  • Developed a multi-task learning method for email search ranking with auxiliary query clustering.

Teaching Experience

UIUC CS512: Data Mining Principles

Teaching Assistant • Spring 2020

UIUC CS412: Introduction to Data Mining

Guest Lecturer • Spring 2019

SJTU CS119: Principles of Computer Algorithms

Teaching Assistant • Summer 2015

SJTU CS114: Programming and Data Structures

Teaching Assistant • Spring 2015

SJTU MA118: Mathematical Analysis I

Teaching Assistant • Fall 2013

Services

  • Conference Program Committee: NAACL-HLT 2021, AAAI 2021, ACL 2020, EMNLP 2020, AAAI 2020, ECML-PKDD 2020, EMNLP 2019, ICLR LLD Workshop 2019.
  • Journal Reviewer: IEEE Transactions on Knowledge and Data Engineering (TKDE).
  • Conference External Reviewer: ACL 2021, WebConf 2021, KDD 2020, WebConf 2020, KDD 2019, WSDM 2018, KDD 2018, SIGIR 2017, EMNLP 2017.

Skills

  • Programming Languages: Python, C++, MATLAB, R, Mathematica
  • Deep Learning Platforms: PyTorch, TensorFlow 2
  • Tools: Git, LaTeX, Vim, Linux, Keynote, MS Office, OmniGraffle