Short Bio

I am a fifth-year Ph.D. student in Prof. Jiawei Han's Data Mining Group, Department of Computer Science, University of Illinois at Urbana-Champaign. Prior to UIUC, I received my bachelor's degree from Shanghai Jiao Tong University, under the supervision of Prof. Xinbing Wang, and I was a member of the IEEE Honored Class.

My research lies at the intersection of data mining and natural language processing. Specifically, I focus on constructing multi-faceted taxonomies from unstructured text corpora, and on using the constructed taxonomies to empower knowledge-centric applications. My proposed data-driven approach, without excessive human annotations, progressively constructs, enriches, and explores multi-faceted taxonomies by utilizing user-provided seed information as weak supervision and existing knowledge repositories as distant supervision.

What's new!

May 2021 - One paper on efficient text encoder pre-training is accepted into the Findings of ACL 2021. A preprint will be released soon.

March 2021 - One paper on Weakly-supervised Hierarchical Multi-Label Text Classification is accepted into NAACL 2021.

Dec. 2020 - One paper on Self-Supervised Taxonomy Completion is accepted into AAAI 2021.

Sept. 2020 - Two papers on Neural Linguistic Steganography and Joint Entity Set Expansion and Synonym Discovery are accepted into EMNLP 2020.

July 2020 - One paper on Multi-Faceted Set Expansion is accepted into ECML-PKDD 2020 and one paper on Personalized Sentiment Analysis is accepted into ICTIR 2020.

May 2020 - One paper on Self-supervised Taxonomy Expansion is accepted into KDD 2020, with its PyTorch implementation available on GitHub.

Apr. 2020 - Our work on Probing Language Model for Entity Set Expansion, in collaboration with Yunyi Zhang, has been accepted into ACL 2020. Its PyTorch implementation is available on GitHub.

Jan. 2020 - My work on Self-supervised Taxonomy Expansion has been accepted into WWW 2020. Its PyTorch implementation is available on GitHub.

Jan. 2020 - Our work on Corpus-based Entity Set Expansion, in collaboration with Jiaxin Huang, has been accepted into WWW 2020.

Areas of Interest

My primary research interests include:

  • Data Mining
  • Natural Language Processing
  • Information Retrieval
  • Applied Machine Learning

I have also worked on:

  • Interactive Data Visualization
  • Data Wrangling
  • Web Development


Address: Room 2119B, Thomas M. Siebel Center,
201 N. Goodwin Avenue,
Urbana, IL 61801 USA.
Email js2[at]