University of Illinois at Urbana-Champaign

Ph.D. Candidate in Computer Science • Aug. 2016 — May 2021 (expected)

  • Member of the Data Mining Group. Supervised by Professor Jiawei Han.
  • Received the Brian Totty Graduate Fellowship from the Computer Science Department.

Shanghai Jiao Tong University (SJTU)

B.S.E. in Computer Science and Technology • Sep. 2012 — Jun. 2016

  • Overall GPA: 3.92/4.0 (91.71/100)     Major GPA: 3.98/4.0 (93.78/100)     Rank: 1/78
  • Member of the IEEE Honor Class, an elite program at SJTU that aims to nurture scientists in computer science, electrical and electronic technology, and information science, based on MIT’s educational model.

Yale University

International Exchange Student • Jun. 2014 — Aug. 2014

  • One of 10 top students fully funded by notable alumnus Neil Shen.
  • Studied in the Intensive English Program at the English Language Institute.
  • Earned “Certificate of Excellence” at Yale University.



  • Jiaming Shen, Maryam Karimzadehgan, Michael Bendersky, Zhen Qin, and Donald Metzler, "Multi-Task Learning for Personal Search Ranking with Auxiliary Query Clustering", accepted into The 27th ACM International Conference on Information and Knowledge Management (CIKM 2018).
  • Jiaming Shen, Zeqiu Wu, Dongming Lei, Chao Zhang, Xiang Ren, Michelle T. Vanni, Brian M. Sadler, and Jiawei Han, "HiExpan: Task-Guided Taxonomy Construction by Hierarchical Tree Expansion", accepted into The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2018).
  • Jiaming Shen, Jinfeng Xiao, Xinwei He, Jingbo Shang, Saurabh Sinha, and Jiawei Han, "Entity Set Search of Scientific Literature: An Unsupervised Ranking Approach", accepted into The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018).
  • Jiaming Shen, Jinfeng Xiao, Yu Zhang, Carl Yang, Jingbo Shang, Jinda Han, Saurabh Sinha, Peipei Ping, Richard Weinshilboum, Zhiyong Lu and Jiawei Han, "SetSearch+: Entity-Set-Aware Search and Mining for Scientific Literature", accepted into The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2018 Demo Track).
  • Yu Meng, Jiaming Shen, Chao Zhang, and Jiawei Han, "Weakly-Supervised Neural Text Classification", accepted into The 27th ACM International Conference on Information and Knowledge Management (CIKM 2018).
  • Jingbo Shang, Jiaming Shen, Tianhang Sun, Xingbang Liu, Anja Gruenheid, Flip Korn, Adam Lelkes, Cong Yu, and Jiawei Han, "Investigating Rumor News Using Agreement Aware Search", accepted into The 27th ACM International Conference on Information and Knowledge Management (CIKM 2018).
  • Hanwen Zha, Jiaming Shen, Keqian Li, Warren Greiff, Michelle Vanni, Jiawei Han and Xifeng Yan, "FTS: Faceted Taxonomy Construction and Search for Scientific Publications", accepted into The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2018 Demo Track).
  • Jingbo Shang, Qi Zhu, Jiaming Shen, Xuan Wang, Xiaotao Gu, Lance Kaplan, Timothy Hanratty and Jiawei Han, "AutoNet: Automated Network Construction and Exploration System from Domain-Specific Corpora", accepted into The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2018 Demo Track).
  • Yuning Mao, Xiang Ren, Jiaming Shen, and Jiawei Han, "End-to-End Reinforcement Learning for Automatic Taxonomy Induction", accepted into The 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018).
  • Jingbo Shang, Chao Zhang, Jiaming Shen, and Jiawei Han, "Towards Multidimensional Analysis of Text Corpora", accepted into The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2018).
  • Chao Zhang, Fangbo Tao, Xiusi Chen, Jiaming Shen, Meng Jiang, Brian M. Sadler, Michelle T. Vanni, and Jiawei Han, "TaxoGen: Constructing Topical Concept Taxonomy by Adaptive Term Embedding and Clustering", accepted into The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2018).
  • David A. Liem, Sanjana Murali, Dibakar Sigdel, Yu Shi, Xuan Wang, Jiaming Shen, Howard Choi, J Harry Caufield, Wei Wang, Peipei Ping, and Jiawei Han. "Phrase Mining of Textual Data to Analyze Extracellular Matrix Protein Patterns Across Cardiovascular Disease", accepted into American Journal of Physiology-Heart and Circulatory Physiology.


  • Jiaming Shen, Zeqiu Wu, Dongming Lei, Jingbo Shang, Xiang Ren, and Jiawei Han, "SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble", accepted into The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2017).
  • Xiang Ren, Jiaming Shen, Meng Qu, Xuan Wang, Zeqiu Wu, Qi Zhu, Meng Jiang, Fangbo Tao, Saurabh Sinha, David Liem, Peipei Ping, Richard Weinshilboum, and Jiawei Han, "Life-iNet: A Structured Network-Based Knowledge Exploration and Analytics System for Life Sciences", accepted into The 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017 System Demo).


  • Junxian He, Yin Huang, Changfeng Liu, Jiaming Shen, Yuting Jia, and Xinbing Wang, "Text network exploration via heterogeneous web of topics", accepted into the Sixth IEEE ICDM Workshop on Data Mining in Networks (ICDM 2016 Workshop).


  • Jiaming Shen, Zhenyu Song, Shitao Li, Zhaowei Tan, Yuning Mao, Luoyi Fu, Li Song, and Xinbing Wang, "Modeling Topic-level Academic Influence in Scientific Literatures", accepted into the Thirtieth AAAI Conference on Artificial Intelligence (AAAI 2016) Workshop on Scholarly Big Data.
  • Jiaming Shen, Zhaowei Tan, Luoyi Fu, and Xinbing Wang, "Trend Analysis of Top-tier Conferences in Computer Network Field", published in Communications of the China Computer Federation, 11(9), 62-66, 2015.
  • Zhaowei Tan, Changfeng Liu, Yuning Mao, Jiaming Shen, Bin Wang, Luoyi Fu, Li Song, and Xinbing Wang, "AceMap: A Novel Approach towards Displaying Relationship among Academic Literatures", accepted into the 25th International World Wide Web Conference (WWW 2016).


Multi-Task Learning for Personal Search with Query Clustering

Google Intern Work • May 2017 — Aug. 2017

  • Developed a hierarchical query clustering algorithm for personal search, using truncated SVD and varimax rotation over query/document attributes.
  • Proposed a query-dependent deep neural ranking model based on the multi-task learning framework.
  • Improved offline Gmail ranking quality by 0.8% in terms of MRR and 1.32% in terms of success@1.

Entity Set Aware Information Retrieval System

Supervised by Prof. Jiawei Han • Aug. 2017 — Present

  • Propose an unsupervised ranking algorithm based on an entity language model for biomedical literature search.
  • Parse over 100 GB of PubTator and PubMed data, and build a real-time system based on Elasticsearch.

Overlapping Community Detection in Text Network

Supervised by Prof. Xinbing Wang • Feb. 2016 — Jul. 2016

  • Generated 32 large text networks based on Microsoft Academic Graph and studied their community structures.
  • Proposed an affiliation graph model to capture community interactions and consider information from both link structures and node attributes.
  • Achieved a 40% improvement in the accuracy of detected communities on 17 real networks.

Topic-based Academic Information Retrieval System

Supervised by Prof. Xinbing Wang • Sep. 2015 — Jun. 2016

  • Managed a group of 7 people developing a topic-based search engine. Refactored and configured Solr, an open-source enterprise search platform.
  • Returned paper search results based on both word-level and topic-level similarity to the user’s query.
  • Ranked papers according to their influence scores as well as their relevance to the query.
  • Visualized the topic distribution of each paper and topic evolution across the whole corpus.

Modeling Academic Influence in Scientific Literatures

Supervised by Prof. Xinbing Wang • Apr. 2015 — Sep. 2015

  • Devised a generative model named Reference Topic Model (RefTM) to utilize both the textual content and citation information in scientific literature.
  • Proposed a fast inference algorithm based on collapsed Gibbs sampling to learn RefTM effectively.
  • Introduced a quantitative metric named J-Index to model academic influence in scientific literature.
  • Designed experiments on a collection of over 420,000 research papers to validate the effectiveness of J-Index.

Exponential Interest Aggregation in Named Data Networking

Supervised by Prof. Weijia Jia • Apr. 2015 — Jul. 2015

  • Proposed Exponential Interest Aggregation (EIA), an adaptive forwarding strategy addressing the hop-by-hop congestion control problem in Named Data Networking (NDN).
  • Established the Interest aggregation state transition framework in NDN, and analyzed the effectiveness of the EIA algorithm mathematically under this framework.
  • Conducted simulations to evaluate the performance of EIA, and showed that EIA improved average delay by 13%, average number of retransmissions by 25%, and cache hit ratio by 61%.


Accent Classification of English Speakers

This work was advised by Prof. Kai Yu.

  • Solved the accent classification problem in four steps: word segmentation, feature extraction, clip classification, and recording classification.
  • Built a system with Mel-Frequency Cepstral Coefficients (MFCC) as the feature vector and Logistic Regression as the classification method, and achieved an overall classification accuracy of 97%.

PM2.5 Concentration Prediction using Time Series based Data Mining

This work was advised by Prof. Bo Yuan.

  • Formalized the PM2.5 prediction problem, an important issue in the control and reduction of pollutants in the air.
  • Applied three methods to the PM2.5 prediction problem: the AutoRegressive-Moving-Average (ARMA) model, the Stochastic Volatility (SV) model, and the Stock-Watson (SW) model.
  • Designed a novel model combining the Stock-Watson model with a Time Series Neural Network to achieve better prediction accuracy for PM2.5 concentration over the next 6 hours.

A Map-Generating and Speed-Optimizing Driving System

This work was advised by Prof. Xinbing Wang.

  • Designed and implemented a traffic signal schedule inference model to estimate the duration of traffic lights.
  • Integrated this model into a speed-optimizing driving system named CityDrive, which reduced the total waiting time per vehicle by 98.8% in the simulation and saved 58.8% of kinetic energy in real tests.

Static Website Construction

  • Designed and constructed the IEEE Honor Class home page, using Bootstrap 3, plain HTML/CSS, and JavaScript.
  • Developed the Shanghai Jiao Tong University president's home page (Chinese version), based on Joomla! CMS.
  • Created several static personal home pages and static blogs, using GitHub Pages combined with Jekyll.

Selected Awards

  • Brian Totty Graduate Fellowship • 2016 -- 2017
  • Tang Lixin Scholarship (Top 1%) • 2016 -- 2021
  • Chuntsung Scholarship (Top 1%) • 2015 -- 2016
  • National Scholarship (Top 1%) • 2013 -- 2014 & 2014 -- 2015
  • Academic Excellence Scholarship (Type A) of SJTU (Top 1%) • 2013 -- 2014 & 2014 -- 2015
  • Arawana Scholarship (Top 5%) • 2012 -- 2013
  • “Merit Student” of SJTU • 2012 -- 2013 & 2013 -- 2014
  • Honorable Mention in the 2015 Interdisciplinary Contest in Modeling • 2015
  • Second Prize in China Undergraduate Mathematical Contest in Modeling (Shanghai Division) • 2014
  • Third Prize in National Undergraduate Physics Competition (Shanghai Division) • 2013

Selected Extra-curricular Activities

  • Leader of IEEE Honor Class • 2012 -- 2016
  • Personal Tutor of Mathematical Analysis Course • 2012 -- 2013
  • Teaching Assistant of Programming and Data Structures Course • 2013 -- 2014
  • Volunteer for Shanghai International Marathon • 2013 -- 2014
  • Member of the Student Union of the School of Electronic Information and Electrical Engineering • 2013 -- 2015



  • TOEFL 109 (Reading: 30, Listening: 29, Speaking: 26, Writing: 24)
  • GRE 328 (Verbal: 158, Quantitative: 170, Analytical Writing: 3.5)
  • Passed Advanced-Level English Interpretation Accreditation Examination

Computer Languages

  • Python, R, Mathematica, C++, MATLAB, HTML5&CSS3, JavaScript


  • Git, LaTeX, Vim, Linux, Keynote, MS Office, OmniGraffle