Education

University of Illinois at Urbana-Champaign

Ph.D. Candidate in Computer Science • Aug. 2016 — May 2021 (expected)

  • Member of the Data Mining Group. Supervised by Professor Jiawei Han.
  • Received the Brian Totty Graduate Fellowship from the Department of Computer Science.

Shanghai Jiao Tong University (SJTU)

B.S.E. in Computer Science and Technology • Sep. 2012 — Jun. 2016

  • Overall GPA: 3.92/4.0 (91.71/100)     Major GPA: 3.98/4.0 (93.78/100)     Rank: 1/78
  • Member of the IEEE Honor Class, an elite program at SJTU modeled on MIT’s educational approach that aims to nurture scientists in computer science, electrical and electronic technology, and information science.

Yale University

International Exchange Student • Jun. 2014 — Aug. 2014

  • One of 10 top students fully funded by notable alumnus Neil Shen.
  • Studied in the Intensive English Program at the English Language Institute.
  • Earned “Certificate of Excellence” at Yale University.

Publications

2017

  • J. Shen, Z. Wu, D. Lei, J. Shang, X. Ren, J. Han, "SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble", accepted to the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2017). [PDF]
  • X. Ren, J. Shen, M. Qu, X. Wang, Z. Wu, Q. Zhu, M. Jiang, F. Tao, S. Sinha, D. Liem, P. Ping, R. Weinshilboum, J. Han, "Life-iNet: A Structured Network-Based Knowledge Exploration and Analytics System for Life Sciences", accepted to the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), System Demo. [PDF]

2016

  • J. Shen, Y. Jia, X. Liu, Y. Huang, L. Fu, X. Wang, "Overlapping Community Detection in Temporal Text Networks", submitted to the Tenth ACM International Conference on Web Search and Data Mining (WSDM 2017). [PDF]
  • J. He, Y. Huang, C. Liu, J. Shen, Y. Jia, X. Wang, "Text network exploration via heterogeneous web of topics", accepted to the Sixth IEEE ICDM Workshop on Data Mining in Networks (ICDM 2016). [PDF] [Demo]

2015

  • J. Shen, Z. Song, S. Li, Z. Tan, Y. Mao, L. Fu, L. Song, X. Wang, "Modeling Topic-level Academic Influence in Scientific Literatures", accepted to the Thirtieth AAAI Conference on Artificial Intelligence (AAAI 2016) Workshop on Scholarly Big Data. [PDF] [Slide]
  • J. Shen, Z. Tan, L. Fu, X. Wang, "Trend Analysis of Top-tier Conferences in Computer Network Field", published in Communications of the China Computer Federation, 11(9), 62-66, 2015. [PDF]
  • Z. Tan, C. Liu, Y. Mao, J. Shen, B. Wang, L. Fu, L. Song, X. Wang, "AceMap: A Novel Approach towards Displaying Relationship among Academic Literatures", accepted to the 25th International World Wide Web Conference (WWW 2016). [PDF] [System]

Research

Multi-Task Learning for Personal Search with Query Clustering

Google Internship • May 2017 — Aug. 2017

  • Developed a hierarchical query clustering algorithm using truncated SVD and varimax rotation over query/document attributes in personal search.
  • Proposed a query-dependent deep neural ranking model based on the multi-task learning framework.
  • Improved offline Gmail ranking quality by 0.8% in terms of MRR and 1.32% in terms of success@1.

Entity Set Aware Information Retrieval System

Supervised by Prof. Jiawei Han • Aug. 2017 — Present

  • Propose an unsupervised ranking algorithm based on an entity language model for biomedical literature search.
  • Parse over 100 GB of PubTator and PubMed data, and build a real-time system based on Elasticsearch.

Overlapping Community Detection in Text Network

Supervised by Prof. Xinbing Wang • Feb. 2016 — Jul. 2016

  • Generated 32 large text networks based on Microsoft Academic Graph and studied their community structures.
  • Proposed an affiliation graph model that captures community interactions and uses information from both link structures and node attributes.
  • Achieved a 40% improvement in the accuracy of detected communities on 17 real networks.

Topic-based Academic Information Retrieval System

Supervised by Prof. Xinbing Wang • Sep. 2015 — Jun. 2016

  • Managed a group of 7 people developing a topic-based search engine. Refactored and configured Solr, an open-source enterprise search platform.
  • Returned paper search results based on both word-level and topic-level similarity to the user’s query.
  • Ranked papers according to their influence scores as well as their relevance to the query.
  • Visualized the topic distribution of each paper and topic evolution across the whole corpus.

Modeling Academic Influence in Scientific Literatures

Supervised by Prof. Xinbing Wang • Apr. 2015 — Sep. 2015

  • Devised a generative model named Reference Topic Model (RefTM) that uses both the textual content and the citation information in scientific literature.
  • Proposed a fast inference algorithm based on collapsed Gibbs sampling to learn RefTM effectively.
  • Introduced a quantitative metric named J-Index to model academic influence in scientific literature.
  • Designed experiments on a collection of over 420,000 research papers to validate the effectiveness of J-Index.

Exponential Interest Aggregation in Named Data Networking

Supervised by Prof. Weijia Jia • Apr. 2015 — Jul. 2015

  • Proposed Exponential Interest Aggregation (EIA), an adaptive forwarding strategy addressing the hop-by-hop congestion control problem in Named Data Networking (NDN).
  • Established the Interest aggregation state transition framework in NDN, and analyzed the effectiveness of the EIA algorithm mathematically under this framework.
  • Conducted simulations to evaluate the performance of EIA, and showed that EIA improved average delay by 13%, average number of retransmissions by 25%, and cache hit ratio by 61%.

Projects

Accent Classification of English Speakers

This work was advised by Prof. Kai Yu.

  • Solved the accent classification problem in four steps: word segmentation, feature extraction, clip classification, and recording classification.
  • Built a system with Mel-Frequency Cepstral Coefficients (MFCC) as the feature vector and logistic regression as the classification method, and achieved an overall classification accuracy of 97%.

PM2.5 Concentration Prediction using Time Series based Data Mining

This work was advised by Prof. Bo Yuan.

  • Formalized the PM2.5 prediction problem, an important issue in the control and reduction of pollutants in the air.
  • Applied three methods to the PM2.5 prediction problem: the AutoRegressive-Moving-Average (ARMA) model, the Stochastic Volatility (SV) model, and the Stock-Watson (SW) model.
  • Designed a novel model combining the Stock-Watson model with a time series neural network to achieve better prediction accuracy for PM2.5 concentration in the next 6 hours.

A Map-Generating and Speed-Optimizing Driving System

This work was advised by Prof. Xinbing Wang.

  • Designed and implemented a traffic signal schedule inference model to estimate the duration of traffic lights.
  • Integrated this model into a speed-optimizing driving system named CityDrive, which reduced the total waiting time per vehicle by 98.8% in simulation and saved 58.8% of kinetic energy in real tests.

Static Website Construction

  • Designed and constructed the IEEE Honor Class home page, using Bootstrap 3, plain HTML/CSS, and JavaScript.
  • Developed the Shanghai Jiao Tong University president's home page (Chinese version), based on Joomla! CMS.
  • Created several static personal home pages and static blogs, using GitHub Pages combined with Jekyll.

Selected Awards

  • Brian Totty Graduate Fellowship • 2016 -- 2017
  • Tang Lixin Scholarship (Top 1%) • 2016 -- 2021
  • Chuntsung Scholarship (Top 1%) • 2015 -- 2016
  • National Scholarship (Top 1%) • 2013 -- 2014 & 2014 -- 2015
  • Academic Excellence Scholarship (Type A) of SJTU (Top 1%) • 2013 -- 2014 & 2014 -- 2015
  • Arawana Scholarship (Top 5%) • 2012 -- 2013
  • “Merit Student” of SJTU • 2012 -- 2013 & 2013 -- 2014
  • Honorable Mention in the Interdisciplinary Contest in Modeling • 2015
  • Second Prize in China Undergraduate Mathematical Contest in Modeling (Shanghai Division) • 2014
  • Third Prize in National Undergraduate Physics Competition (Shanghai Division) • 2013

Selected Extra-curricular Activities

  • Leader of IEEE Honor Class • 2012 -- 2016
  • Personal Tutor of Mathematical Analysis Course • 2012 -- 2013
  • Teaching Assistant of Programming and Data Structures Course • 2013 -- 2014
  • Volunteer for Shanghai International Marathon • 2013 -- 2014
  • Member of the Student Union of the School of Electronic Information and Electrical Engineering • 2013 -- 2015

Skills

English

  • TOEFL 109 (Reading: 30, Listening: 29, Speaking: 26, Writing: 24)
  • GRE 328 (Verbal: 158, Quantitative: 170, Analytical Writing: 3.5)
  • Passed the Advanced-Level English Interpretation Accreditation Examination

Computer Languages

  • Python, R, Mathematica, C++, MATLAB, HTML5 & CSS3, JavaScript

Tools

  • Git, LaTeX, Vim, Linux, Keynote, MS Office, OmniGraffle