Education

University of Illinois at Urbana-Champaign

Ph.D. Candidate in Computer Science • Aug. 2016 — May 2021 (expected)

  • Member of the Data Mining Group. Supervised by Professor Jiawei Han.
  • Received the Brian Totty Graduate Fellowship from the Computer Science Department.

Shanghai Jiao Tong University (SJTU)

B.S.E. in Computer Science and Technology • Sep. 2012 — Jun. 2016

  • Overall GPA: 3.92/4.0 (91.71/100)     Major GPA: 3.98/4.0 (93.78/100)     Rank: 1/78
  • Member of the IEEE Honor Class, an elite program at SJTU that aims to nurture scientists in computer science, electrical and electronic technology, and information science, based on MIT’s educational model.

Yale University

International Exchange Student • Jun. 2014 — Aug. 2014

  • One of 10 top students fully funded by notable alumnus Neil Shen.
  • Studied in the Intensive English Program at the English Language Institute.
  • Earned a “Certificate of Excellence” at Yale University.

Publications

2017

  • J. Shen, Z. Wu, D. Lei, J. Shang, X. Ren, J. Han, "SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble", accepted to the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2017). [PDF]
  • X. Ren, J. Shen, M. Qu, X. Wang, Z. Wu, Q. Zhu, M. Jiang, F. Tao, S. Sinha, D. Liem, P. Ping, R. Weinshilboum, J. Han, "Life-iNet: A Structured Network-Based Knowledge Exploration and Analytics System for Life Sciences", accepted to the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), System Demo. [PDF]

2016

  • J. Shen, Y. Jia, X. Liu, Y. Huang, L. Fu, X. Wang, "Overlapping Community Detection in Temporal Text Networks", submitted to the Tenth ACM International Conference on Web Search and Data Mining (WSDM 2017). [PDF]
  • J. He, Y. Huang, C. Liu, J. Shen, Y. Jia, X. Wang, "Text network exploration via heterogeneous web of topics", accepted to the Sixth IEEE ICDM Workshop on Data Mining in Networks (ICDM 2016). [PDF] [Demo]

2015

  • J. Shen, Z. Song, S. Li, Z. Tan, Y. Mao, L. Fu, L. Song, X. Wang, "Modeling Topic-level Academic Influence in Scientific Literatures", accepted to the Thirtieth AAAI Conference on Artificial Intelligence (AAAI 2016) Workshop on Scholarly Big Data. [PDF] [Slide]
  • J. Shen, Z. Tan, L. Fu, X. Wang, "Trend Analysis of Top-tier Conferences in Computer Network Field", published in Communications of the China Computer Federation, 11(9), 62-66, 2015. [PDF]
  • Z. Tan, C. Liu, Y. Mao, J. Shen, B. Wang, L. Fu, L. Song, X. Wang, "AceMap: A Novel Approach towards Displaying Relationship among Academic Literatures", accepted to the 25th International World Wide Web Conference (WWW 2016). [PDF] [System]

Research

Automatic Construction of Faceted Taxonomy

Supervised by Prof. Jiawei Han • Aug. 2016 — Present

  • Parse the Wikidata knowledge base (over 80 GB) and implement an entity linker to Wikidata by integrating phrasal segmentation on raw text with learning-to-rank techniques on extracted entity features.
  • Apply three methods to construct a faceted taxonomy: local-global co-occurrence analysis, entity-word embeddings, and recursive hierarchical clustering.

Multi-gene-set Information Retrieval System

Supervised by Prof. Jiawei Han • Aug. 2016 — Present

  • Propose the OR-pAND algorithm to re-rank the retrieval output for queries containing a set of multiple genes.
  • Parse the PMC dataset (over 100 GB) and implement the OR-pAND algorithm using Whoosh and Elasticsearch.

Overlapping Community Detection in Text Network

Supervised by Prof. Xinbing Wang • Feb. 2016 — Jul. 2016

  • Generated 32 large text networks based on Microsoft Academic Graph and studied their community structures.
  • Proposed an affiliation graph model to capture community interactions and consider information from both link structures and node attributes.
  • Achieved a 40% improvement in the accuracy of detected communities on 17 real networks.

Topic-based Academic Information Retrieval System

Supervised by Prof. Xinbing Wang • Sep. 2015 — Jun. 2016

  • Managed a group of 7 people developing a topic-based search engine. Refactored and configured Solr, an open-source enterprise search platform.
  • Returned paper search results based on both word-level and topic-level similarity to the user’s query.
  • Ranked papers according to their influence scores as well as their relevance to the query.
  • Visualized the topic distribution of each paper and topic evolution across the whole corpus.

Modeling Academic Influence in Scientific Literatures

Supervised by Prof. Xinbing Wang • Apr. 2015 — Sep. 2015

  • Devised a generative model named Reference Topic Model (RefTM) to utilize both the textual content and citation information in scientific literature.
  • Proposed a fast inference algorithm based on collapsed Gibbs sampling to learn RefTM effectively.
  • Introduced a quantitative metric named J-Index to model academic influence in scientific literature.
  • Designed experiments on a collection of over 420,000 research papers to validate the effectiveness of J-Index.

Exponential Interest Aggregation in Named Data Networking

Supervised by Prof. Weijia Jia • Apr. 2015 — Jul. 2015

  • Proposed Exponential Interest Aggregation (EIA), an adaptive forwarding strategy addressing the hop-by-hop congestion control problem in Named Data Networking (NDN).
  • Established the Interest aggregation state transition framework in NDN, and analyzed the effectiveness of the EIA algorithm mathematically under this framework.
  • Conducted simulations to evaluate the performance of EIA, and showed that EIA improved average delay by 13%, average number of retransmissions by 25%, and cache hit ratio by 61%.

Projects

Accent Classification of English Speakers

This work was advised by Prof. Kai Yu.

  • Solved the accent classification problem in four steps: word segmentation, feature extraction, clip classification, and recording classification.
  • Built a system with Mel-Frequency Cepstral Coefficients (MFCC) as the feature vector and logistic regression as the classification method, and achieved an overall classification accuracy of 97%.

PM2.5 Concentration Prediction using Time Series based Data Mining

This work was advised by Prof. Bo Yuan.

  • Formalized the PM2.5 prediction problem, an important issue in the control and reduction of pollutants in the air.
  • Applied three methods to the PM2.5 prediction problem: the AutoRegressive-Moving-Average (ARMA) model, the Stochastic Volatility (SV) model, and the Stock-Watson (SW) model.
  • Designed a novel model combining the Stock-Watson model with a time series neural network to achieve better prediction accuracy for PM2.5 concentration in the next 6 hours.

A Map-Generating and Speed-Optimizing Driving System

This work was advised by Prof. Xinbing Wang.

  • Designed and implemented a traffic signal schedule inference model to estimate the duration of traffic lights.
  • Integrated this model into a speed-optimizing driving system named CityDrive, which reduced the total waiting time per vehicle by 98.8% in simulation and saved 58.8% of kinetic energy in real tests.

Static Website Construction

  • Designed and constructed the IEEE Honor Class home page, using Bootstrap 3, plain HTML/CSS, and JavaScript.
  • Developed the Shanghai Jiao Tong University president's home page (Chinese version), based on Joomla! CMS.
  • Created several static personal home pages and static blogs, using GitHub Pages combined with Jekyll.

Selected Awards

  • Brian Totty Graduate Fellowship • 2016 -- 2017
  • Tang Lixin Scholarship (Top 1%) • 2016 -- 2021
  • Chuntsung Scholarship (Top 1%) • 2015 -- 2016
  • National Scholarship (Top 1%) • 2013 -- 2014 & 2014 -- 2015
  • Academic Excellence Scholarship (Type A) of SJTU (Top 1%) • 2013 -- 2014 & 2014 -- 2015
  • Arawana Scholarship (Top 5%) • 2012 -- 2013
  • “Merit Student” of SJTU • 2012 -- 2013 & 2013 -- 2014
  • Honorable Mention in the Interdisciplinary Contest in Modeling • 2015
  • Second Prize in China Undergraduate Mathematical Contest in Modeling (Shanghai Division) • 2014
  • Third Prize in National Undergraduate Physics Competition (Shanghai Division) • 2013

Selected Extra-curricular Activities

  • Leader of IEEE Honor Class • 2012 -- 2016
  • Personal Tutor for the Mathematical Analysis Course • 2012 -- 2013
  • Teaching Assistant for the Programming and Data Structures Course • 2013 -- 2014
  • Volunteer for Shanghai International Marathon • 2013 -- 2014
  • Member of the Student Union of the School of Electronic Information and Electrical Engineering • 2013 -- 2015

Skills

English

  • TOEFL 109 (Reading: 30, Listening: 29, Speaking: 26, Writing: 24)
  • GRE 328 (Verbal: 158, Quantitative: 170, Analytical Writing: 3.5)
  • Passed the Advanced-Level English Interpretation Accreditation Examination

Computer Languages

  • Python, R, Mathematica, C++, MATLAB, HTML5 & CSS3, JavaScript

Tools

  • Git, LaTeX, Vim, Linux, Keynote, MS Office, OmniGraffle