Large-scale scholarly data mining

Internship Description

Delve is an ongoing project in my group. It is a web-based dataset retrieval and document analysis system. Unlike traditional academic search engines (e.g., Google Scholar) and dataset repositories (e.g., the UCI repository), Delve is dataset-driven and supports retrieving datasets based on their suitability for, or usage in, a given field. It also visualizes dataset and document citation relationships, and enables users to analyze a scientific document by uploading its full PDF.

Deliverables/Expectations

The internship position is for candidates who can contribute to the system by applying machine learning and data mining techniques to the analysis of document text, citation graphs, and co-author graphs. Deep learning, graph embedding, and graph mining techniques should be explored to improve search accuracy in the system.

Faculty Name

Xiangliang Zhang

Field of Study

Machine learning, data mining