Graph Neural Networks for Science and Engineering


Project Description

Much of the data in scientific and engineering applications comes with a natural graph structure: molecules in Chemistry, proteins in Biology, particles in Physics, planets in Astronomy, bulk and MOF materials in Materials Science, social and citation networks in Data Science, point clouds and meshes in Computer Vision and Graphics, and so on. To model objects with such complex structures, Graph Machine Learning, and Graph Neural Networks (GNNs) in particular, have proven to be among the most promising tools. GNNs are deep learning architectures that can be trained to represent graphs with node features and edge features. For example, a molecule can be represented by a graph in which each atom is a node and each bond is an edge. The atomic numbers and the types of chemical bonds are the associated node and edge features, respectively. A GNN can be trained on density functional theory (DFT) datasets to predict the quantum properties of molecules, which has huge potential to advance scientific discovery.
Our group at IVUL has developed methods for graphs in 3D vision, videos, data mining, and fundamental science. We have developed GNNs with more than 100 layers in DeepGCNs (ICCV’2019, TPAMI’2021), and PU-GCN (CVPR’2021), for 3D point cloud segmentation and generation; G-TAD (CVPR'2020), VLG-Net (ICCVW’2021), MAAS (ICCV’2021) and VSGN (ICCV’2021) for large-scale video understanding; and DeeperGCN (arXiv’2020), 1000-layer GNN (ICML’2021) and FLAG (CVPR'2022) for node-, link- and graph-level property prediction on Open Graph Benchmark (OGB) datasets, whose graphs span the nature, society and information domains.
Are you excited about working on complex graph-structured data to make advances in biology, chemistry, physics, computer science, and so on? Would you like to use artificial intelligence to make fast predictions about the 3D structure of molecules, thereby speeding up the drug discovery process? Are you motivated by applications to precision medicine, and would you like to create AI that learns to recommend which specific drug is suitable for a particular patient? Or perhaps you are more interested in higher-level abstractions, and would like to build an AI-based partial differential equation solver? All of these complex problems can be modeled through graph-structured data, and research in Graph Neural Networks can bring us closer to solving them.
GNNs have untapped potential for tackling graph-based problems in Science and Engineering. However, more work is needed to explore the challenges unique to each scientific domain. In this project, you will have the chance to learn how to build large-scale graph neural networks and apply them to scientific and engineering applications.
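As a rough illustration of the molecule-as-graph idea described above, the following minimal Python sketch encodes formaldehyde (H2C=O) with atomic numbers as node features and bond orders as edge features, then runs one round of neighbor aggregation. The names MoleculeGraph and message_passing_step are hypothetical and chosen only for this example. The aggregation rule (a bond-order-weighted sum with no learnable weights) is a simplifying assumption and is not the architecture of any of the models named above. A trained GNN would replace it with learned transformations.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MoleculeGraph:
    """Hypothetical container: atoms are nodes, bonds are edges."""
    atomic_numbers: List[int]            # node features, one per atom
    bonds: List[Tuple[int, int, int]]    # (atom_i, atom_j, bond_order)

    def neighbors(self, i: int) -> List[Tuple[int, int]]:
        """Return (neighbor_index, bond_order) pairs for atom i."""
        out = []
        for a, b, order in self.bonds:
            if a == i:
                out.append((b, order))
            elif b == i:
                out.append((a, order))
        return out


def message_passing_step(graph: MoleculeGraph, h: List[float]) -> List[float]:
    """One round of sum aggregation (assumed rule, no learned weights):
    each atom adds its neighbors' features, weighted by bond order,
    to its own feature."""
    new_h = []
    for i, h_i in enumerate(h):
        message = sum(order * h[j] for j, order in graph.neighbors(i))
        new_h.append(h_i + message)
    return new_h


# Formaldehyde: atom 0 = C, atom 1 = O, atoms 2 and 3 = H.
formaldehyde = MoleculeGraph(
    atomic_numbers=[6, 8, 1, 1],
    bonds=[(0, 1, 2), (0, 2, 1), (0, 3, 1)],  # C=O double, two C-H single
)

h0 = [float(z) for z in formaldehyde.atomic_numbers]  # initial node states
h1 = message_passing_step(formaldehyde, h0)           # states after one round
graph_embedding = sum(h1)                             # simple sum readout

print(h1)               # [24.0, 20.0, 7.0, 7.0]
print(graph_embedding)  # 58.0
```

After one round, each atom's state already reflects its bonded neighbors. Stacking more rounds lets information travel further across the molecule, and the final readout gives one value per graph that a model could regress against a target property.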
Program - Computer Science
Division - Computer, Electrical and Mathematical Sciences and Engineering
Center Affiliation - Visual Computing Center
Field of Study - machine learning, AI for science

About the

Bernard Ghanem

Professor, Electrical and Computer Engineering

Professor Ghanem's research interests focus on topics in computer vision, machine learning, and image processing. They include:
  • Modeling dynamic objects in video sequences to improve motion segmentation, video compression, video registration, motion estimation, and activity recognition.
  • Developing efficient optimization and randomization techniques for large-scale computer vision and machine learning problems.
  • Exploring novel means of involving human judgment to develop more effective and perceptually-relevant recognition and compression techniques.
  • Developing frameworks for joint representation and classification by exploiting data sparsity and low-rankness.

Desired Project Deliverables

(i) Identifying the grand challenges of graph-based problems in Science and Engineering; (ii) Collecting or processing the desired data into graph formats; (iii) Proposing novel GNN architectures and training techniques to tackle the challenges of learning on these graph data; (iv) Training and evaluating the proposed methods on specific metrics; (v) Producing well-performing, reproducible results and releasing a modular, reusable codebase to the research community.