Scaling Graph Neural Networks to 1000s of GPUs

Project Description

Graph Neural Networks (GNNs) are a class of deep neural networks that operate on graphs rather than on more traditional inputs such as images. GNNs are used in a variety of applications, including recommendation systems, social networks, computer security, and biological networks. The common characteristic is that the graphs tend to be large and complex; therefore, both training and inference require significant processing power. The goal of this project is to scale GNN training to thousands of GPUs. We will target our new supercomputer, Shaheen III, which is projected to include 2800 Nvidia Hopper super-chips that combine a CPU with an H100 GPU (https://www.nextplatform.com/2022/09/26/kaust-hpe-shaheen-iii-supercomputer). We will use the latest frameworks, such as Microsoft DeepSpeed, and we will target very large graphs.