Scaling CUDA C++ Applications to Multiple Nodes (SCCAMN) – Outline

Detailed Course Outline

Introduction

  • Meet the instructor.
  • Create an account at courses.nvidia.com/join.

Multi-GPU Programming Paradigms

  • Survey several techniques for programming multi-GPU CUDA C++ applications, using a CUDA C++ Monte Carlo approximation of pi as the running example.
  • Use CUDA to distribute work across multiple GPUs.
  • Learn how to enable and use direct peer-to-peer memory communication between GPUs (see the peer-to-peer sketch after this list).
  • Write an SPMD version of the program with CUDA-aware MPI (see the MPI sketch after this list).
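
The bullets above rely on a few concrete CUDA runtime calls. Below is a minimal sketch of the peer-to-peer technique, assuming two GPUs that report mutual peer access; the buffer size and device IDs are illustrative and not taken from the course materials.

    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
        const size_t N = 1 << 20;

        // Check whether GPU 0 can directly access GPU 1's memory.
        int canAccess = 0;
        cudaDeviceCanAccessPeer(&canAccess, 0, 1);
        if (!canAccess) { printf("Peer access not supported on this system\n"); return 0; }

        float *buf0, *buf1;
        cudaSetDevice(0);
        cudaMalloc(&buf0, N * sizeof(float));
        cudaDeviceEnablePeerAccess(1, 0);   // allow GPU 0 to access GPU 1

        cudaSetDevice(1);
        cudaMalloc(&buf1, N * sizeof(float));
        cudaDeviceEnablePeerAccess(0, 0);   // allow GPU 1 to access GPU 0

        // Direct device-to-device copy, with no staging through host memory.
        cudaMemcpyPeer(buf1, 1, buf0, 0, N * sizeof(float));
        cudaDeviceSynchronize();

        cudaFree(buf1);
        cudaSetDevice(0);
        cudaFree(buf0);
        return 0;
    }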

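The SPMD bullet pairs each MPI rank with one GPU and passes device pointers directly to MPI calls. Below is a minimal sketch, assuming an MPI library built with CUDA awareness; the ring-exchange pattern and buffer size are illustrative.

    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        // One GPU per rank (a local-rank mapping would be used across nodes).
        int numDevices = 0;
        cudaGetDeviceCount(&numDevices);
        cudaSetDevice(rank % numDevices);

        const int N = 1 << 20;
        float *sendBuf, *recvBuf;
        cudaMalloc(&sendBuf, N * sizeof(float));
        cudaMalloc(&recvBuf, N * sizeof(float));
        cudaMemset(sendBuf, 0, N * sizeof(float));

        // CUDA-aware MPI accepts device pointers directly: a simple ring exchange.
        int next = (rank + 1) % size, prev = (rank - 1 + size) % size;
        MPI_Sendrecv(sendBuf, N, MPI_FLOAT, next, 0,
                     recvBuf, N, MPI_FLOAT, prev, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        cudaFree(sendBuf);
        cudaFree(recvBuf);
        MPI_Finalize();
        return 0;
    }
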
Introduction to NVSHMEM

  • Learn how to write code with NVSHMEM and understand its symmetric memory model.
  • Use NVSHMEM to write SPMD code for multiple GPUs.
  • Use symmetric memory so that every GPU can directly access data residing on other GPUs.
  • Make GPU-initiated memory transfers (see the sketch after this list).
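
Below is a minimal sketch of the symmetric memory model and a GPU-initiated transfer, assuming NVSHMEM is installed and the program is started with an NVSHMEM-compatible launcher; the one-element payload and the PE-to-GPU mapping are illustrative.

    #include <nvshmem.h>
    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void write_to_neighbor(float *sym_buf, int my_pe, int n_pes) {
        int peer = (my_pe + 1) % n_pes;
        // GPU-initiated transfer: put one value into the peer's symmetric buffer.
        nvshmem_float_p(sym_buf, (float)my_pe, peer);
    }

    int main() {
        nvshmem_init();
        int my_pe = nvshmem_my_pe();
        int n_pes = nvshmem_n_pes();

        // Simple PE-to-GPU mapping (adequate for a single node).
        int ndev = 0;
        cudaGetDeviceCount(&ndev);
        cudaSetDevice(my_pe % ndev);

        // Symmetric allocation: every PE allocates the same buffer on the symmetric heap.
        float *sym_buf = (float *)nvshmem_malloc(sizeof(float));

        write_to_neighbor<<<1, 1>>>(sym_buf, my_pe, n_pes);
        cudaDeviceSynchronize();
        nvshmem_barrier_all();            // ensure all puts are visible

        float received;
        cudaMemcpy(&received, sym_buf, sizeof(float), cudaMemcpyDeviceToHost);
        printf("PE %d received %f from PE %d\n",
               my_pe, received, (my_pe - 1 + n_pes) % n_pes);

        nvshmem_free(sym_buf);
        nvshmem_finalize();
        return 0;
    }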

Halo Exchanges with NVSHMEM

  • Practice common coding motifs such as halo exchanges and domain decomposition using NVSHMEM (see the sketch after this list), and work on the assessment.
  • Write an NVSHMEM implementation of a Laplace equation Jacobi solver.
  • Refactor a single-GPU 1D wave equation solver to use NVSHMEM.
  • Complete the assessment and earn a certificate.
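
Below is a minimal sketch of a 1D halo exchange in the spirit of this module, assuming each PE holds its interior points plus two ghost cells; the array layout, sizes, and names are illustrative and not the course's reference solution.

    #include <nvshmem.h>
    #include <cuda_runtime.h>

    __global__ void halo_exchange(float *u, int n, int my_pe, int n_pes) {
        // u has n + 2 entries: u[0] and u[n + 1] are ghost cells.
        if (threadIdx.x == 0 && blockIdx.x == 0) {
            int left  = my_pe - 1;
            int right = my_pe + 1;
            // Push my left boundary into the left neighbor's right ghost cell.
            if (left >= 0)     nvshmem_float_p(&u[n + 1], u[1], left);
            // Push my right boundary into the right neighbor's left ghost cell.
            if (right < n_pes) nvshmem_float_p(&u[0], u[n], right);
        }
    }

    int main() {
        nvshmem_init();
        int my_pe = nvshmem_my_pe();
        int n_pes = nvshmem_n_pes();

        int ndev = 0;
        cudaGetDeviceCount(&ndev);
        cudaSetDevice(my_pe % ndev);

        const int n = 1024;   // interior points per PE
        float *u = (float *)nvshmem_malloc((n + 2) * sizeof(float));
        cudaMemset(u, 0, (n + 2) * sizeof(float));

        // One illustrative time step: update interior points, exchange halos,
        // then barrier so every PE's ghost cells are current before the next step.
        halo_exchange<<<1, 32>>>(u, n, my_pe, n_pes);
        cudaDeviceSynchronize();
        nvshmem_barrier_all();

        nvshmem_free(u);
        nvshmem_finalize();
        return 0;
    }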

Final Review

  • Learn about application tradeoffs on GPU clusters.
  • Review key learnings and answer questions.
  • Complete the workshop survey.