CS701 - High Performance Computing, July-Dec, 2017

Welcome to the CS701 -- High Performance Computing course page. Course-related information (objectives, outcomes, syllabus, and rubrics) can be found here.

Course Syllabus

  • Graphics Processing Units. GPU architecture. Thread hierarchy. GPU Memory Hierarchy. GPGPU Programming.
  • Many Integrated Cores. MIC, Xeon Phi architecture. Memory Hierarchy. Memory Bandwidth and performance considerations. Xeon Phi Programming.
  • Shared Memory Parallel Programming. Symmetric and Distributed architectures. OpenMP - Thread creation, Parallel regions. Worksharing, Synchronization.
  • Message Passing Interface - MPI Introduction. Collective communication. Data grouping for communication.
  • Important publications in PACT, IPDPS, and similar.

Reference Materials

Reference Books/Textbooks:

  • Wen-Mei W Hwu, David B Kirk, Programming Massively Parallel Processors: A Hands-on Approach, Morgan Kaufmann, 3e.
  • Rezaur Rahman, Intel Xeon Phi Coprocessor Architecture and Tools, Apress Open, 2013.
  • Barbara Chapman, Gabriele Jost, Ruud van der Pas, Using OpenMP, MIT Press, 2008.
  • Gropp, Lusk, Skjellum, Using MPI, MIT Press, 2014.
  • Recent publications in IPDPS, PACT, and similar.

Online Courses (A lot of online material is available. Go ahead and find ones that match your taste.)

Course Evaluation

Course components: Programming assignments, Programming tests, Course project, Midsem, Endsem examination. Your grade will rely heavily on your course project.

Assignments/Lab Work

Course Schedule

Date/Week Type Slides Class Slides
Week 0 Lecture Course Introduction. Course Introduction - Annotated.
Announcement Lab Work - A1 and A2 out.
Week 1 Lecture Introduction to Heterogeneous Parallel Computing. Introduction to Heterogeneous Parallel Computing - Annotated
Reading: J. Nickolls and W. J. Dally, "The GPU Computing Era," in IEEE Micro, vol. 30, no. 2, pp. 56-69, March-April 2010.
Lecture Introduction to CUDA C. Introduction to CUDA C - Annotated.
Week 2-3 Lecture Multidimensional Kernels - Picture Example Multidimensional Kernels - Picture Example - Annotated
Lecture Matrix Multiplication in CUDA Matrix Multiplication in CUDA - Annotated
Lecture GPU Memory GPU Memory - Annotated
Reading: D. Blythe, "Rise of the Graphics Processor," Proc. IEEE, 96 (5), 2008.
Week 4-5 Lecture Tiled Matrix Multiplication Tiled Matrix Multiplication - Annotated.
Week 6 Lecture Thread scheduling, Warps and Control Divergence Thread scheduling, Warps and Control Divergence - Annotated.
Week 7 Lecture DRAM, Memory Coalescing DRAM, Memory Coalescing - Annotated.
Lecture Convolution - Basic Implementation Convolution - Basic Implementation - Annotated.
Week 8 Lecture Tiled Convolution Implementations Tiled Convolution Implementations - Annotated.
Week 9 Midsem Exam
Week 10-11 Course Projects Announced
Week 11-12 Course Project - Proposal Phase I, Discussions
Week 12 Course Project Proposals Due
Week 13-14 Course Project - Phase II
Week 14 Mid-progress Report Due