CO460-High Performance Computing, Jan-May, 2017

Welcome to the CO460/CO482 High Performance Computing course page. Course-related information (objectives, outcomes, syllabus, and rubrics) can be found here.

Course Syllabus

  • Instruction Level Parallelism: Pipelining, Hazards, Compiler techniques for ILP, Branch prediction, Static and Dynamic Scheduling, Speculation, Limits of ILP.
  • Multicore Memory Hierarchy: Cache tradeoffs, Basic and Advanced optimizations, Virtual Memory, DRAM optimizations.
  • Multiprocessors: Symmetric and Distributed architectures, Cache coherence protocols (Snoopy and Directory based), ISA support for Synchronization, Memory Consistency Models.
  • Interconnection Networks: Architectures, Topologies, Performance, Routing, Flow control, Future of NoCs.
  • VLSI: Transistor Theory, Moore's Law; Delay, Power, Energy, and Temperature dependence in integrated circuits.

Reference Materials

Reference Books/Textbooks:

  • [HP5e] John Hennessy and David Patterson. Computer Architecture - A Quantitative Approach. 5ed. Morgan Kaufmann.
  • John P. Shen and Mikko H. Lipasti. Modern Processor Design - Fundamentals of Superscalar Processors. Tata McGraw Hill.
  • William J Dally and Brian Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann. 2004.
  • [SLoCA] Mark Hill/Margaret Martonosi (eds.). Synthesis Lectures on Computer Architecture, Morgan and Claypool, 2006 -- 2016.
  • Important publications in Computer Architecture.

Course Evaluation

Course components: Qtorials - 20%, Programming assignments - 15%, Course project (proposal, mid-evaluation, final report) - 20%, Midsem and Endsem examinations - 45%.
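As a rough illustration of how these weights combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale and that the 45% exam weight applies to a single combined exam score; the page does not state how that weight is split between Midsem and Endsem, nor the scoring scale. The names `WEIGHTS` and `final_score` and the example scores are hypothetical.

```python
# Weights (percent of the final grade) taken from the breakdown above.
# Assumption: the two exams are treated as one combined 45% component.
WEIGHTS = {
    "qtorials": 20,
    "programming_assignments": 15,
    "course_project": 20,
    "exams": 45,
}


def final_score(component_scores):
    """Combine per-component scores (each assumed to be on a 0-100 scale)
    into a weighted total on the same 0-100 scale."""
    assert sum(WEIGHTS.values()) == 100  # the listed weights cover the whole grade
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS) / 100


# Made-up scores for illustration only.
example = {
    "qtorials": 80,
    "programming_assignments": 90,
    "course_project": 70,
    "exams": 60,
}
print(final_score(example))  # prints 70.5
```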

Assignments/Course Project

Submit your input files, code, and screenshots together in a single archive, by email to co460.nitk@gmail.com.

Sl. No.  Assignment                                 Submission Date
1        A1 - Programming Assignment                Jan 11, 5PM
2        A2 - Simulation Assignment                 Jan 25, Midnight
3        Course Projects Information
         M1 - Project Proposal Deadline             Feb 28, Midnight
         M2 - Project Mid-progress Report Deadline  March 24, Midnight
         M3 - Project Final Report Deadline         April 17, Midnight

Course Schedule

Date/Week
Dec 29 Class Zero.
Technology Trends - Moore's Law, Power trends.
Reading: 1. Chapter 1, HP5e.
2. Kaxiras and Martonosi, Computer architecture techniques for Power-efficiency, SLoCA#4. Chapters 1 and 2.
Programming assignment - A1 out.
Jan 4 Tutorial 1: Qt1 - Questions
Jan 6 Technology Trends - Reliability, Performance Quantification and Simulation.
Reading: 1. Chapter 1, HP5e.
2. Sorin, Fault Tolerant Computer Architecture, SLoCA#5. Chapter 1.
3. Lieven Eeckhout, Computer Architecture Performance Evaluation Methods, SLoCA#10. Chapters 1, 2 and 5.
Jan 11, 5PM Programming assignment - A1 Deadline.
Jan 11 Tutorial 2: Qt2 - Questions
Jan 12 Pipelining - Hazards.
Reading: Appendices A and C. HP5e.
Simulation assignment - A2 is out.
Jan 19 ILP - Scheduling. Pipelining - Exceptions.
Reading: Chapter 3, Appendices A, C, and H. HP5e.
Jan 20 Tutorial 3: Qt3 - Questions
Jan 21 Tutorial 4: Qt4 - Questions
Jan 25 Course Projects Information
Jan 27 Control Dependences, Branch Prediction.
Reading: 1. Chapter 3 and Appendix C. HP5e.
2. S. McFarling. Combining Branch Predictors, Technical Note TN-36, DEC Western Research Laboratory, 1993.
3. Gonzalez, Latorre and Magklis, Processor Microarchitecture -- An Implementation Perspective, SLoCA#12. Chapter 3.
4. T.Y. Yeh and Y.N. Patt. Alternative Implementations of Two-Level Adaptive Branch Prediction, ISCA, 1992.
Feb 1 Tutorial 5: Qt5 - Questions
Feb 2 Out of order Processors, Dynamic Scheduling, Speculation.
Reading: 1. Chapter 3 and Appendix C. HP5e.
2. Gonzalez, Latorre and Magklis, Processor Microarchitecture -- An Implementation Perspective, SLoCA#12. Chapters 1, 6, 7 and 8.
Feb 8 Tutorial 6: Qt6 - Questions
Feb 9 Multiprocessors - Multithreading, SMP, Distributed Multiprocessors, Snooping and Directory Coherence Protocols.
Reading: 1. Sections 5.1 - 5.4. HP5e.
2. Sorin, Hill and Wood, A Primer on Memory Consistency and Cache Coherence, SLoCA#16. Chapters 1, 2, 6, 7 and 8.
Feb 18, 1:30PM - 3PM Midsem Exam
Feb 22 Multiprocessors - Implementation of Locks.
Reading: 1. Section 5.5. HP5e.
2. Sorin, Hill and Wood, A Primer on Memory Consistency and Cache Coherence, SLoCA#16. Chapters 1, 2, 6, 7 and 8.
Feb 25 Tutorial 7: Qt7 - Questions
Feb 27, 9:30AM - 12PM Parallel Graph Algorithms
Speaker: Dr. Rupesh Nasre, CSE, IIT Madras.
Feb 28 Project Proposal Due. Deadline: Midnight.
March 8 Memory Hierarchy - Caches.
Reading: 1. Appendix B, HP5e.
March 9 Tutorial 8: Qt8 - Questions
March 12, 8:30AM - 12:30PM Expert Lecture: Deep Neural Networks on Heterogeneous Parallel Processors (Abstract)
Speaker: Dr. Prakash Raghavendra, HP, Bangalore.
March 14 Cache Aware Programming.
March 16 Virtual Memory.
Reading: 1. Appendix B, HP5e.
March 21 Dynamic Random Access Memory.
Reading: 1. Appendix B, HP5e. 2. Chapter 2. HP5e.
March 22 Advanced Cache Optimizations.
Reading: 1. Chapter 2. HP5e.
March 23 Tutorial 9: Qt9 - Questions
March 28 Interconnection Networks.
Reading: 1. Appendix F, HP5e. 2. Natalie Enright Jerger and Li-Shiuan Peh, On-Chip Networks, SLoCA#8.
April 4 Tutorial 10: Qt10 - Questions
May 2 Final Exam