CS701/IS860 - High Performance Computing, July-Dec, 2016

Welcome to the CS701/IS860 - High Performance Computing course page.

Course Syllabus

  • Instruction Level Parallelism: Pipelining, Hazards, Compiler techniques for ILP, Branch prediction, Static and Dynamic Scheduling, Speculation, Limits of ILP.
  • Multicore Memory Hierarchy: Cache tradeoffs, Basic and Advanced optimizations, Virtual Memory, DRAM optimizations.
  • Multiprocessors: Symmetric and Distributed architectures, Cache coherence protocols (Snoopy and Directory based), ISA support for Synchronization, Memory Consistency Models.
  • Interconnection Networks: Architectures, Topologies, Performance, Routing, Flow control, Future of NoCs.
  • VLSI: Transistor Theory, Moore's Law, Delay, Power, Energy, and Temperature dependence in integrated circuits.

Reference Materials

Reference Books/Textbooks:

  • [HP5e] John Hennessy and David Patterson. Computer Architecture - A Quantitative Approach. 5th ed. Morgan Kaufmann.
  • John P. Shen and Mikko H. Lipasti. Modern Processor Design - Fundamentals of Superscalar Processors. Tata McGraw Hill.
  • William J Dally and Brian Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann. 2004.
  • [SLoCA] Mark Hill and Margaret Martonosi (eds.). Synthesis Lectures on Computer Architecture. Morgan and Claypool. 2006-2016.
  • Important publications in Computer Architecture.

Course Evaluation

Course components: Qtorials - 20%, Programming assignments - 35%, Midsem and Endsem examinations - 45%.

List of Papers for Paper Reading Sessions

Use this paper summary submission template, and its class file.

Paper Details
1 Hamerly, Perelman, Lau, Calder, Sherwood, Using Machine Learning to Guide Architecture Simulation, J. of Machine Learning Research, 2006.
2 Wunderlich, Wenisch, Falsafi, and Hoe, SMARTS: Accelerating Microarchitecture Simulation via Rigorous Statistical Sampling, ISCA, 2002, or TurboSMARTS: Accurate Microarchitecture Simulation Sampling in Minutes.
3 Qureshi et al., Adaptive Insertion Policies for High Performance Caching, ISCA, 2007.
4 Jouppi, Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers, 1990.
5 Rotenberg et al., Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching, MICRO, 1996.
6 Hadi Esmaeilzadeh, Emily Blem, Renee St. Amant, Karthikeyan Sankaralingam, and Doug Burger. 2011. Dark silicon and the end of multicore scaling. In Proceedings of the 38th Annual International Symposium on Computer Architecture (ISCA '11). ACM, New York, NY, USA, 365-376.
7 Yakun Sophia Shao, Brandon Reagen, Gu-Yeon Wei, and David Brooks. 2014. Aladdin: a Pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures. In Proceedings of the 41st Annual International Symposium on Computer Architecture (ISCA '14). IEEE Press, Piscataway, NJ, USA, 97-108.
8 William J. Dally and Brian Towles. 2001. Route packets, not wires: on-chip interconnection networks. In Proceedings of the 38th Annual Design Automation Conference (DAC '01). ACM, New York, NY, USA, 684-689.

Assignments/Lab Work

Sl. No. Assignment Submission Date
1 A1 - MIPS/x86 Assembly Language Programming. August 8, 11:59 PM.
2 A2 - Programming Assignment. August 19, Midnight.
3 A3 - Simulation Assignment. August 31, Midnight.
4 A4 - Scheduling, Loop unrolling Assignment. September 26, Midnight.
5 A5 - SIMD-AVX, OpenMP, MPI Programming Assignment. October 3, Midnight.
6 A6 - Memory Hierarchy Assignment. October 24, Midnight.

Course Schedule

Date/Week Type
Week 1 Lecture Class Zero.
Announcement MIPS and x86 Assembly Language programming assignment - A1 out.
Lecture Technology Trends - Moore's Law, Power trends.
Reading 1. Chapter 1, HP5e.
2. Kaxiras and Martonosi, Computer Architecture Techniques for Power-Efficiency, SLoCA#4. Chapters 1 and 2.
Week 2 Tutorial Tutorial 1 - Questions
Lecture Technology Trends - Reliability, Performance Quantification and Simulation.
Paper discussion Orion, MICRO-35, 2002.
Week 3 Tutorial Tutorial 2 - Questions
Announcement August 8, Midnight: A1 Deadline. Programming assignment - A2 is out.
Lecture Pipelining - Data dependences
Reading Appendices A and C. HP5e.
Week 4 Lecture Pipelining - Control dependences, Branch Prediction.
Reading 1. Chapter 3 and Appendix C. HP5e.
2. S. McFarling. Combining Branch Predictors, Tech. Note TN-36, DEC WRL, 1993.
3. T.Y. Yeh and Y.N. Patt. Alternative Implementations of Two-Level Adaptive Branch Prediction, ISCA, 1992.
Lecture Exceptions
Week 5 Lecture Dynamic Scheduling
Reading 1. Chapter 3, Appendices A, C, and H. HP5e.
2. Smith and Sohi, The Microarchitecture of Superscalar Processors, Proc. of the IEEE, 1995.
Tutorial Tutorial 3 - Questions
Announcement A3 deadline: Aug 31. Assignments A4 and A5 out.
Week 5/6 Lecture Multiprocessors - SMP, Distributed Multiprocessors, Programming models, Snooping and Directory Coherence Protocols
Tutorial Tutorial 4 - Questions
Tutorial Tutorial 5 - Questions
Tutorial Tutorial 6 - Questions
Week 7 Midsem Exam Sept 9, 3:30 PM.
Week 8 Lecture Multiprocessors - Implementation of Locks.
Reading 1. Section 5.5. HP5e.
2. Sorin, Hill and Wood, A Primer on Memory Consistency and Cache Coherence, SLoCA#12. Chapters 1, 2, 6, 7 and 8.
Week 9 Tutorial Tutorial 7: Qt7 - Questions
Lecture Memory Hierarchy - Caches.
Reading Appendix B, HP5e.
Week 10 Tutorial Tutorial 8: Qt8 - Questions
Lecture Memory Hierarchy - Virtual Memory.
Reading Appendix B, HP5e.
Week 11 Tutorial Tutorial 9: Qt9 - Questions
Lecture Memory Hierarchy - Virtual Memory.
Reading Appendix B, HP5e.
Week 12 Lecture Paper Discussion Summary.
Week 13 Dasara - No classes.
Week 14 Paper Discussion Sourabh Jain: Jouppi, Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers, 1990.
Week 14 Paper Discussion Tejeshwara: J. E. Smith and G. S. Sohi, "The microarchitecture of superscalar processors," in Proceedings of the IEEE, vol. 83, no. 12, pp. 1609-1624, Dec 1995.
Week 15 Paper Discussion Arun Raveendran: Qureshi et al., Adaptive Insertion Policies for High Performance Caching, ISCA, 2007.
Week 15 Paper Discussion Bhaskar Gautham: Hamerly, Perelman, Lau, Calder, Sherwood, Using Machine Learning to Guide Architecture Simulation, J. of Machine Learning Research, 2006.
