CS203: Advanced Computer Architecture (2022 Fall)

Lecture: MW 9:30a – 10:50a

Where: UCR Campus Student Success Center | Room 329

Instructor

Hung-Wei Tseng
email: htseng @ ucr.edu
Office Hours: TuW 11a-12p @ WCH 406 & Zoom (please find the link in eLearn)

Teaching Assistant

Jinyoung Choi
e-mail: jchoi264 @ ucr.edu
Office Hours: M 3p-4p, Th 2p-3p.

Other important links

Course Overview

This course will describe the basics of modern processor operation and techniques to optimize your applications. Topics include computer system performance, instruction set architectures, pipelining, branch prediction, memory-hierarchy design, and a brief introduction to multiprocessor architecture issues.

Text books

Required: John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach, 6th Edition, Morgan Kaufmann.
Required: Assigned research papers and other readings throughout the quarter.

Many research papers require a campus network connection to download. For instructions on connecting to the UCR VPN and downloading research papers, please visit https://library.ucr.edu/using-the-library/technology-equipment/connect-from-off-campus

Grading

  • Assignments 25%
    • Assignments will be assigned throughout the course. We will drop your lowest one.
    • Our assignments will be based on Jupyterhub/Jupyter notebooks
  • Peer instruction 5%
    • This class uses “peer instruction”. We REQUIRE each of you to download the Poll Everywhere app or log in to its website during class.
    • You must log in with UCRNetID@ucr.edu. If you do not log in correctly, you will not receive credit.
    • You need to answer at least 50% of the poll questions to receive full credit for class participation.
  • Reading Quizzes 15%
    • We will have reading quizzes on eLearn!
    • 2 attempts; we take the average.
  • Midterm 20%
  • Final 35%
    The final will be cumulative.
  • Additional notes about grades in this course
    • Your scores will be available on eLearn. Your final grade is the weighted average of these grades.
      We do our best to record grades accurately, but you should double-check.
    • Late submissions: We do not accept any late submissions, including quizzes, assignments, and projects.
    • Errors in grading: If you feel there has been an error in how an assignment or test was graded, you have one week from when the assignment is returned to bring it to our attention. You must submit (via email to the instructor and the appropriate TAs) a written description of the problem. Neither I nor the TAs will discuss regrades without receiving an email from you about it first. For arithmetic errors (adding up points, etc.) you do not need to submit anything in writing, but the one-week limit still applies.
    • For the midterm and final: We do not regrade a single problem; we regrade your whole test. The one-week regrading window still applies.
    • Final grades: If you have a problem with your final grade in the course, send me an email and we can set up an appointment to discuss it.
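
As a rough illustration of how the weights above combine, the following is a minimal sketch only, not the official grade calculation. It assumes every score is on a 0-100 scale and that each list is non-empty. The function names are hypothetical. The linear scaling of participation below the 50% threshold is an assumption, since the policy above only says when full credit is earned.

```python
# Hypothetical sketch of the weighted-average grade described above.
# Assumes all scores are on a 0-100 scale; names are illustrative only.

WEIGHTS = {
    "assignments": 0.25,
    "peer_instruction": 0.05,
    "reading_quizzes": 0.15,
    "midterm": 0.20,
    "final": 0.35,
}


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def assignment_score(scores):
    """Average assignment score after dropping the single lowest one."""
    if len(scores) < 2:
        return mean(scores)
    return mean(sorted(scores)[1:])  # sorted ascending, so [1:] drops the lowest


def quiz_score(attempts_per_quiz):
    """Each quiz counts as the average of its attempts, then quizzes are averaged."""
    return mean([mean(attempts) for attempts in attempts_per_quiz])


def participation_score(answered, total_polls):
    """Full credit once 50% of the poll questions are answered.

    Assumption: below 50%, credit scales linearly. The syllabus does not
    say how partial participation is scored.
    """
    if total_polls == 0:
        return 100.0
    return min(1.0, (answered / total_polls) / 0.5) * 100.0


def course_grade(assignments, quizzes, answered, total_polls, midterm, final):
    """Weighted average of the five grade components."""
    parts = {
        "assignments": assignment_score(assignments),
        "peer_instruction": participation_score(answered, total_polls),
        "reading_quizzes": quiz_score(quizzes),
        "midterm": midterm,
        "final": final,
    }
    return sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)
```

For example, `course_grade([100, 80, 90, 70, 60], [(80, 100), (90, 90)], 30, 50, 80, 90)` drops the assignment score of 60, gives full participation credit (30 of 50 polls answered), and works out to 87.25 under these assumptions.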

Schedule and Slides


Date | Topic | Reading | Slides (Preview) | Slides (Release) | Due | Note
9/26/2022 - Introduction
– G.E. Moore. Cramming More Components Onto Integrated Circuits. Electronics, pp. 114–117, April 19, 1965.

– Chapter 1.1-1.6

– John L. Hennessy and David A. Patterson. 2019. A new golden age for computer architecture. Commun. ACM 62, 2

(1) CS203: The first step toward “performance programming”

Demo

Assignment #0 released
9/28/2022 - Performance Evaluation (I)
– Chapter 1.3 & 1.8-1.9
Performance (Preview)
(2) Performance (I): How good is “good”?

Demo
Reading Quiz #1
Assignment #1 released
10/3/2022 - Cancelled

10/6/2022 - Performance Evaluation (II)
– M. D. Hill and M. R. Marty. Amdahl’s Law in the Multicore Era. In Computer, vol. 41, no. 7, pp. 33-38, July 2008.
– V. Sze, Y.-H. Chen, T.-J. Yang and J. S. Emer. How to Evaluate Deep Neural Network Processors: TOPS/W (Alone) Considered Harmful. In IEEE Solid-State Circuits Magazine, vol. 12, no. 3, pp. 28-41, Summer 2020.
– (Optional) Andrew Davison, Twelve Ways to Fool the Masses When Giving Performance Results on Parallel Computers. In Humour the Computer, MITP, 1995, pp.

(3) Performance (II): One thing right.

Demo
Reading Quiz #2
10/10/2022 - Performance Evaluation (III)
(4) Performance (III): Great Pretender

Demo
Assignment #0 Due 10/7
10/12/2022 - Memory Hierarchy (1): The Basics
– Appendix B.1-B.3
Memory Hierarchy (Preview)
(5) Memory: (1) Memories bring back you…

Demo
Reading Quiz #3
10/17/2022 - Memory Hierarchy (2)
– Appendix B.1-B.3, 2.3
(6) Memory: (2) The A, B, C,s of the cache

Demo
Assignment #1 Due 10/14
10/19/2022 - Memory Hierarchy (3): Optimizing Cache Performance Applications
– Chapter 2.3

– Norman P. Jouppi. 1990. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. SIGARCH Comput. Archit. News 18, 2SI (June 1990), 364–373.

(7) Memory (3): The root causes and remedies.

Demo
Reading Quiz #4

Assignment #2 Released
10/24/2022 - Memory Hierarchy (4): Programmer’s optimizations
(8) Memory (4): Cache misses and their remedies — the software version

Demo

10/26/2022 - Virtual Memory
– Appendix B.4 & B.5, Chapter 2.4
Virtual Memory (Preview)
(9) Virtual Memory: Just an illusion.

Demo
Assignment #2 Due 10/28
10/31/2022 - Midterm
11/02/2022 - Basic Processor Design & Branch Prediction
– Chapter 3.3

– Appendix C.1 – C.4

x86-64 Machine-Level Programming
Basic Processor Design (Preview)
(10) Basic Processor Design: (1) In the pipeline

Demo
Reading Quiz #5
11/07/2022 - Advanced Branch Prediction
– M. Evers, S. J. Patel, R. S. Chappell and Y. N. Patt, “An analysis of correlation and predictability: what makes two-level branch predictors work,” Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235), Barcelona, Spain, 1998, pp. 52-61.

– James E. Smith. Retrospective: a study of branch prediction strategies. ISCA ’98: 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 22-23

– Jiménez, Daniel A., and Calvin Lin. “Dynamic branch prediction with perceptrons.” Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture. IEEE, 2001.

– André Seznec and P. Michaud. A case for (partially) TAgged GEometric history length branch prediction. Journal of Instruction Level Parallelism. June 2006.
(11) Branch Prediction: I guess I just feel like

Demo
Reading Quiz #6

Assignment #3 released

11/9/2022 - OOO Scheduling
– Chapter 3.4

– D. Suggs, M. Subramony and D. Bouvier, “The AMD “Zen 2” Processor,” in IEEE Micro, vol. 40, no. 2, pp. 45-52, 1 March-April 2020, doi: 10.1109/MM.2020.2974217.

– K. C. Yeager, “The MIPS R10000 superscalar microprocessor,” in IEEE Micro, vol. 16, no. 2, pp. 28-41, April 1996.
(12) Advanced Branch Prediction (2): I just can’t wait.

Demo
Reading Quiz #7
11/14/2022 - OOO Scheduling
– D. Suggs, M. Subramony and D. Bouvier, “The AMD “Zen 2” Processor,” in IEEE Micro, vol. 40, no. 2, pp. 45-52, 1 March-April 2020, doi: 10.1109/MM.2020.2974217.
– (Optional) M. Arafa et al., “Cascade Lake: Next Generation Intel Xeon Scalable Processor,” in IEEE Micro, vol. 39, no. 2, pp. 29-36, 1 March-April 2019, doi: 10.1109/MM.2019.2899330.
– (Optional) J. Doweck et al.. Inside 6th-Generation Intel Core: New Microarchitecture Code-Named Skylake. in IEEE Micro, vol. 37, no. 2, pp. 52-62, Mar.-Apr. 2017, doi: 10.1109/MM.2017.38.

Dynamic Instruction Scheduling (Preview)
(13) Dynamic Instruction Scheduling: CPUtopia
11/16/2022 - OOO Scheduling
(14) Programming on Modern Processors

Demo
Assignment #3 Due 11/18
11/21/2022 - SMT & Chip Multiprocessors
– Chapter 3.11

Exploiting choice: instruction fetch and issue on an implementable simultaneous multithreading processor, Dean M. Tullsen, Susan J. Eggers, Joel S. Emer, Henry M. Levy, Jack L. Lo, and Rebecca L. Stamm, ISCA ’96: Proceedings of the 23rd annual international symposium on Computer architecture, New York, NY, USA, 1996, pages 191-202.

– Chapter 5.1 – 5.3, 5.5 & 5.6

The case for a single-chip multiprocessor, Kunle Olukotun, Basem A. Nayfeh, Lance Hammond, Ken Wilson, and Kunyung Chang, SIGPLAN Not. 31(9):2-11, 1996.
(15) Programming on Modern Processors (2) and Parallel Architectures (1)

Demo
Reading Quiz #8
11/23/2022 - Chip Multiprocessors & Modern Processors

(16) Parallel Programming

Demo
Assignment #4 released
11/28/2022 - Dark Silicon
– Chapter 1.7

– H. Esmaeilzadeh, E. Blem, R. St. Amant, K. Sankaralingam and D. Burger. Dark Silicon and the End of Multicore Scaling. In IEEE Micro, vol. 32, no. 3, pp. 122-134, May-June 2012.

– Rakesh Kumar, Keith Farkas, Norm P. Jouppi, Partha Ranganathan, Dean M. Tullsen. Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction. In 36th International Symposium on Microarchitecture, December, 2003.
(17) Parallel programming and dark silicon

Demo

Reading Quiz #9

11/30/2022 - TPU, FPGA
– Adrian M. Caulfield, Eric S. Chung, Andrew Putnam, Hari Angepat, Jeremy Fowers, Michael Haselman, Stephen Heil, Matt Humphrey, Puneet Kaur, Joo-Young Kim, Daniel Lo, Todd Massengill, Kalin Ovtcharov, Michael Papamichael, Lisa Woods, Sitaram Lanka, Derek Chiou, and Doug Burger. 2016. A cloud-scale acceleration architecture. In The 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-49).

– Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, John Hu, Robert Hundt, Dan Hurt, Julian Ibarz, Aaron Jaffey, Alek Jaworski, Alexander Kaplan, Harshit Khaitan, Daniel Killebrew, Andy Koch, Naveen Kumar, Steve Lacy, James Laudon, James Law, Diemthu Le, Chris Leary, Zhuyuan Liu, Kyle Lucke, Alan Lundin, Gordon MacKean, Adriana Maggiore, Maire Mahony, Kieran Miller, Rahul Nagarajan, Ravi Narayanaswami, Ray Ni, Kathy Nix, Thomas Norrie, Mark Omernick, Narayana Penukonda, Andy Phelps, Jonathan Ross, Matt Ross, Amir Salek, Emad Samadiani, Chris Severn, Gregory Sizikov, Matthew Snelham, Jed Souter, Dan Steinberg, Andy Swing, Mercedes Tan, Gregory Thorson, Bo Tian, Horia Toma, Erick Tuttle, Vijay Vasudevan, Richard Walter, Walter Wang, Eric Wilcox, and Doe Hyun Yoon. 2017. In-Datacenter Performance Analysis of a Tensor Processing Unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA ’17). Association for Computing Machinery, New York, NY, USA, 1–12. DOI:https://doi.org/10.1145/3079856.3080246

– John L. Hennessy and David A. Patterson. 2019. A new golden age for computer architecture. Commun. ACM 62, 2

(18) Golden Age of Computer Architectures

Demo
Assignment #4 Due 12/2
12/5/2022 - Final Exam
12a-11:59p, any 180-minute slot you pick. Please find the link on Gradescope.



Assignments

Assignment #0
Assignment #1
Assignment #2
Assignment #3
Assignment #4

Project

There is no project for CS203

Integrity Policy

  • Cheating WILL be taken seriously. Doing otherwise is not fair to honest students. It is also not fair to allow the cheater to think that it is a reasonable alternative in life.
  • Please review the UCR student handbook for more details on Academic Integrity.
  • Anyone copying information or having information copied during a test will receive an F for the class and will not be allowed to drop. They will be reported to their college dean. If you can prove non-cooperative copying took place, your grade may be restored, but you must prove it to the dean–I don’t want to be involved. Anyone caught cheating or falsely representing the work of others on the homework will not be allowed to turn in further homework. Your grade will be based exclusively on the tests with a penalty of 25% OR GREATER applied.
  • We photocopy a random sampling of the exams in order to ensure that students do not modify their tests after they have been returned.
  • Online solutions, etc.: A solutions manual exists for this text. Using it, or any solutions you may find on the internet elsewhere IS CHEATING and will be dealt with accordingly. We know what the solution manual solutions look like. Homework is a small fraction of your grade, so cheating on it is unproductive.

Public Health Regulation

  • Masks are always required in the classroom.
  • If you have any symptom of COVID-19 or seasonal flu, please stay at home.