
CS203: Advanced Computer Architecture (2023 Spring)

Lecture: TuTh 9:30a – 10:50a

Where: OLMH 1208

Instructor

Hung-Wei Tseng
email: htseng @ ucr.edu
Office Hours: M 3p-4p and W 3:30p-4:30p @ WCH 406

Teaching Assistant

Kuan-Chieh Hsu
e-mail: khsu037 @ ucr.edu
Office Hours: TuTh 3:30p-4:30p and W 10:30a-11:30a.

Course Overview

This course will describe the basics of modern processor operation and techniques to optimize your applications. Topics include computer system performance, instruction set architectures, pipelining, branch prediction, memory-hierarchy design, and a brief introduction to multiprocessor architecture issues.

Textbooks

Required: John L. Hennessy & David A. Patterson, Computer Architecture: A Quantitative Approach, 6th Edition, Morgan Kaufmann.
Required: Assigned research papers and other assigned readings throughout the quarter.

Many research papers can only be downloaded from the campus network. For instructions on connecting to the UCR VPN and downloading research papers, please visit https://library.ucr.edu/using-the-library/technology-equipment/connect-from-off-campus

Grading

  • Assignments 25%
    • Assignments will be assigned throughout the course. We will drop your lowest one.
    • Our assignments will be based on Jupyterhub/Jupyter notebooks
  • Peer instruction 5%
    • This class uses “peer instruction”, and we REQUIRE each of you to download the Poll Everywhere app or log in to its website during class.
    • You must log in with UCRNetID@ucr.edu. If you log in any other way, you will not receive credit.
    • You need to answer 50% of the poll questions to receive full credit for class participation.
  • Reading Quizzes 15%
    • We will have reading quizzes on eLearn!
    • You get 2 attempts on each quiz; we take the average.
  • Midterm 20%
  • Final 35%
    • The final will be cumulative.
  • Additional notes about grades in this course
    • Your scores will be available on eLearn. Your final grade is the weighted average of these grades.
      We do our best to record grades accurately, but you should double-check.
    • Late submissions: We do not accept any late submissions, including quizzes, assignments, and projects.
    • Errors in grading: If you feel there has been an error in how an assignment or test was graded, you have one week from when the assignment is returned to bring it to our attention. For items submitted through Gradescope, please create a request through Gradescope. For other assignments, you must submit (via email to the instructor and the appropriate TAs) a written description of the problem. Neither I nor the TAs will discuss regrades without first receiving a request or an email from you. For arithmetic errors (adding up points, etc.) you do not need to submit anything in writing, but the one-week limit still applies.
    • For the midterm and final: We do not regrade a single problem. We will regrade your whole test. The one-week regrading window still applies.
    • Final grades: If you have a problem with your final grade in the course, send me an email and we can set up an appointment to discuss it.
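
The weighting above can be sketched in a few lines of Python. This is only an illustration, not the official grade calculation: the function names are hypothetical, every component is assumed to be a percentage from 0 to 100, assignments are assumed to be equally weighted when the lowest one is dropped, and the sample scores are made up.

```python
# Weights taken from the Grading list above; they sum to 100%.
WEIGHTS = {
    "assignments": 0.25,
    "peer_instruction": 0.05,
    "reading_quizzes": 0.15,
    "midterm": 0.20,
    "final": 0.35,
}


def drop_lowest_mean(scores):
    """Mean of the scores after dropping the single lowest one.

    Assumption: all assignments carry equal weight.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    kept = sorted(scores)[1:]  # discard the lowest score
    return sum(kept) / len(kept)


def quiz_score(attempts):
    """A reading quiz counts as the average of its attempts."""
    return sum(attempts) / len(attempts)


def course_grade(components):
    """Weighted average of per-component percentages (0-100)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical student, for illustration only.
    components = {
        "assignments": drop_lowest_mean([90, 80, 100, 60, 70]),
        "peer_instruction": 100,
        "reading_quizzes": quiz_score([80, 100]),
        "midterm": 80,
        "final": 90,
    }
    print(f"Course grade: {course_grade(components):.2f}%")
```

Because the final carries 35% of the weight, a 10-point change on the final moves the course grade by 3.5 points, while the same change on the assignment average moves it by 2.5 points.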

Schedule and Slides

Date | Topic | Readings (Required) | Preview Slides | Slides & Demo | Reading Quiz Due | Assignment Due | Note
4/4/2023 | Introduction
– G.E. Moore. Cramming More Components Onto Integrated Circuits. Electronics, pp. 114–117, April 19, 1965.
– Chapter 1.1-1.6
– John L. Hennessy and David A. Patterson. 2019. A new golden age for computer architecture. Commun. ACM 62, 2

(1) Overture

Demo: Intro
Assignment #0 released 
4/6/2023 | Performance Evaluation (I)
– Chapter 1.3 & 1.8-1.9
Performance (Preview)
(2) Performance (1): How good is “Good”?

Demo: Programmers & Performance
Reading Quiz #1
Assignment #1 Released
4/11/2023 | Performance Evaluation (II)
– M. D. Hill and M. R. Marty. Amdahl’s Law in the Multicore Era. In Computer, vol. 41, no. 7, pp. 33-38, July 2008.

– V. Sze, Y.-H. Chen, T.-J. Yang and J. S. Emer. How to Evaluate Deep Neural Network Processors: TOPS/W (Alone) Considered Harmful. In IEEE Solid-State Circuits Magazine, vol. 12, no. 3, pp. 28-41, Summer 2020.

(3) Performance (2): How can we do better?

Demo: Programming Languages/Compilers & Performance
Reading Quiz #2

4/13/2023 | Performance Evaluation (III)
– (Optional) Andrew Davison, Twelve Ways to Fool the Masses When Giving Performance Results on Parallel Computers. In Humour the Computer, MITP, 1995, pp.

(4) Performance (3): One thing right

Demo: Amdahl’s Law and Parallel Architectures

Assignment #0 Due 11:59pm 4/13
4/18/2023 | Performance Evaluation (IV) & Memory Hierarchy (1): The Basics
– Appendix B.1-B.3
Memory Hierarchy (Preview)
(5) Performance (4) & Memory (1): Memories bring back

Demo: FLOPs

Demo: Memory hierarchy
Reading Quiz #3

4/20/2023 | Memory Hierarchy (2): The Basics
– Appendix B.1-B.3
– Chapter 2.3

Memory (2): The A, B, Cs of your cache

Demo: Locality and Address Partition

Assignment #1 Due 11:59pm 4/20

4/25/2023 | Memory Hierarchy (3)
– Chapter 2.3
– Norman P. Jouppi. 1990. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. SIGARCH Comput. Archit. News 18, 2SI (June 1990), 364–373.

Memory (3): The causes of cache misses and their remedies

Demo: Way-associativity and cache miss rates
Reading Quiz #4

4/27/2023 | Memory Hierarchy (4): Optimizing Cache Performance Applications

Memory (4): Cache misses and their remedies (cont.)

Demo: data layout



5/2/2023 | Memory Hierarchy (5): Programmer’s optimizations & Virtual Memory (I)

Memory (5): Cache misses and their remedies (cont.)

Demo: Loop optimizations & Tiling algorithms
Reading Quiz #5

5/4/2023 | Virtual Memory (II)
– Appendix B.4 & B.5, Chapter 2.4

– Thomas W. Barr, Alan L. Cox, and Scott Rixner. 2010. Translation caching: skip, don’t walk (the page table). In Proceedings of the 37th annual international symposium on Computer architecture (ISCA ’10)

x86-64 Machine-Level Programming

– Emily Blem, Jaikrishnan Menon, and Karthikeyan Sankaralingam. 2013. Power struggles: Revisiting the RISC vs. CISC debate on contemporary ARM and x86 architectures. In Proceedings of the 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA) (HPCA ’13). https://minds.wisconsin.edu/handle/1793/64923 
Virtual Memory (Preview)
Virtual Memory: Just an illusion

Demo: virtual memory

Assignment #2 Due 11:59pm 5/4/2023
5/9/2023 | Midterm





5/11/2023 | Basic Processor Design & Branch Prediction
– Chapter 3.3
– Appendix C.1 – C.4

Basic Microprocessor Design & Branch Prediction (Preview)

(11) Basic Processor Design

Demo: is bitwise really faster?
Reading Quiz #6
Assignment #3 Released

5/16/2023 | Branch Prediction

(12) Branch Prediction: I guess I just feel like

Demo: how code becomes branches?
Reading Quiz #7

5/18/2023 | Branch Prediction (2) & Data hazards
– M. Evers, S. J. Patel, R. S. Chappell and Y. N. Patt, “An analysis of correlation and predictability: what makes two-level branch predictors work,” Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235), Barcelona, Spain, 1998, pp. 52-61.
– James E. Smith. Retrospective: a study of branch prediction strategies. ISCA ’98: 25 years of the international symposia on Computer architecture (selected papers), New York, NY, USA, 1998, pages 22-23
– Jiménez, Daniel A., and Calvin Lin. “Dynamic branch prediction with perceptrons.” Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture. IEEE, 2001.
– André Seznec and P. Michaud. A case for (partially) TAgged GEometric history length branch prediction. Journal of Instruction Level Parallelism. June 2006.
(13) Branch Prediction (2) & Data Hazards: I just can’t wait..

Demo: why sorting?



5/23/2023 | Data hazards & OOO Scheduling
– Chapter 3.4
Dynamic Instruction Scheduling (Preview)
(14) Data hazards & dynamic instruction scheduling
Reading Quiz #8

Assignment #4 Released


5/25/2023 | OOO Scheduling
– D. Suggs, M. Subramony and D. Bouvier, “The AMD “Zen 2” Processor,” in IEEE Micro, vol. 40, no. 2, pp. 45-52, 1 March-April 2020, doi: 10.1109/MM.2020.2974217.
– K. C. Yeager, “The MIPS R10000 superscalar microprocessor,” in IEEE Micro, vol. 16, no. 2, pp. 28-41, April 1996. 

(15) Instruction Scheduling and Programming on Modern Processors

Demo: Programming on Modern Processors

Assignment #3 Due 11:59pm 5/25/2023
5/30/2023 | Programming Modern Processors (2) & SMT
– Chapter 3.11
– Exploiting choice: instruction fetch and issue on an implementable simultaneous multithreading processor, Dean M. Tullsen, Susan J. Eggers, Joel S. Emer, Henry M. Levy, Jack L. Lo, and Rebecca L. Stamm, ISCA ’96: Proceedings of the 23rd annual international symposium on Computer architecture, New York, NY, USA, 1996, pages 191-202.
– Chapter 5.1 – 5.3, 5.5 & 5.6
– The case for a single-chip multiprocessor, Kunle Olukotun, Basem A. Nayfeh, Lance Hammond, Ken Wilson, and Kunyung Chang, SIGPLAN Not. 31(9):2-11, 1996.
Parallel Architecture (Preview)
(16) Programming Modern Processors (2) & Simultaneous Multithreading

Demo: Programming on Modern Processors
Reading Quiz #9

6/1/2023 | Chip Multiprocessors & Modern Processors
– D. Suggs, M. Subramony and D. Bouvier, “The AMD “Zen 2” Processor,” in IEEE Micro, vol. 40, no. 2, pp. 45-52, 1 March-April 2020, doi: 10.1109/MM.2020.2974217.

– M. Evers, L. Barnes and M. Clark, “The AMD Next-Generation “Zen 3” Core,” in IEEE Micro, vol. 42, no. 3, pp. 7-12, 1 May-June 2022, doi: 10.1109/MM.2022.3152788.
https://ieeexplore.ieee.org/document/9718180

– (Optional) M. Arafa et al., “Cascade Lake: Next Generation Intel Xeon Scalable Processor,” in IEEE Micro, vol. 39, no. 2, pp. 29-36, 1 March-April 2019, doi: 10.1109/MM.2019.2899330.
– (Optional) J. Doweck et al. Inside 6th-Generation Intel Core: New Microarchitecture Code-Named Skylake. In IEEE Micro, vol. 37, no. 2, pp. 52-62, Mar.-Apr. 2017, doi: 10.1109/MM.2017.38

(17) Parallel Architecture & Parallel Programming

Demo: Parallel programming



6/6/2023 | Parallel architectures & Dark Silicon
– Chapter 1.7
– H. Esmaeilzadeh, E. Blem, R. St. Amant, K. Sankaralingam and D. Burger. Dark Silicon and the End of Multicore Scaling. In IEEE Micro, vol. 32, no. 3, pp. 122-134, May-June 2012.
– Rakesh Kumar, Keith Farkas, Norm P. Jouppi, Partha Ranganathan, Dean M. Tullsen. Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction. In 36th International Symposium on Microarchitecture, December, 2003.

(18) Parallel Programming & Dark Silicon: Too many things

Demo: Parallel programming is difficult
Reading Quiz #10

6/8/2023 | TPU, FPGA
– Adrian M. Caulfield, Eric S. Chung, Andrew Putnam, Hari Angepat, Jeremy Fowers, Michael Haselman, Stephen Heil, Matt Humphrey, Puneet Kaur, Joo-Young Kim, Daniel Lo, Todd Massengill, Kalin Ovtcharov, Michael Papamichael, Lisa Woods, Sitaram Lanka, Derek Chiou, and Doug Burger. 2016. A cloud-scale acceleration architecture. In The 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-49).
– Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, John Hu, Robert Hundt, Dan Hurt, Julian Ibarz, Aaron Jaffey, Alek Jaworski, Alexander Kaplan, Harshit Khaitan, Daniel Killebrew, Andy Koch, Naveen Kumar, Steve Lacy, James Laudon, James Law, Diemthu Le, Chris Leary, Zhuyuan Liu, Kyle Lucke, Alan Lundin, Gordon MacKean, Adriana Maggiore, Maire Mahony, Kieran Miller, Rahul Nagarajan, Ravi Narayanaswami, Ray Ni, Kathy Nix, Thomas Norrie, Mark Omernick, Narayana Penukonda, Andy Phelps, Jonathan Ross, Matt Ross, Amir Salek, Emad Samadiani, Chris Severn, Gregory Sizikov, Matthew Snelham, Jed Souter, Dan Steinberg, Andy Swing, Mercedes Tan, Gregory Thorson, Bo Tian, Horia Toma, Erick Tuttle, Vijay Vasudevan, Richard Walter, Walter Wang, Eric Wilcox, and Doe Hyun Yoon. 2017. In-Datacenter Performance Analysis of a Tensor Processing Unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA ’17). Association for Computing Machinery, New York, NY, USA, 1–12. DOI:https://doi.org/10.1145/3079856.3080246
– John L. Hennessy and David A. Patterson. 2019. A new golden age for computer architecture. Commun. ACM 62, 2 



Assignment #4 Due 11:59pm 6/8/2023
6/15/2023 | Final Exam
11:30 a.m. – 2:30 p.m. @ BRNHL B118 (not the classroom for lectures)




https://registrar.ucr.edu/registering/plan-for-final-exams#tuesday_thursday

Assignments

Assignment #0
Assignment #1
Assignment #2
Assignment #3
Assignment #4

Integrity Policy

  • Cheating WILL be taken seriously. Doing otherwise is not fair to honest students. It is also not fair to allow the cheater to think that it is a reasonable alternative in life.
  • Please review the UCR student handbook for more details on Academic Integrity.
  • Anyone copying information or having information copied during a test will receive an F for the class and will not be allowed to drop. They will be reported to their college dean. If you can prove non-cooperative copying took place, your grade may be restored, but you must prove it to the dean–I don’t want to be involved. Anyone caught cheating or falsely representing the work of others on the homework will not be allowed to turn in further homework. Your grade will be based exclusively on the tests with a penalty of 25% OR GREATER applied.
  • We photocopy a random sampling of the exams in order to ensure that students do not modify their tests after they have been returned.
  • Online solutions, etc.: A solutions manual exists for this text. Using it, or any solutions you may find on the internet elsewhere IS CHEATING and will be dealt with accordingly. We know what the solution manual solutions look like. Homework is a small fraction of your grade, so cheating on it is unproductive.

Public Health Regulation

  • Masks are always required in the classroom.
  • If you have any symptom of COVID-19 or seasonal flu, please stay at home.