This term, the seminar meets in person every Wednesday from 12:30 PM to 1:30 PM.
The seminar discusses a broad range of recent systems papers. Papers are selected from typical systems-related conferences, including, but not limited to, the following:
- General Systems: OSDI, SOSP, NSDI, ATC, EuroSys
- Security: USENIX Security, CCS, Oakland, NDSS
- Networking: SIGCOMM, INFOCOM, IMC
- Architecture: ASPLOS, ISCA, MICRO
- Distributed Systems: PODC, ICDCS
- Storage: FAST
Some academic terms may have a specific theme, in which case all the chosen papers relate to that theme.
Each reading group presenter should:
- Send a reminder to the reading group email with the paper details, and update the URL in the repository, at least two days before the group meeting.
Date | Discussion Lead | Paper Title and Link | Conference |
---|---|---|---|
March 1, 2024 | Daniel | Practical Byzantine Fault Tolerance (PBFT) / HQ Replication: A Hybrid Quorum Protocol for Byzantine Fault Tolerance | OSDI'99 / OSDI'06 |
Date | Discussion Lead | Paper Title and Link | Conference |
---|---|---|---|
September 25, 2023 | Jinkun | Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism / Unity: Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization | ArXiv / OSDI'22 |
October 2, 2023 | Lingfan | Orca: A Distributed Serving System for Transformer-Based Generative Models / Efficient Memory Management for Large Language Model Serving with PagedAttention | OSDI'22 / SOSP'23 |
October 9, 2023 | Reading Week | Reading Week | Reading Week |
October 16, 2023 | Jinkun | AStitch: Enabling a New Multi-dimensional Optimization Space for Memory-Intensive ML Training and Inference on Modern SIMT Architectures / Welder: Scheduling Deep Learning Memory Access via Tile-graph | ASPLOS'22 / OSDI'23 |
October 23, 2023 | Haitian | FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness / FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning | NeurIPS'22 / ArXiv |
The design and implementation of cloud-scale databases affect how many of us build and optimize our systems. For those working in the lower layers (e.g., on offloads), databases are a common application that might influence designs. For those working on tracing, these systems are often used to store and query data, which affects what is stored and why. More generally, many of these systems have intricate algorithms, since they are inherently distributed. Reasoning about what properties they provide, and why, is a fun puzzle.
Date | Discussion Lead | Paper Title and Link | Conference |
---|---|---|---|
October 27, 2020 | John | RedLeaf: Isolation and Communication in a Safe Operating System | OSDI'20 |
November 3, 2020 | Xiangyu | Swift: Delay is Simple and Effective for Congestion Control in the Datacenter | SIGCOMM'20 |
November 10, 2020 | Jessica | Come as You Are: Helping Unmodified Clients Bypass Censorship with Server-side Evasion | SIGCOMM'20 |
November 17, 2020 | Ding Ding | Microsecond Consensus for Microsecond Applications | OSDI'20 |
November 24, 2020 | Thanksgiving | - | - |
December 1, 2020 | Taegyun | Blockene: A High-throughput Blockchain Over Mobile Devices | OSDI'20 |
December 8, 2020 | Eric | A Unified Architecture for Accelerating Distributed DNN Training in Heterogeneous GPU/CPU Clusters | OSDI'20 |
December 15, 2020 | Changgeng | Tolerating Slowdowns in Replicated State Machines using Copilots | OSDI'20 |
December 22, 2020 | Anqi | Retiarii: A Deep Learning Exploratory-Training Framework | OSDI'20 |
Date | Discussion Lead | Paper Title and Link | Conference |
---|---|---|---|
February 10, 2020 | Kickoff | - | - |
February 17, 2020 | Fabian | SplitFS: Reducing Software Overhead in File Systems for Persistent Memory | SOSP'19 |
February 24, 2020 | Panda | Helen: Maliciously Secure Coopetitive Learning for Linear Models | S&P'19 |
March 2, 2020 | John | Learning to Reconstruct: Statistical Learning Theory and Encrypted Database Attacks | S&P'19 |
March 9, 2020 | Jinkun | Pretend Synchrony: Synchronous Verification of Asynchronous Distributed Programs | POPL'19 |
March 16, 2020 | Spring Break | Corona | Virus |
March 23, 2020 | Xiangyu | Corona | Virus |
March 30, 2020 | Anirudh | Corona | Virus |
April 6, 2020 | Project Review | Corona | Virus |
April 13, 2020 | Jinyang | Corona | Virus |
April 20, 2020 | Taegyun | Corona | Virus |
April 27, 2020 | Eric | Corona | Virus |
May 4, 2020 | Changgeng | Corona | Virus |
May 11, 2020 | Anqi | Corona | Virus |
May 18, 2020 | Tao | Corona | Virus |