This is a 6-week, intensive reading group for getting up to speed with the state of the art in hardware accelerators. The goal is to read 3 related papers every week and discuss them for an hour.
The discussion lead will create a Google Form with questions for each paper on Friday morning and send it to the readers, with the expectation that they will fill it out before the group meets.
Q. Why are we reading 3 papers a week? Isn't that a bit too much? A. There are three reasons:
- We want to make paper reading a continuous habit; it is all too easy to skip in favor of the highest-priority deadline.
- We want to incentivize people to read papers early; interesting discussions require the readings to stew in our minds for a bit. The goal is to really understand the state of the art in accelerator design and identify what is truly important, and that can only be done with deep reading.
- There is a lot of content to get through, and we want to do it in six weeks.
At the end of the day, we do not want to repeat rote arguments about accelerator design in our PL work; we want to truly understand where they come from, so we can work on problems that matter.
Q. Can I join for some reading sessions but not others? A. No. We're hoping to have continuity between the discussion sessions and want to be able to reference ideas from papers we've already read without needing to explain them. If you're interested in only some papers, I still recommend joining all the sessions so that you can see the ideas build up to the papers you're interested in.
Week 1:
- (1a) Computing's Energy Problem
- (1b) Understanding sources of inefficiency in general-purpose chips
- (1c) Dark silicon and the end of multicore scaling

Week 2:
- (2a) Why Systolic Architectures?
- (2b) A Domain-Specific Supercomputer for Training Deep Neural Networks
- (2c) Programmatic Control of a Compiler for Generating High-performance Spatial Hardware

Week 3:
- (3a) Triggered instructions: A control paradigm for spatially-programmed architectures
- (3b) Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration
- (3c) CoRAM: an in-fabric memory architecture for FPGA-based computing

Week 4:
- (4a) Plasticine: A Reconfigurable Architecture For Parallel Patterns
- (4b) Capstan: A Vector RDA for Sparsity (Note: This might overlap with Plasticine's architecture too much)
- (4c) RipTide: A Programmable, Energy-minimal Dataflow Compiler and Architecture

Week 5:
- (5a) Spatial: a language and compiler for application accelerators
- (5b) SDC-based Modulo Scheduling for Pipeline Synthesis
- (5c) Revet: A Language and Compiler for Dataflow Threads (4b)