Home
The IBM POWER8 processor is the first to incorporate IBM's Coherent Accelerator Processor Interface (CAPI) technology. CAPI lets users place their own custom IP block into the processor complex in a memory-coherent fashion without slowing down bus communication. Two special pieces of logic make this possible. The first is the CAPP unit, which resides inside the POWER8 chip in the PCIe Host Bridge logic. The second is black-box IP logic called the POWER Service Layer (PSL), which customers license to put on an FPGA alongside their own Accelerator Functional Unit (AFU) logic. The AFU logic accelerates user-space applications, while the PSL/CAPP logic handles address translation and memory coherency transparently. This scheme also greatly simplifies the software interface for accessing FPGA logic compared with the traditional model of an FPGA card in a PCIe slot.
The PSL interface gives hardware designers a simple design target for access to system memory, interrupts, and other functions. On the software side, the libcxl library provides a simple API for accessing the AFU. The list below gives an overview of the combined software/hardware stack that connects an application to a hardware-accelerated function.
- Application
- libcxl library
- Operating System
- POWER8 processor core
- POWER8 CAPP unit
- PCIe Interface
- FPGA PSL unit
- FPGA AFU unit
A design can therefore focus on two interfaces: software against the libcxl library, and hardware against the PSL unit interface. The PSL Simulation Engine (PSLSE) lets an application that targets the libcxl API attach through a socket connection to an HDL simulator running a model of the AFU logic. The application code can be tested on POWER hardware or on other hardware architectures. Because the link is a socket connection, the HDL simulator and the application code can run on the same machine or on different machines. All the blocks in the middle of the stack shown above are abstracted away.
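To make the socket idea concrete, here is a minimal C sketch of two processes exchanging a request and a reply over a socket. It is a toy under stated assumptions, not PSLSE code: the function names `fake_simulator` and `roundtrip_through_fake_simulator`, the single 64-bit message, and the "double the operand" behavior are all hypothetical, and none of this reflects the actual PSLSE wire protocol or the libcxl API. It also uses a local `socketpair` for brevity, so unlike a network socket it cannot span machines.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the simulator side of the socket.
 * Reads one 64-bit operand, "accelerates" it by doubling, and replies.
 * Assumption: this message format is invented for illustration only.
 * An 8-byte message on a local stream socket is treated as arriving whole. */
static int fake_simulator(int fd)
{
    uint64_t operand;
    int status = 1;

    assert(fd >= 0);
    if (read(fd, &operand, sizeof operand) == (ssize_t)sizeof operand) {
        uint64_t result = operand * 2;
        if (write(fd, &result, sizeof result) == (ssize_t)sizeof result)
            status = 0;
    }
    close(fd);
    return status;
}

/* Application side: sends an operand to the stand-in simulator running in
 * a child process and returns its reply, or -1 on any error. */
int64_t roundtrip_through_fake_simulator(uint64_t operand)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: plays the simulator role on its end of the socket. */
        close(fds[0]);
        _exit(fake_simulator(fds[1]));
    }

    /* Parent: plays the application role on the other end. */
    close(fds[1]);
    uint64_t result = 0;
    int ok = write(fds[0], &operand, sizeof operand) == (ssize_t)sizeof operand
          && read(fds[0], &result, sizeof result) == (ssize_t)sizeof result;
    close(fds[0]);
    waitpid(pid, NULL, 0);

    return ok ? (int64_t)result : -1;
}
```

The application side only ever sees its end of the socket, which is why the two sides can live in separate processes. Swapping the `socketpair` for a TCP connection would be one way to let them run on different machines.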