Developing FPGA-accelerated cloud applications with SDAccel: Practice

By: Coursera

  • Reconfigurable cloud infrastructure
    • Distributed systems, data centers, and cloud architectures face exponentially growing computing requirements that CPU-based solutions can no longer keep pace with. In this context, complex distributed systems have to move toward accelerated computing: accelerators complement CPU-based architectures and deliver both performance and power efficiency. Moreover, a modern data center serves many different users running different workloads, and an underlying architecture built on reconfigurable technologies is an ideal fit for these changing, demanding workloads. This module describes the main cloud computing components and technologies, and details the current technologies used to accelerate cloud computing workloads.
  • On how to accelerate the cloud with SDAccel
    • Within this module we are going to have a first taste of how to get the best out of the combination of F1 instances and SDAccel, providing a few practical instructions for developing accelerated applications on Amazon F1 with the Xilinx SDAccel development environment. Then, we are going to present what is necessary to create the FPGA kernels, assemble the FPGA program, and compile the Amazon FPGA Image, or AFI. Finally, we will describe the steps and tasks involved in developing a host application accelerated on the F1 FPGA.
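The kernel-to-AFI flow described above can be sketched as a short command sequence. This is a minimal sketch assuming the conventions of the Xilinx `xocc` compiler and the AWS `aws-fpga` repository; the kernel name, bucket names, and environment variables (`$AWS_PLATFORM`, `$AWS_FPGA_REPO_DIR`) are illustrative assumptions, not taken from the course.

```shell
# 1. Compile the OpenCL kernel source to a Xilinx object file (.xo)
#    targeting the AWS F1 platform ("hw" = actual hardware build).
xocc -t hw --platform $AWS_PLATFORM -c -k smith_waterman \
     -o smith_waterman.xo smith_waterman.cl

# 2. Link the kernel object into an FPGA binary (.xclbin).
xocc -t hw --platform $AWS_PLATFORM -l \
     -o smith_waterman.xclbin smith_waterman.xo

# 3. Submit the xclbin to AWS to generate the Amazon FPGA Image (AFI);
#    the helper script ships in the aws-fpga repository, and the S3
#    bucket/keys below are placeholders for your own.
$AWS_FPGA_REPO_DIR/SDAccel/tools/create_sdaccel_afi.sh \
    -xclbin=smith_waterman.xclbin \
    -s3_bucket=my-afi-bucket -s3_dcp_key=dcp -s3_logs_key=logs

# 4. Once the AFI becomes available, the host application loads the
#    generated *.awsxclbin through the OpenCL runtime on the F1 instance.
```

The emulation targets (`-t sw_emu`, `-t hw_emu`) follow the same compile/link shape, which is why development typically iterates in emulation before the hours-long hardware build.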
  • Summing things up: the Smith-Waterman algorithm
    • Within this module we are going to introduce the Smith-Waterman algorithm, which we have chosen to demonstrate how to create a hardware implementation of an FPGA-based system using the Xilinx SDAccel design framework. We are going to dig into the details of the algorithm, from its data structures to its computation flow. Then we are going to introduce the Roofline model and use it to analyze the theoretical peak performance and the operational intensity of the Smith-Waterman algorithm.
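The two ingredients of this module fit in a few lines each. Below is a minimal sketch of the linear-gap Smith-Waterman recurrence and of the Roofline bound; the scoring parameters (match = 2, mismatch = -1, gap = -2) and the helper names are illustrative assumptions, not the course's reference implementation.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local-alignment score of strings a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # DP matrix, zero-initialized
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            # Local alignment: a cell is clamped at zero, so an alignment
            # can start anywhere in either sequence.
            H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
            best = max(best, H[i][j])
    return best

def roofline(peak_gflops, bandwidth_gbs, operational_intensity):
    """Roofline model: attainable GFLOP/s is capped either by the compute
    peak or by memory bandwidth times operational intensity (FLOP/byte)."""
    return min(peak_gflops, bandwidth_gbs * operational_intensity)
```

For example, `smith_waterman("ACGT", "ACGT")` scores four matches for a total of 8, and a kernel with operational intensity 5 FLOP/byte on a 10 GB/s, 100 GFLOP/s device is bandwidth-bound at `roofline(100, 10, 5)` = 50 GFLOP/s.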
  • The Smith-Waterman example in details
    • Within this module we are going to dig deeper into the Smith-Waterman algorithm. We are going to implement a first version of the algorithm on a local server with the Xilinx SDAccel design framework. Then we are going to introduce some optimizations to improve performance: in particular, we will add more parallelism to the implementation and introduce systolic arrays. Moreover, we will explore how to compress the input data and leverage multiple memory ports to improve memory access speed. Finally, we are going to port our implementation of the Smith-Waterman algorithm to the AWS F1 instances.
  • Course conclusions
    • We are working at the edge of research in reconfigurable computing. FPGA technologies are no longer used only as standalone solutions/platforms but are now included in cloud infrastructures, where they both accelerate infrastructure/backend computations and are exposed as-a-Service for anyone to use. In this context new research opportunities and technology improvements are emerging, and there could be no better time to join the field. This module concludes the course while posing interesting questions about possible future research directions, which may also point students to other Coursera courses on FPGAs.