The Development of TIGRA: A Zero Latency Interface For Accelerator Communication in RISC-V Processors

Field programmable gate arrays (FPGA) give developers the ability to design application specific hardware by means of software, providing a method of accelerating algorithms with higher power efficiency when compared to CPU or GPU accelerated applications. FPGA accelerated applications tend to follo...

Full description

Bibliographic Details
Main Author: Green, Wesley Brad
Format: Text
Language:unknown
Published: Clemson University Libraries 2022
Subjects:
Online Access:https://tigerprints.clemson.edu/all_dissertations/2982
https://tigerprints.clemson.edu/cgi/viewcontent.cgi?article=4061&context=all_dissertations
Description
Summary:Field programmable gate arrays (FPGA) give developers the ability to design application specific hardware by means of software, providing a method of accelerating algorithms with higher power efficiency when compared to CPU or GPU accelerated applications. FPGA accelerated applications tend to follow either a loosely coupled or tightly coupled design. Loosely coupled designs often use OpenCL to utilize the FPGA as an accelerator much like a GPU, which provides a simplifed design flow with the trade-off of increased overhead and latency due to bus communication. Tightly coupled designs modify an existing CPU to introduce instruction set extensions to provide a minimal latency accelerator at the cost of higher programming effort to include the custom design. This dissertation details the design of the Tightly Integrated, Generic RISC-V Accelerator (TIGRA) interface which provides the benefits of both loosely and tightly coupled accelerator designs. TIGRA enabled designs incur zero latency with a simple-to-use interface that reduces programming effort when implementing custom logic within a processor. This dissertation shows the incorporation of TIGRA into the simple PicoRV32 processor, the highly customizable Rocket Chip generator, and the FPGA optimized Taiga processor. Each processor design is tested with AES 128-bit encryption and posit arithmetic to demonstrate TIGRA functionality. After a one time programming cost to incorporate a TIGRA interface into an existing processor, new functional units can be added with up to a 75% reduction in the lines of code required when compared to non-TIGRA enabled designs. Additionally, each functional unit created is co-compatible with each processor as the TIGRA interface remains constant between each design. The results prove that using the TIGRA interface introduces no latency and is capable of incorporating existing custom logic designs without modification for all three processors tested. When compared to the PicoRV32 coprocessor interface (PCPI), TIGRA coupled designs ...