Boosting Autonomous Navigation solution based on Deep Learning using new rad-tol Kintex Ultrascale FPGA


Bibliographic Details
Main Authors: D. Gogu, F. Stancu, A. Pastor González, D. Fortún Sánchez, D. Gonzalez-Arjona, O. Müler, M. Barbelian, V. Pana
Format: Conference Object
Language: English
Published: Zenodo 2021
Online Access: https://dx.doi.org/10.5281/zenodo.5520545
https://zenodo.org/record/5520545
Description
Summary: In this paper we present an ad-hoc architecture for an on-board Deep Learning (DL) network implemented on a rad-tol FPGA, creating building blocks that can be reused for different AI solutions or networks developed in the future. The problem analysed is an autonomous descent and landing scenario, compared against traditional techniques. The implementation of the Deep Learning algorithm focuses on the extraction of features from navigation camera images. The FPGA solution allows reduced power consumption while maximizing execution performance, in contrast to many on-ground solutions. A space-representative breadboard is prepared to demonstrate the solution. For training, testing and validating the DL network, the North Pole of the Moon surface has been selected: one trajectory is used to train the DL and a second one to validate it.

The architecture of the neural network is divided into numerous layers and is based on Processing Units (PUs), where one layer can have multiple PUs. Each PU can perform operations such as Convolution, MaxPooling and Upsampling. The implementation is composed of a set of PUs, three DSPs and a controller. The controller is responsible for coordinating read and write commands to the external memory, as well as deciding which operation to execute. The lower addresses of the memory are allocated for storing the parameters needed by the operations (weights and biases) once the FPGA is initialized; the following memory addresses are reserved for the input images. Moreover, each layer has a fraction of the RAM reserved for itself.

Autonomous navigation based on DL is a complex implementation and presents many challenges, two of which are of utmost importance: FPGA resources and timing performance. Both are tightly correlated, and the design is tailored to balance the bottleneck, the output performance dictated by the timing requirements, and the number of accesses to external memory. This architecture requires a huge amount of arithmetic resources, and there are not enough resources available (such as DSPs and BRAMs) to perform all operations at once, not even in the biggest space-grade FPGA on the market. The link between resources and timing is therefore to find the balance between the two. The PU-based architecture tries to minimize the latency of the operations, especially for the Convolution module, while at the same time minimizing resources: the more PUs there are, the fewer accesses to external memory are needed.

Overall, this paper presents an analysis of the performance of a DL network, where visual-based navigation algorithms are implemented on FPGA hardware. The main goal of this activity is to reduce the computational load of on-board processors, starting from the architecture, the resources and a timing analysis comparing the SW and HW implementations.
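The abstract states that each Processing Unit can execute Convolution, MaxPooling or Upsampling, with a controller deciding which operation to run. The following is a minimal behavioural sketch in Python/NumPy of those three operations and a toy dispatch step; it is an illustration only, not the authors' FPGA implementation, and the kernel sizes, strides and dispatch interface are assumptions.

```python
import numpy as np

def convolve2d(x, kernel, bias=0.0):
    """Valid 2-D convolution of a single-channel feature map."""
    kh, kw = kernel.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel) + bias
    return out

def maxpool2d(x, size=2):
    """Non-overlapping max pooling (2x2 by default)."""
    h, w = (x.shape[0] // size) * size, (x.shape[1] // size) * size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def upsample2d(x, factor=2):
    """Nearest-neighbour upsampling by an integer factor."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def processing_unit(op, x, **kwargs):
    """Toy controller dispatch: select which operation the PU executes."""
    ops = {"conv": convolve2d, "maxpool": maxpool2d, "upsample": upsample2d}
    return ops[op](x, **kwargs)

if __name__ == "__main__":
    fmap = np.arange(36, dtype=np.float32).reshape(6, 6)
    k = np.ones((3, 3), dtype=np.float32) / 9.0
    print(processing_unit("conv", fmap, kernel=k).shape)      # (4, 4)
    print(processing_unit("maxpool", fmap).shape)             # (3, 3)
    print(processing_unit("upsample", fmap, factor=2).shape)  # (12, 12)
```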
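The abstract also describes the external-memory layout: weights and biases occupy the lowest addresses once the FPGA is initialized, the input images follow, and each layer owns a fraction of the RAM. The sketch below builds such an address map for illustration under assumed (hypothetical) region sizes; the actual sizes, word width and ordering used by the authors are not given in the abstract.

```python
def build_memory_map(param_bytes, image_bytes, layer_buffer_bytes):
    """Return (name, start, end) regions laid out back-to-back from address 0."""
    regions, addr = [], 0
    for name, size in [("weights_and_biases", param_bytes),
                       ("input_image", image_bytes)]:
        regions.append((name, addr, addr + size))
        addr += size
    for i, size in enumerate(layer_buffer_bytes):
        regions.append((f"layer_{i}_buffer", addr, addr + size))
        addr += size
    return regions

if __name__ == "__main__":
    # Hypothetical sizes: 2 MiB of parameters, a 1024x1024 8-bit image,
    # and per-layer buffers for a small encoder/decoder-style network.
    for name, start, end in build_memory_map(
            param_bytes=2 * 1024 * 1024,
            image_bytes=1024 * 1024,
            layer_buffer_bytes=[1024 * 1024, 512 * 1024, 512 * 1024, 1024 * 1024]):
        print(f"{name:22s} 0x{start:08X} - 0x{end:08X}")
```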