Document Type: Original Article
Department of Computer Engineering and Artificial Intelligence, Military Technical College, Cairo, Egypt.
Future space missions will rely on novel high-performance computing to support advanced intelligent on-board algorithms whose substantial workloads impose firm real-time and power constraints. Consequently, these algorithms require significantly faster processing than conventional space-grade central processing units can provide. Moreover, they require careful selection of the target embedded platform from a diverse set of available architectures, along with implementation tactics that map the algorithms to the target architecture and fully unlock its capabilities. In this paper, we present a study of different architectures and embedded computing platforms for satellite on-board computers. We also present a comprehensive overview of recent implementation tactics such as source code mapping and transformations, and we highlight optimization techniques such as partitioning and co-design using hardware accelerators. Finally, we discuss several implementation analysis methodologies for deriving optimized code implementations. The top-ranked YOLO-v3, a deep-learning-based object detection algorithm, is selected as a case study model and optimized using the OpenVINO toolkit. The experimental results show improvements of up to 73%, 41%, and 34% in terms of frames per second, CPU utilization, and cache memory, respectively. This study aims to guide researchers in the field of high-performance embedded computing through the different hardware architectures and implementation tactics available.