Satellite Component Semantic Segmentation: Video Dataset and Real-time Pyramid Attention and Decoupled Attention Network
Published in | IEEE Transactions on Aerospace and Electronic Systems, Vol. 59, no. 6, pp. 1-23 |
---|---|
Main Authors | , , , , , , |
Format | Journal Article |
Language | English |
Published | New York: IEEE, 01.12.2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Summary: | High-accuracy, real-time semantic segmentation of satellite components can locate the key components to be handled in on-orbit servicing, such as solar panels, which is of great significance for navigation and control. Two main challenges remain unsolved. First, satellite component segmentation algorithms need a large number of training images, but on-orbit satellite images are difficult to obtain, and a large-scale satellite component video dataset is harder still. Second, high-accuracy semantic segmentation networks require comparatively large computational resources, which are difficult to provide on orbit. The key aim of this paper is to build a satellite component semantic segmentation network that is both accurate and fast enough for real-time on-orbit operation. The paper proposes a simulated satellite component dataset of 98 video sequences covering 13 satellites, 32402 frames in total, with complex backgrounds, varied on-orbit illumination, and common satellite motions. It also proposes an attention-based real-time network, the Pyramid Attention and Decoupled Attention Network (PADAN), in an image-based version, PADAN-S, and a video-based version, PADAN-T. PADAN-S is based on AttaNet; it applies pyramid attention to three-layer pyramid features and then performs decoupled attention that considers both row and column attention. PADAN-T uses part of PADAN-S to obtain temporal pyramid features from temporal frames, then performs decoupled attention between the features of the output frame and the features at each layer of the temporal pyramid.
Experimental results show that PADAN-S and PADAN-T are more accurate than other state-of-the-art real-time algorithms on both image-based and video-based satellite component semantic segmentation tasks on simulated datasets, and that the proposed dataset simulates the real on-orbit environment to a certain degree. PADAN-S reaches 10.25 frames per second at an image resolution of 1280 × 720 pixels on the Jetson Xavier edge computing device, and PADAN-T reaches 7.18 frames per second. |
---|---|
ISSN: | 0018-9251 1557-9603 |
DOI: | 10.1109/TAES.2023.3282608 |
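
The speed and dataset figures in the summary can be restated as per-frame budgets. The following is a minimal sketch that only re-derives numbers from the values quoted in the summary (frame rates, resolution, sequence and frame counts). The function and variable names are illustrative and do not come from the paper, and the pixel throughput is computed for PADAN-S only, since the summary states the resolution for that version.

```python
# Figures quoted in the summary of this record.
FPS = {"PADAN-S": 10.25, "PADAN-T": 7.18}  # frames per second on Jetson Xavier
WIDTH, HEIGHT = 1280, 720                   # resolution stated for PADAN-S
NUM_SEQUENCES = 98                          # video sequences in the dataset
NUM_FRAMES = 32402                          # total frames in the dataset


def frame_time_ms(fps: float) -> float:
    """Average processing time per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def pixels_per_second(fps: float, width: int, height: int) -> float:
    """Pixels processed per second at a given frame rate and resolution."""
    return fps * width * height


# Hypothetical helper names; the values are plain arithmetic on the summary.
latency_ms = {name: frame_time_ms(fps) for name, fps in FPS.items()}
padan_s_throughput = pixels_per_second(FPS["PADAN-S"], WIDTH, HEIGHT)
mean_frames_per_sequence = NUM_FRAMES / NUM_SEQUENCES

for name, ms in latency_ms.items():
    print(f"{name}: {ms:.1f} ms per frame")
print(f"PADAN-S throughput: {padan_s_throughput / 1e6:.2f} megapixels per second")
print(f"Mean sequence length: {mean_frames_per_sequence:.1f} frames")
```

Read this way, PADAN-S spends roughly 98 ms on each frame and PADAN-T roughly 139 ms, and an average sequence in the dataset is about 331 frames long.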