A Faster and More Effective Cross-View Matching Method of UAV and Satellite Images for UAV Geolocalization

Bibliographic Details
Published in	Remote sensing (Basel, Switzerland) Vol. 13; no. 19; p. 3979
Main Authors	Zhuang, Jiedong; Dai, Ming; Chen, Xuruoyan; Zheng, Enhui
Format	Journal Article
Language	English
Published	Basel: MDPI AG, 01.10.2021
Subjects	Accuracy; cross-view image matching; data collection; Datasets; deep neural network; Feature extraction; geolocalization; Global navigation satellite system; Global positioning systems; GPS; Image retrieval; Inference; Methods; Performance evaluation; Satellite imagery; satellites; UAV image localization; Unmanned aerial vehicles
ISSN	2072-4292
DOI	10.3390/rs13193979

Summary:	Cross-view geolocalization matches the same target in different images from various views, such as views of unmanned aerial vehicles (UAVs) and satellites, which is a key technology for UAVs to autonomously locate and navigate without a positioning system (e.g., GPS and GNSS). The most challenging aspect in this area is the shifting of targets and nonuniform scales among different views. Published methods focus on extracting coarse features from parts of images, but neglect the relationship between different views, and the influence of scale and shifting. To bridge this gap, an effective network is proposed with well-designed structures, referred to as multiscale block attention (MSBA), based on a local pattern network. MSBA cuts images into several parts with different scales, among which self-attention is applied to make feature extraction more efficient. The features of different views are extracted by a multibranch structure, which was designed to make different branches learn from each other, leading to a more subtle relationship between views. The method was implemented with the newest UAV-based geolocalization dataset. Compared with the existing state-of-the-art (SOTA) method, MSBA accuracy improved by almost 10% when the inference time was equal to that of the SOTA method; when the accuracy of MSBA was the same as that of the SOTA method, inference time was shortened by 30%.
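The summary describes cutting an image into parts at several scales and applying self-attention among those parts. The following is a minimal sketch of that general idea only, not the authors' implementation. The regular n x n grid partition, the scale set (1, 2, 4), the average pooling, and the function names are all assumptions made for illustration, and the attention uses identity projections where a trained network would use learned weights.

```python
import math


def split_into_blocks(fmap, n):
    """Cut an H x W x C feature map (nested lists) into an n x n grid
    and average-pool each block into one C-dimensional descriptor.
    Assumes H >= n and W >= n so that no block is empty."""
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    descriptors = []
    for bi in range(n):
        for bj in range(n):
            rows = range(bi * h // n, (bi + 1) * h // n)
            cols = range(bj * w // n, (bj + 1) * w // n)
            count = len(rows) * len(cols)
            descriptors.append([
                sum(fmap[i][j][k] for i in rows for j in cols) / count
                for k in range(c)
            ])
    return descriptors


def multiscale_blocks(fmap, scales=(1, 2, 4)):
    """Collect block descriptors from several grid scales (hypothetical choice)."""
    tokens = []
    for n in scales:
        tokens.extend(split_into_blocks(fmap, n))
    return tokens


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections:
    each block descriptor becomes a weighted average of all descriptors."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(wt * v[k] for wt, v in zip(weights, tokens)) for k in range(d)])
    return out


# Toy 8 x 8 feature map with 4 channels.
fmap = [[[float(i + j + k) for k in range(4)] for j in range(8)] for i in range(8)]
tokens = multiscale_blocks(fmap)    # 1 + 4 + 16 block descriptors
attended = self_attention(tokens)   # same shape as tokens
```

With the default scales, the sketch yields 21 block descriptors, each of which is re-expressed as an attention-weighted mix of all blocks across scales.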