A Transformer-Based Multimodal Model for Urban-Rural Fringe Identification

Bibliographic Details
Published in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol. 17; pp. 15041-15051
Main Authors Jia, Furong; Dong, Quanhua; Huang, Zhou; Chen, Xiao-Jian; Wang, Yi; Peng, Xia; Guo, Yuan; Ma, Ruixian; Zhang, Fan; Liu, Yu
Format Journal Article
Language English
Published Piscataway IEEE 2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: As the frontier of urbanization, urban-rural fringes (URFs) form the transition between urban construction regions and the rural hinterland, and their identification is significant for the study of urbanization-related socioeconomic changes and human dynamics. Previous research on URF identification has predominantly relied on remote sensing data, which often provides a uniform overhead perspective with limited spatial resolution. As an additional data source, street view images (SVIs) offer a valuable human-related perspective, efficiently capturing intricate transitions from urban to rural areas. However, the abundant visual information offered by SVIs has often been overlooked, and multimodal techniques have seldom been explored to integrate multisource data for delineating URFs. To address this gap, this study proposes a transformer-based multimodal methodology for identifying URFs, which includes a street view panorama classifier and a remote sensing classification model. In the study area of Beijing, the experimental results indicate that a URF with a total area of 731.24 km² surrounds the urban cores, primarily located between the fourth and sixth ring roads. The effectiveness of the proposed method is demonstrated through comparative experiments with traditional URF identification methods. In addition, a series of ablation studies demonstrates the efficacy of incorporating multisource data. Based on the delineated URFs in Beijing, this research introduces points of interest data and commuting data to analyze the socioeconomic characteristics of URFs. The findings indicate that URFs are characterized by longer commuting distances and less diverse restaurant consumption patterns compared to more urbanized regions. This study enables the accurate identification of URFs through the transformer-based multimodal approach integrating SVIs. Furthermore, it provides a human-centric comprehension of URFs, which is essential for informing strategies of urban planning and development.
ISSN: 1939-1404, 2151-1535
DOI: 10.1109/JSTARS.2024.3439429