Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models

Bibliographic Details
Published in: IEEE International Geoscience and Remote Sensing Symposium proceedings, pp. 11474-11478
Main Authors: Guo, Haonan; Su, Xin; Wu, Chen; Du, Bo; Zhang, Liangpei; Li, Deren
Format: Conference Proceeding
Language: English
Published: IEEE 07.07.2024
Summary: Recently, there has been a surge in interest in Large Language Models (LLMs), with ChatGPT standing out for its exceptional capabilities in language comprehension, reasoning, and interactive communication. These models have garnered attention from a diverse array of users and researchers across various disciplines. While LLMs have demonstrated remarkable proficiency in mimicking human task execution through natural language, their application in remote sensing interpretation remains largely uncharted. Furthermore, the current lack of automation in remote sensing task planning limits the accessibility of these sophisticated interpretation techniques, especially for non-specialists in the field. To bridge this gap, we introduce Remote Sensing ChatGPT, an innovative LLM-driven agent that integrates ChatGPT with a suite of AI-powered remote sensing models to tackle complex interpretation challenges. This system is designed to interpret user requests, delineate task planning based on the functionalities required, execute each subtask sequentially, and compile the final output by synthesizing the results from each stage. Given that LLMs, trained predominantly on natural language, do not inherently comprehend visual elements present in remote sensing imagery, we have devised a method to incorporate visual cues, effectively embedding the visual context of remote sensing images into the ChatGPT framework. With Remote Sensing ChatGPT, users can effortlessly submit a remote sensing image alongside their query and promptly receive detailed interpretation outcomes along with comprehensive linguistic feedback. Experiments and case studies demonstrate that our method is adept at handling a diverse range of remote sensing tasks and has the potential to be expanded to encompass an even wider array of applications with the integration of more advanced models, such as remote sensing foundation models.
The code and demo of Remote Sensing ChatGPT are publicly available at https://github.com/HaonanGuo/Remote-Sensing-ChatGPT.
ISSN: 2153-7003
DOI: 10.1109/IGARSS53475.2024.10640736
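
The workflow outlined in the summary (interpret the request, plan subtasks, run each one in sequence, then combine the results) can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`Tool`, `TOOLS`, `plan`, `answer`, the two stub models) is hypothetical, the planner is a keyword matcher standing in for the LLM, the models return canned strings, and the "visual cue" is assumed to be a plain text caption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    """A hypothetical wrapper around one remote sensing model."""
    name: str
    description: str
    run: Callable[[str], str]  # takes an image path, returns a text result


# Stub models (assumptions): real ones would run inference on the image.
def classify_scene(image_path: str) -> str:
    return f"scene of {image_path}: airport"


def detect_objects(image_path: str) -> str:
    return f"objects in {image_path}: 3 airplanes"


TOOLS: Dict[str, Tool] = {
    "scene_classification": Tool(
        "scene_classification", "Label the overall scene.", classify_scene
    ),
    "object_detection": Tool(
        "object_detection", "Locate and count objects.", detect_objects
    ),
}


def plan(request: str) -> List[str]:
    """Stub planner: keyword matching stands in for LLM task planning."""
    text = request.lower()
    steps: List[str] = []
    if "scene" in text:
        steps.append("scene_classification")
    if "how many" in text or "detect" in text:
        steps.append("object_detection")
    return steps


def answer(request: str, image_path: str, visual_cue: str = "") -> str:
    """Plan, execute each subtask in order, then combine the outputs."""
    steps = plan(request)
    if not steps:
        return "no matching tool"
    results = [TOOLS[name].run(image_path) for name in steps]
    # The caption is simply prepended as context (an assumption about
    # how a visual cue might be passed along as text).
    prefix = f"context: {visual_cue}; " if visual_cue else ""
    return prefix + "; ".join(results)


if __name__ == "__main__":
    print(answer("How many planes, and what scene is this?", "tile.png",
                 visual_cue="aerial image"))
```

In a real agent, `plan` would be replaced by an LLM call that selects tools from their descriptions, and the final string join would be replaced by an LLM-written summary of the intermediate results.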