Semantic Segmentation of Plant Structures with Deep Learning and Channel-wise Attention Mechanism

Bibliographic Details
Published in: Journal of Telecommunications and Information Technology, Vol. 99, no. 1, pp. 56-66
Main Authors: Surehli, Mukund Kumar; Aggarwal, Naveen; Joshi, Garima; Nayyar, Harsh
Format: Journal Article
Language: English
Published: Warsaw, Instytut Lacznosci - Panstwowy Instytut Badawczy (National Institute of Telecommunications), 2025

Summary: Semantic segmentation of plant images is crucial for many agricultural applications, creating the need for robust models capable of handling images captured under a diverse range of conditions. This paper introduces an extended DeepLabV3+ model with a channel-wise attention mechanism, designed to provide precise semantic segmentation while emphasizing the most informative features. The model leverages semantic information with global context and handles object-scale variations within the image. The proposed approach aims to provide a well-generalized model that can be adapted to various field conditions, with training and testing performed on multiple datasets: Eschikon wheat segmentation (EWS), Humans in the Loop (HIL), computer vision problems in plant phenotyping (CVPPP), and a custom "botanic mixed set" dataset. Using an ensemble training paradigm, the proposed architecture achieved intersection over union (IoU) scores of 0.846, 0.665, and 0.975 on the EWS, HIL plant segmentation, and CVPPP datasets, respectively. The trained model is robust to variations in lighting, backgrounds, and subject angles, demonstrating its suitability for real-world applications.
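To illustrate the two core ideas named in the summary, the sketch below shows a squeeze-and-excitation-style channel-wise attention gate and the IoU metric in plain NumPy. This is a minimal illustration, not the paper's implementation: the function names, the two-layer gating weights `w1`/`w2`, and the reduction shape are assumptions chosen for clarity.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation-style channel-wise attention (illustrative).

    features: array of shape (C, H, W)
    w1: (C_reduced, C) weight matrix of the squeeze MLP
    w2: (C, C_reduced) weight matrix producing per-channel gates
    """
    squeeze = features.mean(axis=(1, 2))           # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)         # ReLU
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate in (0, 1) -> (C,)
    return features * scale[:, None, None]         # reweight each channel map

def iou(pred, target):
    """Intersection over union for binary segmentation masks."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 1.0

# Toy usage: 4 channels of 2x2 feature maps, zero-initialized gate weights
features = np.ones((4, 2, 2))
out = channel_attention(features, np.zeros((2, 4)), np.zeros((4, 2)))
# sigmoid(0) = 0.5, so every channel is scaled by 0.5 here

pred = np.array([[1, 1], [0, 0]], dtype=bool)
target = np.array([[1, 0], [0, 0]], dtype=bool)
score = iou(pred, target)  # intersection 1, union 2 -> 0.5
```

In the full architecture such a gate would sit after the DeepLabV3+ decoder features, letting the network amplify channels that respond to plant structures and suppress background-dominated ones; the reported IoU scores are averages of the per-mask quantity computed above.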
ISSN: 1509-4553; 1899-8852
DOI: 10.26636/jtit.2025.1.1853