Fruit detection for strawberry harvesting robot in non-structural environment based on Mask-RCNN
Published in | Computers and electronics in agriculture Vol. 163; p. 104846 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Amsterdam: Elsevier B.V, 01.08.2019 |
Subjects | |
Summary: | • Proposed a strawberry fruit detection algorithm based on Mask R-CNN (MRSD), which overcomes the poor universality and robustness of traditional machine vision algorithms in a non-structural environment.
• ResNet-50, combined with the FPN architecture for feature extraction, offered the best overall balance of speed and precision, and was therefore chosen as the backbone network of the strawberry detection model.
• The instance segmentation output of MRSD provides a sound basis for locating the picking point of each strawberry fruit, which supports precise operation of the harvesting robot.
Deep learning has demonstrated excellent capabilities for learning image features and is widely used in image object detection. To improve the performance of machine vision in fruit detection for a strawberry harvesting robot, the Mask Region Convolutional Neural Network (Mask R-CNN) was introduced. ResNet-50 was adopted as the backbone network, combined with the Feature Pyramid Network (FPN) architecture for feature extraction. The Region Proposal Network (RPN) was trained end-to-end to create region proposals for each feature map. After mask images of ripe fruits were generated by Mask R-CNN, a visual localization method was applied to find the strawberry picking points. Fruit detection results on 100 test images showed that the average detection precision was 95.78%, the recall was 95.41%, and the mean intersection over union (MIoU) for instance segmentation was 89.85%. Predictions for 573 ripe fruit picking points showed an average error of ±1.2 mm. Compared with four traditional methods, the proposed method demonstrates improved universality and robustness in a non-structural environment, particularly for overlapping and hidden fruits and for fruits under varying illumination. |
---|---|
ISSN: | 0168-1699 1872-7107 |
DOI: | 10.1016/j.compag.2019.06.001 |
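
The summary reports three evaluation figures: detection precision, recall, and mean intersection over union (MIoU) for the segmentation masks. As a reading aid, the sketch below shows how such figures are commonly computed. It is an illustration under stated assumptions, not the authors' evaluation code: the function names, the list-of-rows mask representation, and the choice to average IoU over matched prediction/ground-truth mask pairs are all hypothetical, and the paper may define MIoU differently.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: a binary mask is a grid of 0/1 values.
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two binary masks of the same shape."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks have no overlap to measure; report 0.0 by convention.
    return intersection / union if union else 0.0


def mean_iou(pairs: List[Tuple[Mask, Mask]]) -> float:
    """Average IoU over matched (prediction, ground truth) mask pairs."""
    if not pairs:
        return 0.0
    return sum(mask_iou(p, t) for p, t in pairs) / len(pairs)


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (made up for illustration, not taken from the paper).
pred_mask = [[1, 1, 0],
             [1, 1, 0],
             [0, 0, 0]]
truth_mask = [[0, 1, 1],
              [0, 1, 1],
              [0, 0, 0]]

iou = mask_iou(pred_mask, truth_mask)      # 2 shared pixels / 6 covered pixels
precision, recall = precision_recall(tp=9, fp=1, fn=3)
```

Here a true positive is a detected fruit that matches a labelled fruit, a false positive is a detection with no matching label, and a false negative is a labelled fruit that was missed. The matching rule (for example, an IoU threshold) is left out of the sketch because the record does not state it.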