Enhanced precision in greenhouse tomato recognition and localization: A study leveraging advances in YOLOv5 and binocular vision technologies


Shuangyou Wang
Zhanying Shao
Yongjian Zhang

Keywords

Improved YOLOv5, Tomato Recognition and Detection, Binocular Camera, Stereo Matching

Abstract

Precise identification and localization of the fruit is the core challenge in automating tomato harvesting in greenhouses. This paper presents an approach that combines an improved YOLOv5 detection algorithm with optimized binocular stereo vision. First, a C3-Transformer Encoder (CTM) structure and a Bidirectional Feature Pyramid Network (Bi-FPN) are introduced into the model to strengthen tomato recognition, especially against complex backgrounds and under occlusion. In field testing, mAP50 reached 97.1%, an increase of 1.2 percentage points. Second, a ZED binocular camera was used and the Census stereo matching algorithm was optimized, which significantly reduced disparity errors and thereby improved the accuracy of the depth information. This allows the three-dimensional position of tomatoes obscured by branches and leaves to be calculated accurately, greatly improving the efficiency of the harvesting robot. Field debugging and verification with the harvesting robot showed that the proposed method is accurate and reliable for recognizing and localizing tomatoes in complex greenhouse environments.
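To make the stereo localization step concrete, the following is a minimal sketch of plain Census matching followed by depth recovery from disparity. It is not the optimized algorithm from this study, whose details are not given in the abstract. The function names, the 3x3 window, the winner-takes-all search, and the camera parameters in the usage note are all illustrative assumptions, and the images are assumed to be rectified so that matching pixels lie on the same row.

```python
def census_transform(image, radius=1):
    """Census-code a 2D grayscale image given as a list of rows.

    Each neighbor in the (2*radius+1)^2 window contributes one bit:
    1 if the neighbor is darker than the center pixel, else 0.
    Border pixels without a full window keep the code 0.
    """
    height, width = len(image), len(image[0])
    codes = [[0] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            center = image[y][x]
            code = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the center pixel is not compared with itself
                    bit = 1 if image[y + dy][x + dx] < center else 0
                    code = (code << 1) | bit
            codes[y][x] = code
    return codes


def hamming(a, b):
    """Number of differing bits between two census codes."""
    return bin(a ^ b).count("1")


def best_disparity(left_codes, right_codes, y, x, max_disparity):
    """Winner-takes-all search along one row of a rectified pair.

    Compares the left pixel (y, x) with right pixels (y, x - d) and
    returns the d with the lowest Hamming cost (first one on ties).
    """
    best_d, best_cost = 0, None
    for d in range(max_disparity + 1):
        if x - d < 0:
            break  # candidate would fall outside the right image
        cost = hamming(left_codes[y][x], right_codes[y][x - d])
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified, parallel stereo rig."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to recover depth")
    return focal_px * baseline_m / disparity_px
```

As a usage example with hypothetical values, a focal length of 700 px, a baseline of 0.12 m, and a disparity of 10 px give `depth_from_disparity(10, 700, 0.12)`, roughly 8.4 m. Because depth is inversely proportional to disparity, a fixed disparity error produces a larger depth error for more distant fruit, which is why reducing disparity error matters for localization.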

