Enhanced precision in greenhouse tomato recognition and localization: A study leveraging advances in YOLOv5 and binocular vision technologies
Keywords
Improved YOLOv5, Tomato Recognition and Detection, Binocular Camera, Stereo Matching
Abstract
The core challenge in automating tomato harvesting in greenhouses is identifying and localizing the fruit precisely. This paper presents an approach that combines an improved YOLOv5 detection algorithm with optimized binocular stereo vision. For detection, a C3-Transformer Encoder (CTM) structure and a Bidirectional Feature Pyramid Network (Bi-FPN) are introduced into the model to strengthen its ability to recognize tomatoes, especially against complex backgrounds and under occlusion. In field testing, mAP50 reached 97.1%, an increase of 1.2 percentage points. For localization, a ZED binocular camera was used and the census stereo matching algorithm was optimized, which significantly reduced disparity errors and thereby improved the accuracy of the depth information. This allows the three-dimensional position of tomatoes obscured by branches and leaves to be calculated accurately, improving the efficiency of the harvesting robot. Field debugging with the harvesting robot showed that the proposed method recognizes and localizes tomatoes in complex greenhouse environments with high accuracy and reliability.
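To make the localization step in the abstract concrete, the following is a minimal Python sketch of the textbook pipeline it builds on: a standard census transform, a Hamming-distance matching cost with winner-takes-all disparity selection, and triangulation of a pixel into camera coordinates. It is not the optimized census algorithm described in the paper, whose details are not given here. It assumes a rectified grayscale stereo pair, and all function names, parameter names, and numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def census_transform(img, radius=3):
    """Standard census transform: one bit per neighbour, set when the
    neighbour is darker than the window centre.
    radius must be <= 3 so the code fits in 64 bits (7x7 window -> 48 bits)."""
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    codes = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # the centre pixel is not compared with itself
            neighbour = padded[radius + dy:radius + dy + h,
                               radius + dx:radius + dx + w]
            bit = (neighbour < img).astype(np.uint64)
            codes = (codes << np.uint64(1)) | bit
    return codes


def hamming_distance(a, b):
    """Per-pixel count of differing bits between two census code arrays."""
    x = np.bitwise_xor(a, b)
    count = np.zeros(x.shape, dtype=np.uint8)
    while np.any(x):
        count += (x & np.uint64(1)).astype(np.uint8)
        x = x >> np.uint64(1)
    return count


def disparity_wta(left, right, max_disp, radius=3):
    """Winner-takes-all disparity for a rectified pair, with no cost
    aggregation (an assumption made to keep the sketch short)."""
    cl = census_transform(left, radius)
    cr = census_transform(right, radius)
    h, w = left.shape
    # 255 marks positions where a disparity would fall outside the right image.
    cost = np.full((max_disp + 1, h, w), 255, dtype=np.uint8)
    for d in range(max_disp + 1):
        # Left pixel at column x is compared with right pixel at column x - d.
        cost[d, :, d:] = hamming_distance(cl[:, d:], cr[:, :w - d])
    return np.argmin(cost, axis=0)


def pixel_to_camera_xyz(u, v, disparity, focal_px, baseline_m, cx, cy):
    """Triangulate pixel (u, v) with the given disparity (in pixels) into
    camera coordinates (in metres) using Z = f * B / d."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = focal_px * baseline_m / disparity
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z
```

In a detector-plus-stereo setup of the kind the abstract describes, one plausible use is to take the centre of a detected bounding box as (u, v), read the disparity at that pixel, and pass both to pixel_to_camera_xyz together with the camera's calibrated focal length, baseline, and principal point. Because depth is inversely proportional to disparity, small disparity errors translate into larger depth errors at longer range, which is why reducing disparity error improves localization accuracy.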