Abstract
Over the past decade, computer vision has advanced significantly,
driven primarily by deep learning-based algorithms for object
detection. However, these models often require large amounts of labeled
data, and their performance degrades on tasks with limited datasets,
particularly in scenarios involving moving objects.
For instance, real-time detection of humans in agricultural settings
poses challenges that demand sophisticated vision algorithms. To
address this issue, we propose SB-YOLO-V8, an optimized
YOLO-based Convolutional Neural Network (CNN) designed specifically for
real-time human detection in citrus farms. The proposed model is trained
using images and videos of human workers captured by autonomous farm
equipment. The preprocessing stage applies data augmentation
techniques and the Synthetic Minority Over-sampling Technique (SMOTE) to
enhance object detection performance and prevent overfitting. SB-YOLO-V8
incorporates Binary ALO optimization for improved feature extraction,
enabling high-quality data extraction for classification purposes. The
architecture comprises both the YOLO-based CNN and an aggregator module
for classification and feedback, respectively. Evaluation metrics,
including frames per second (FPS), model performance, and efficiency,
demonstrate that the proposed model outperforms YOLO variants such as
YOLO-V8, YOLO-V7, YOLO-V6, YOLO-V4 and YOLO-V3 with an average FPS of
13.63 and a precision of 91%. In effect, the proposed SB-YOLO-V8
presents an efficient solution for real-time human detection in
challenging visual scenarios.
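
As an illustration of the two headline metrics reported above, the following is a minimal sketch of how an average FPS and a precision value are commonly computed. It is not the authors' evaluation code: the function names, the `detect` callable, and the injectable `clock` parameter are hypothetical and introduced only for this example, and precision is assumed to follow the standard definition TP / (TP + FP).

```python
import time
from typing import Any, Callable, Iterable


def average_fps(
    detect: Callable[[Any], Any],
    frames: Iterable[Any],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return frames processed per second of wall-clock time.

    `detect` is a hypothetical stand-in for any detector's per-frame
    inference call; `clock` is injectable so the timing can be tested.
    """
    start = clock()
    count = 0
    for frame in frames:
        detect(frame)  # run inference on one frame; the result is ignored here
        count += 1
    elapsed = clock() - start
    if count == 0 or elapsed <= 0:
        return 0.0
    return count / elapsed


def precision(true_positives: int, false_positives: int) -> float:
    """Standard precision: TP / (TP + FP), or 0.0 when nothing was detected."""
    detections = true_positives + false_positives
    if detections == 0:
        return 0.0
    return true_positives / detections
```

Under these definitions, 91 correct detections out of 100 total detections would give a precision of 0.91. Whether a detection counts as correct depends on a matching criterion such as an IoU threshold, which the abstract does not specify.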