Abstract: Debris flows occur frequently and are highly destructive, yet early warning is constrained by the
poor real-time performance and high false alarm rates of conventional monitoring methods. More efficient and
precise intelligent detection techniques are therefore urgently needed. To address these limitations, an
enhanced YOLOv8m-GCSlide model is proposed based on the YOLOv8 framework. The
global context attention module (GCNet) is integrated into the backbone network to strengthen spatial
dependency modeling of dynamic fluid boundaries in complex terrains, while a sliding loss function (SlideLoss)
is designed to dynamically adjust classification thresholds and mitigate sample imbalance. Knowledge distillation
is further applied to compress the model, resulting in a lightweight variant (YOLOv8n-GCSlide) with reduced
computational complexity. A multi-source video dataset is constructed using publicly available resources, where
frames are extracted at 0.25-second intervals to balance feature retention and training efficiency. Data
augmentation techniques, including random cropping, rotation, scaling, Gaussian blur, and color jittering, are
employed to enhance generalization, supplemented by negative samples (e.g., dry riverbeds, landslides) to reduce
false positives. Experimental results demonstrate that the optimized model achieves 94.6% (+2.0%) detection
accuracy, 88.0% recall, 95.9% mean average precision, and an inference speed of 244.1 FPS, outperforming
mainstream lightweight models such as Swin Transformer and MobileNet variants. After compression, the model
parameters are reduced by 88.1%, with a distilled version maintaining 94.6% (+1.2%) accuracy and 88.0%
(+0.7%) recall while achieving 244.1 FPS. Field validation in Sedongpu Gully, a high-risk debris flow region,
confirms practical applicability: under complex environmental interference, the model attains 82.3% recall, a 4.2%
false positive rate, and a processing speed of 240.6 FPS. The integration of global attention mechanisms and task-specific
loss functions is shown to effectively capture dynamic motion features and suppress environmental noise.
Additionally, model compression techniques ensure a balance between accuracy and computational efficiency,
enabling edge deployment for real-time disaster warnings. This approach provides a robust technical foundation
for intelligent geological hazard monitoring systems, emphasizing high precision, low latency, and adaptability to
resource-constrained scenarios.
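
The 0.25-second frame extraction step described above can be illustrated with a minimal sketch. It only computes which frame indices would be kept for a clip of known length and frame rate; the function name `sample_frame_indices` and its parameters are hypothetical and are not taken from the authors' pipeline, which may differ in how it rounds timestamps or decodes video.

```python
from typing import List


def sample_frame_indices(total_frames: int, fps: float,
                         interval_s: float = 0.25) -> List[int]:
    """Return indices of the frames nearest to 0, interval_s, 2*interval_s, ... seconds.

    Hypothetical helper (an assumption for illustration): the source states only
    that frames are extracted at 0.25-second intervals.
    """
    if total_frames <= 0 or fps <= 0 or interval_s <= 0:
        raise ValueError("total_frames, fps and interval_s must be positive")

    indices: List[int] = []
    k = 0
    while True:
        # Timestamp k * interval_s mapped to a frame index, rounding half up.
        idx = int(k * interval_s * fps + 0.5)
        if idx >= total_frames:
            break
        # Skip duplicates that appear when the interval is shorter than one frame.
        if not indices or idx != indices[-1]:
            indices.append(idx)
        k += 1
    return indices


if __name__ == "__main__":
    # A 2-second clip at 30 FPS keeps 8 of its 60 frames.
    print(sample_frame_indices(total_frames=60, fps=30.0))
```

At 30 FPS this keeps roughly one frame in every 7.5, which is the kind of trade-off between feature retention and training set size that the sampling interval controls.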