Debris flows occur frequently and are highly destructive, yet conventional monitoring methods struggle to track them because of limited real-time performance and high false-alarm rates. More efficient and precise intelligent detection techniques are therefore needed to strengthen early warning capabilities. To this end, this study proposes YOLOv8m-GCSlide, an enhanced model based on the YOLOv8 framework. The Global Context Network (GCNet) is integrated into the backbone to improve spatial dependency modeling of dynamic fluid boundaries in complex terrain, and a Sliding Loss function (SlideLoss) is designed to dynamically adjust classification thresholds and mitigate sample imbalance. Knowledge distillation is then applied to compress the model, yielding a lightweight variant (YOLOv8n-GCSlide) with reduced computational complexity. A multi-source video dataset was constructed from publicly available resources, with frames extracted at 0.25-second intervals to balance feature retention and training efficiency. Data augmentation (random cropping, rotation, scaling, Gaussian blur, and color jittering) was used to improve generalization, and negative samples (e.g., dry riverbeds and landslides) were added to reduce false positives. Experimental results show that the optimized model achieves 94.6% (+2.0%) detection accuracy, 88.0% recall, 95.9% mean Average Precision (mAP), and an inference speed of 244.1 FPS, outperforming mainstream lightweight models such as Swin Transformer and MobileNet variants. After compression, the number of model parameters was reduced by 88.1%, with the distilled version retaining 94.6% (+1.2%) accuracy and 88.0% (+0.7%) recall while maintaining an inference speed of 244.1 FPS.
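The abstract does not state the exact form of SlideLoss. As a rough illustration only, the sketch below follows the slide weighting function as it is commonly described in the YOLO-FaceV2 work: each sample's classification loss is scaled by a weight that depends on its IoU relative to the mean IoU `mu` of all boxes, so that hard samples near the threshold count more. The 0.1 margin, the function names, and the pairing with binary cross-entropy are assumptions for this sketch and may differ from the loss used in this study.

```python
import math


def slide_weight(iou: float, mu: float) -> float:
    """Slide weight for one sample (assumed formulation).

    iou: IoU between the predicted box and its ground-truth box.
    mu:  mean IoU over all boxes, used as the adaptive threshold.
    """
    if iou <= mu - 0.1:
        # Clearly below the threshold: weight left unchanged.
        return 1.0
    if iou < mu:
        # Just below the threshold: constant boost exp(1 - mu).
        return math.exp(1.0 - mu)
    # At or above the threshold: boost decays toward 1 as IoU approaches 1.
    return math.exp(1.0 - iou)


def slide_weighted_bce(prob: float, target: float, iou: float, mu: float) -> float:
    """Binary cross-entropy for one sample, scaled by the slide weight."""
    eps = 1e-7
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    bce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    return slide_weight(iou, mu) * bce
```

Because `mu` is recomputed from the current predictions, the boundary between weighted and unweighted samples moves during training, which is one way to read "dynamically adjust classification thresholds" in the text above.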
Field validation in Sedongpu Gully, a high-risk debris flow region, confirmed the model's practical applicability. Under complex environmental interference, the model achieved 82.3% recall, a 4.2% false positive rate, and a processing speed of 240.6 FPS. The integration of global attention mechanisms and task-specific loss functions effectively captures dynamic motion features and suppresses environmental noise. In addition, model compression balances accuracy against computational efficiency, enabling edge deployment for real-time disaster warnings. This approach provides a robust technical foundation for intelligent geological hazard monitoring systems, with an emphasis on high precision, low latency, and adaptability to resource-constrained scenarios.
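For readers who want to relate the field metrics to raw counts, the following minimal sketch computes recall, false positive rate, and per-frame latency. It assumes the standard definitions (recall = TP / (TP + FN), false positive rate = FP / (FP + TN)); the abstract does not say which definition of false positive rate was used, so this is an assumption. The function names are hypothetical, and the counts in the usage note are illustrative, not data from the study.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of true debris-flow events that were detected."""
    if tp + fn == 0:
        raise ValueError("no positive samples")
    return tp / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of non-event samples that raised a false alarm."""
    if fp + tn == 0:
        raise ValueError("no negative samples")
    return fp / (fp + tn)


def latency_ms(fps: float) -> float:
    """Average per-frame processing time implied by a throughput in FPS."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps
```

With illustrative counts, `recall(8, 2)` gives 0.8 and `false_positive_rate(1, 19)` gives 0.05. Applied to the reported field throughput, `latency_ms(240.6)` is roughly 4.2 ms per frame.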