C-VIDNET: A MODEL FOR SUPPORTING VIOLENCE DETECTION IN SCHOOLS

Nguyen Viet Hung, Ta Cong Phi, Le Tan Loc, Ngo Quang Khanh, Tran Thanh Nha

Abstract

School violence is a complex and concerning issue in the education systems of many countries, including Vietnam. Although various automatic violence detection models based on artificial intelligence have been developed, deploying them in practice remains difficult because of their high complexity and computational cost. To address these limitations, this study proposes C-ViDNet (Campus Violence Detection Network), a model with a small number of parameters for automatic school violence detection, aimed at improving detection and enabling a rapid response to violent incidents in educational environments. First, YOLOX is used to detect the individuals appearing in each frame. Next, the poses of these individuals are extracted with HRNet and converted into 3D Heatmap Volumes, which reduces noise and removes unnecessary background elements. A dual-stream architecture then learns features from the 3D Heatmap Volumes: one stream focuses on the spatial features of the poses, while the other tracks changes in human poses across frames. The results obtained with C-ViDNet demonstrate the potential of this approach for developing automatic school violence detection models. The approach reduces dependence on manual monitoring and enables timely responses to violent incidents, contributing to safer educational environments.
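To illustrate the pose-to-volume step described in the abstract, the following is a minimal sketch of how 2D keypoints could be stacked into a 3D heatmap volume, in the spirit of skeleton-based action recognition with heatmaps. It is not the authors' implementation. The function names `keypoints_to_heatmap_volume` and `motion_volume`, the Gaussian width `sigma`, the array layouts, and the use of frame-to-frame differences as a stand-in for "changes in poses across frames" are all assumptions made for illustration. The keypoints are assumed to come from an upstream detector and pose estimator (YOLOX and HRNet in the abstract), which are not called here.

```python
import numpy as np


def keypoints_to_heatmap_volume(keypoints, scores, height, width, sigma=1.0):
    """Build a (K, T, H, W) heatmap volume from 2D keypoints.

    keypoints: array of shape (P, T, K, 2) with (x, y) pixel coordinates
               for P persons, T frames, K joints (assumed layout).
    scores:    array of shape (P, T, K) with keypoint confidences.
    Each joint is rendered as a Gaussian scaled by its confidence; when
    several persons are present, the per-pixel maximum is kept.
    """
    keypoints = np.asarray(keypoints, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    num_persons, num_frames, num_joints, _ = keypoints.shape

    ys = np.arange(height, dtype=np.float32)[:, None]  # column of row indices, (H, 1)
    xs = np.arange(width, dtype=np.float32)[None, :]   # row of column indices, (1, W)

    volume = np.zeros((num_joints, num_frames, height, width), dtype=np.float32)
    for p in range(num_persons):
        for t in range(num_frames):
            for k in range(num_joints):
                x, y = keypoints[p, t, k]
                gauss = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
                # Merge persons by keeping the strongest response per pixel.
                volume[k, t] = np.maximum(volume[k, t], gauss * scores[p, t, k])
    return volume


def motion_volume(volume):
    """Frame-to-frame difference along the time axis (assumed motion cue).

    Input shape (K, T, H, W), output shape (K, T - 1, H, W).
    """
    return np.diff(volume, axis=1)


if __name__ == "__main__":
    # One person, two frames, one joint that moves from (x=2, y=3) to (x=4, y=1).
    kps = np.array([[[[2.0, 3.0]], [[4.0, 1.0]]]])
    conf = np.ones((1, 2, 1))
    vol = keypoints_to_heatmap_volume(kps, conf, height=6, width=8)
    print(vol.shape)                 # (1, 2, 6, 8)
    print(motion_volume(vol).shape)  # (1, 1, 6, 8)
```

Because the volume contains only rendered joints, background pixels carry no signal, which is consistent with the noise-reduction motivation given in the abstract. In a dual-stream setup, the spatial stream could consume `vol` directly and the temporal stream could consume a motion cue such as `motion_volume(vol)`; how C-ViDNet actually feeds its two streams is not specified in the abstract.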
