The Crowd Analysis tool aims to recognise violent scenes in crowd-centred visual data, especially video files. The results of this tool are depicted below and show (i) the “Prediction” value and (ii) the bounding box, both of which are coloured along a colour bar that runs gradually from green to red, where red indicates a scene predicted as violent with a confidence score of 100%.
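The green-to-red colouring described above can be illustrated with a minimal sketch. It assumes a plain linear blend between pure green and pure red; the tool's actual colour bar is not specified here, and the function name `confidence_to_rgb` is hypothetical.

```python
from typing import Tuple


def confidence_to_rgb(probability: float) -> Tuple[int, int, int]:
    """Map a violence probability in [0, 1] to an (R, G, B) colour.

    Assumption: a simple linear blend from green (0.0) to red (1.0).
    The real tool may use a different colour map.
    """
    # Clamp so that out-of-range inputs still give a valid colour.
    p = min(max(probability, 0.0), 1.0)
    red = round(255 * p)
    green = round(255 * (1.0 - p))
    return (red, green, 0)


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.75, 1.0):
        print(p, confidence_to_rgb(p))
```

With this mapping, a prediction of 0.0 is drawn in pure green and a prediction of 1.0 (100% confidence of violence) in pure red.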
This Crowd Analysis tool builds on our recent research work (Gkountakos et al., 2020). Its model architecture is based on deep learning and, in particular, on a 3D Convolutional Neural Network (CNN), a ResNet with 50 layers (Hara et al., 2018), that has been trained on video footage that includes crowd violence scenes (Hassner et al., 2012). For the analysis, 16 video frames are processed in each step, and the tool assigns them a probability that indicates the level of violence in the crowd. Users can try the tool out in a local installation with their own videos, i.e., without transferring their videos over the Internet to a remote server.
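The 16-frame processing step can be sketched as follows. This is only an illustration under stated assumptions: it uses non-overlapping clips and drops a trailing incomplete clip, neither of which is confirmed for the actual tool, and `iter_clips` is a hypothetical helper name. Each yielded clip would then be passed to the model to obtain one violence probability.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

CLIP_LENGTH = 16  # frames per analysis step, as stated in the text


def iter_clips(frames: Sequence[T], clip_length: int = CLIP_LENGTH) -> Iterator[List[T]]:
    """Yield consecutive, non-overlapping clips of `clip_length` frames.

    Assumption: clips do not overlap and a trailing group with fewer
    than `clip_length` frames is skipped.
    """
    last_start = len(frames) - clip_length
    for start in range(0, last_start + 1, clip_length):
        yield list(frames[start:start + clip_length])


if __name__ == "__main__":
    # Stand-in for decoded video frames: 40 frame indices.
    dummy_frames = list(range(40))
    for clip in iter_clips(dummy_frames):
        print(f"clip covering frames {clip[0]}..{clip[-1]}")
```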
Moreover, the tool can also analyse visual streams in (near) real time, depending on the available hardware (i.e., CPU cores, GPU support, etc.); if the available hardware does not include a GPU, (near) real-time processing is very difficult to achieve. Video streams can be provided locally or over the Internet, but because of the risks of transferring videos over the Internet, this version of the tool is restricted to processing standalone local video files. Please contact us for further information if you are interested in the (near) real-time analysis of video streams.
These are the basic requirements for installing this tool on a PC:
- Supported OS: Ubuntu/Windows
- Processor: Any CPU with 4 cores or more
- RAM: 8 gigabytes (GB) or more
- Hard drive size: 30 GB of free space
- Graphics card: NVIDIA GPU with 8 GB of memory or more for (near) real-time processing
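As a rough illustration, the requirements listed above can be expressed as a small check. The thresholds come from the list; the function `unmet_requirements` and its parameters are hypothetical, and the caller is assumed to supply the machine's figures (this sketch does not query the hardware itself). The GPU is only checked when (near) real-time processing is wanted.

```python
from typing import List, Optional

# Thresholds taken from the requirements list.
MIN_CPU_CORES = 4
MIN_RAM_GB = 8
MIN_FREE_DISK_GB = 30
MIN_GPU_MEMORY_GB = 8  # only relevant for (near) real-time processing


def unmet_requirements(
    cpu_cores: int,
    ram_gb: float,
    free_disk_gb: float,
    gpu_memory_gb: Optional[float] = None,
    need_real_time: bool = False,
) -> List[str]:
    """Return a list of human-readable problems; empty if all checks pass."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU: {cpu_cores} cores, need at least {MIN_CPU_CORES}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb} GB, need at least {MIN_RAM_GB} GB")
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(f"Disk: {free_disk_gb} GB free, need at least {MIN_FREE_DISK_GB} GB")
    if need_real_time and (gpu_memory_gb is None or gpu_memory_gb < MIN_GPU_MEMORY_GB):
        problems.append(f"GPU: need an NVIDIA GPU with at least {MIN_GPU_MEMORY_GB} GB of memory")
    return problems


if __name__ == "__main__":
    # Example figures for a hypothetical laptop without a dedicated GPU.
    print(unmet_requirements(cpu_cores=8, ram_gb=16, free_disk_gb=120))
    print(unmet_requirements(cpu_cores=8, ram_gb=16, free_disk_gb=120, need_real_time=True))
```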
If you are interested in downloading and testing the software, feel free to contact us.
- Gkountakos, K., Ioannidis, K., Tsikrika, T., Vrochidis, S., & Kompatsiaris, I. (2020, June). A Crowd Analysis Framework for Detecting Violence Scenes. In Proceedings of the 2020 International Conference on Multimedia Retrieval (pp. 276-280).
- Hara, K., Kataoka, H., & Satoh, Y. (2018). Can spatiotemporal 3D CNNs retrace the history of 2D CNNs and ImageNet? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 6546-6555).
- Hassner, T., Itcher, Y., & Kliper-Gross, O. (2012, June). Violent flows: Real-time detection of violent crowd behavior. In 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (pp. 1-6). IEEE.