
Violence Detection Between Children and Caregivers Using Computer Vision

Challenge Completed!


A team of 50 Omdena AI changemakers collaborated with the Israel-based startup EyeKnow AI to build a deep learning computer vision model for violence detection. The model can help detect violent behaviour by caregivers towards children and, in the future, help prevent it.

The problem

Child maltreatment is a substantial public health concern. Estimates based on Child Protective Services (CPS) reports from the National Child Abuse and Neglect Data System (NCANDS) suggest that 678,810 youth were subjected to maltreatment in 2012, with 18% of these experiencing physical abuse. Additionally, a large proportion of cases go undetected by CPS, suggesting that more youth are likely subjected to abusive or neglectful behavior. Most seriously, maltreatment was responsible for an estimated 1,640 youth fatalities in 2012.

The project outcomes 

The data

The team worked with two datasets. The first is a caregiver-to-senior violence dataset of 500 clips sourced entirely from YouTube. The second comprises 500 clips of caregiver-to-child aggression or violence, drawn from YouTube and from unique data obtained through partnerships with EyeKnow’s partners.

The machine learning models

The challenge contributors defined several approaches to building a model that detects violent interactions, or any other relevant interaction, between the entities (caregivers, elderly, children). The first step was to detect the entities themselves, which the team did using object detection.

The team applied frame-level entity annotation to label the caregivers, children, and elderly. After this step, the collaborators trained an object detection model and implemented an ML pipeline. This pipeline ingests video recordings from CCTV or other sources and outputs, for each frame, the number and type of entities present. In addition, the pipeline includes a bounding box-based overlap analysis that flags frames potentially containing high-intensity (possibly violent) interaction.
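The per-frame output and overlap flag described above can be illustrated with a minimal sketch. The page does not say how overlap was measured, so this sketch assumes intersection-over-union (IoU) between a caregiver box and a child or elderly box. The names `Detection` and `summarize_frame`, the label strings, and the 0.3 threshold are hypothetical and not taken from the project.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str   # hypothetical labels: "caregiver", "child", "elderly"
    box: tuple   # (x_min, y_min, x_max, y_max) in pixels


def iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def summarize_frame(detections, iou_threshold=0.3):
    """Count entities per type and flag strong caregiver/dependent overlap.

    The threshold is an assumed value for illustration only.
    """
    counts = Counter(d.label for d in detections)
    caregivers = [d for d in detections if d.label == "caregiver"]
    dependents = [d for d in detections if d.label in ("child", "elderly")]
    max_overlap = max(
        (iou(c.box, d.box) for c in caregivers for d in dependents),
        default=0.0,
    )
    return {
        "counts": dict(counts),
        "max_overlap": max_overlap,
        "flagged": max_overlap >= iou_threshold,
    }
```

A flagged frame only indicates that two entities are very close or overlapping in the image. It is a candidate for further review, not a violence verdict on its own.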

Alongside this pipeline, the team applied video classification using deep neural networks. This approach combined pre-trained models for feature extraction with sequence modeling to capture temporal relationships between frames.
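As a rough sketch of this kind of architecture, the PyTorch module below applies a frame-level feature extractor to every frame and feeds the resulting sequence to an LSTM. The page does not name the backbone or the sequence model, so everything here is an assumption. In particular, the small convolutional `backbone` is only a stand-in for a real pre-trained network, and the class name, dimensions, and two-class output are hypothetical.

```python
import torch
from torch import nn


class ClipClassifier(nn.Module):
    """Per-frame feature extractor followed by an LSTM over the frame sequence."""

    def __init__(self, feature_dim=64, hidden_dim=32, num_classes=2):
        super().__init__()
        # Stand-in for a pre-trained backbone (assumption): a real setup would
        # load pre-trained weights here instead of this tiny random network.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, feature_dim),
        )
        # Freeze the extractor so that only the sequence model and head train.
        for param in self.backbone.parameters():
            param.requires_grad = False
        self.sequence_model = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, clips):
        # clips: (batch, frames, channels, height, width)
        batch, frames = clips.shape[:2]
        features = self.backbone(clips.flatten(0, 1))  # (batch * frames, feature_dim)
        features = features.reshape(batch, frames, -1)
        _, (last_hidden, _) = self.sequence_model(features)
        return self.head(last_hidden[-1])              # (batch, num_classes)
```

Freezing the extractor keeps the number of trainable parameters small, which tends to matter when only a few hundred labeled clips are available.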

All the developed models and approaches run in a Python application. The application is highly modular and serves multiple purposes. By modifying a configuration file (parameters JSON file), the user can execute training of component models or manage inference and process video files.
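A configuration-driven entry point of this kind might look like the sketch below. The actual parameter names of the project's JSON file are not given on this page, so the keys `mode`, `params`, `model`, and `video_path`, as well as the function names, are hypothetical placeholders.

```python
import json


def run_training(params):
    # Placeholder for training one component model (hypothetical).
    return f"training {params['model']}"


def run_inference(params):
    # Placeholder for processing one video file (hypothetical).
    return f"processing {params['video_path']}"


# Map each supported mode in the config file to the function that handles it.
MODES = {"train": run_training, "inference": run_inference}


def run_from_config(config_text):
    """Parse the JSON parameters and dispatch to the requested mode."""
    config = json.loads(config_text)
    mode = config.get("mode")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return MODES[mode](config.get("params", {}))
```

With this layout, switching from training to inference means editing the parameters file rather than the code, for example by changing `"mode": "train"` to `"mode": "inference"` and supplying a video path.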

First Omdena Project?

Join the Omdena community to make a real-world impact and develop your career

Build a global network and get mentoring support

Earn money through paid gigs and access many more opportunities

Your Benefits

Address a significant real-world problem with your skills

Get hired at top companies by building your Omdena project portfolio (via certificates, references, etc.)

Access paid projects, speaking gigs, and writing opportunities


Good English

A very good grasp of computer science and/or mathematics

(Senior) ML engineer, data engineer, or domain expert (no need for AI expertise)

Programming experience with Python

Understanding of OCR, Deep Learning, and Computer Vision.

