Volume 2022 Number 3
  • EISSN: 2223-506X


The Emotion Sensing Recognition App (ESRA) is a mobile app that uses AI to assess children’s drawings with respect to potential mental health issues. This can be an important tool for parents, because children often have difficulty articulating many aspects of their mental well-being. In this context, art therapy can help: an art therapist observes children while they are drawing, either guided (i.e., by requesting certain objects in the image) or unguided. The relative size of objects in the image, their placement, type, and color scheme then provide valuable clues about the child’s well-being. Still, parents need to make an initial observation before contacting an art therapist.

In this work, we explore how technology, in particular AI, can be used as a tool to assist parents in starting a dialogue with the child and, potentially, an art therapist. ESRA is one such technology: parents take a picture of their child’s drawing using their mobile phone, and ESRA provides a binary assessment (“good”, “bad”). However, ESRA provides only binary results, and its decision process is a black box that cannot be inspected.

In contrast, our extended version uses joint localization and classification to find and identify objects in a child’s drawing. The goal is to provide parents with a verifiable assessment in plain English, summarizing their child’s drawing. To do so, we built a data set of drawings by scraping the web. We then cleaned and annotated the data set. Under the guidance of the second author of this work, an experienced art therapist, we selected objects in the images and assigned each a numerical score on a scale ranging from “good” to “bad”. We then used this data to train a “You Only Look Once” (YOLO) artificial neural network (Redmon & Farhadi, 2018). The resulting model predicts the location and score of the objects in an image. We use this information to highlight the objects (i.e., with a bounding box and the label) and summarize the model’s findings in plain English. We pay particular attention to capability awareness by converting numerical scores to qualitative categories such as “We are quite certain that object #7 is a tree without leaves (90%), which could be mildly concerning”. Our tool considers YOLO’s confidence as well as the relative size and type of the object.

We believe that this approach addresses an important aspect of the ethical use of AI. Deep neural networks are black box oracles. Their decision-making process is not well understood, even by AI experts, let alone laypersons. By attempting to make the AI’s recommendation verifiable, our approach allows parents to make an informed decision. While more studies are needed before using our tool at scale, we think that it can play an important role in initiating a dialogue between children, parents, and art therapists.
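The conversion from numerical scores to plain-English statements described above can be illustrated with a minimal sketch. It is not the authors’ implementation: the `Detection` fields, the thresholds, the category wording, and the way relative size is combined with the concern score are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object. All field names are hypothetical."""
    index: int         # object number shown next to the bounding box
    label: str         # e.g. "tree without leaves"
    confidence: float  # detector confidence in [0, 1]
    concern: float     # annotated score in [0, 1]; 0 = "good", 1 = "bad"
    rel_area: float    # bounding-box area divided by image area, in [0, 1]


def certainty_phrase(confidence: float) -> str:
    # Illustrative thresholds, not taken from the paper.
    if confidence >= 0.85:
        return "quite certain"
    if confidence >= 0.60:
        return "fairly confident"
    return "not sure"


def concern_phrase(concern: float, rel_area: float) -> str:
    # Assumption: larger objects weigh more heavily; result is capped at 1.0.
    weighted = min(1.0, concern * (0.5 + rel_area))
    if weighted < 0.2:
        return "which is not concerning"
    if weighted < 0.5:
        return "which could be mildly concerning"
    return "which could be concerning"


def summarize(d: Detection) -> str:
    """Turn one detection into a hedged plain-English sentence."""
    return (
        f"We are {certainty_phrase(d.confidence)} that object #{d.index} "
        f"is a {d.label} ({d.confidence:.0%}), "
        f"{concern_phrase(d.concern, d.rel_area)}."
    )


if __name__ == "__main__":
    print(summarize(Detection(7, "tree without leaves", 0.90, 0.4, 0.2)))
```

Keeping the raw percentage next to the verbal category lets a parent see both the wording and the number it was derived from, which is the kind of verifiability the abstract aims for.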




  1. Redmon, J., & Farhadi, A. (2018). YOLOv3: An Incremental Improvement. https://doi.org/10.48550/arXiv.1804.02767
  • Article Type: Research Article