RT Journal Article
SR Electronic(1)
A1 Nuhu, Shahana
A1 Bedford, Trish
A1 Schneider, Jens
A1 Househ, Mowafa
YR 2022
T1 ESRA - Towards Explicability
JF QScience Connect
VO 2022
IS Issue 3 - Medical Humanities in the Middle East Conference
OP
SP 38
DO https://doi.org/10.5339/connect.2022.medhumconf.38
PB Hamad bin Khalifa University Press (HBKU Press)
SN 2223-506X
AB The Emotion Sensing Recognition App (ESRA) is a mobile app that uses AI to assess children’s drawings with respect to potential mental health issues. This can be an important tool for parents, since children often have difficulty articulating many aspects of their mental well-being. In this context, art therapy can help: an art therapist observes children while they are drawing, either guided (i.e., by requesting certain objects in the image) or unguided. The relative size of objects in the image, their placement, type, color scheme, etc., then provide valuable clues about the child’s well-being. Still, parents need to make an initial observation before contacting an art therapist. In this work, we explore how technology, in particular AI, can be used as a tool to assist parents in starting a dialogue with the child and, potentially, an art therapist. ESRA is one such technology: parents take a picture of their child’s drawing using their mobile phone, and ESRA provides a binary assessment (“good” or “bad”). However, in addition to providing only binary results, ESRA’s decision process is a black box that cannot be inspected. In contrast, our extended version uses joint localization and classification to find and identify objects in a child’s drawing. The goal is to provide parents with a verifiable assessment in plain English, summarizing their child’s drawing. To do so, we generated a data set of drawings by conducting a web scrape. We then cleaned and annotated the data set.
Under the guidance of the second author of this work, an experienced art therapist, we selected objects in the image and assigned a numerical score ranging from “good” to “bad”. We then used this data to train a “You Only Look Once” (YOLO) artificial neural network (Redmon & Farhadi, 2018). The resulting model predicts the location and score of the objects in an image. We use this information to highlight the objects (i.e., with a bounding box and a label) and summarize the model’s findings in plain English. We pay particular attention to capability awareness by converting numerical scores to qualitative categories, such as “We are quite certain that object #7 is a tree without leaves (90%), which could be mildly concerning”. Our tool considers YOLO’s confidence as well as the relative size and type of the object. We believe that this approach addresses an important aspect of the ethical use of AI. Deep neural networks are black-box oracles. Their decision-making process is not well understood, even by AI experts, let alone laypersons. Our approach therefore allows parents to make an informed decision by attempting to make the AI’s recommendation verifiable. While more studies are needed before our tool can be used at scale, we think that it can play an important role in initiating a dialogue between children, parents, and art therapists.
UL https://www.qscience.com/content/journals/10.5339/connect.2022.medhumconf.38
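The abstract's conversion of numerical detection scores into qualitative plain-English summaries could be sketched as follows. This is a minimal illustration only: the function names, the thresholds, and the 0-to-1 "good"-to-"bad" concern scale are all assumptions for the example, not the authors' actual implementation.

```python
# Hypothetical sketch of mapping a YOLO-style detection (confidence plus a
# numerical concern score) to a plain-English sentence like the one quoted
# in the abstract. All names and thresholds here are illustrative assumptions.

def confidence_phrase(conf: float) -> str:
    """Map a detection confidence in [0, 1] to a qualitative certainty phrase."""
    if conf >= 0.85:
        return "quite certain"
    if conf >= 0.6:
        return "fairly confident"
    return "unsure"

def concern_phrase(score: float) -> str:
    """Map an assumed 0 ("good") to 1 ("bad") concern score to a phrase."""
    if score >= 0.7:
        return "concerning"
    if score >= 0.4:
        return "mildly concerning"
    return "not concerning"

def summarize(obj_id: int, label: str, conf: float, concern: float) -> str:
    """Compose one plain-English sentence for a single detected object."""
    return (
        f"We are {confidence_phrase(conf)} that object #{obj_id} is a "
        f"{label} ({conf:.0%}), which could be {concern_phrase(concern)}."
    )

print(summarize(7, "tree without leaves", 0.90, 0.5))
# → We are quite certain that object #7 is a tree without leaves (90%),
#   which could be mildly concerning.
```

A fuller version would also factor in the object's relative size and type, as the abstract notes the tool does, e.g. by adjusting the concern score before it is verbalized.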