
Moving beyond word error rate to evaluate automatic speech recognition in clinical samples: Lessons from research into schizophrenia-spectrum disorders

Paper Details

Published: 2025/08/25

Journal: Psychiatry Research

Volume: 352

Number: 116690

DOI: 10.1016/j.psychres.2025.116690

Natural language processing (NLP) applications in mental health research depend on automatic speech recognition (ASR) to study large samples and develop scalable clinical tools. To ensure safe and effective implementation, it is crucial to understand how ASR performs on speech from clinical populations. This study therefore evaluated ASR performance in N=50 speech samples from individuals with schizophrenia-spectrum disorders, identifying word error rates (WER) ranging from 0.31 to 0.58. WER varied systematically with country of birth and severity of positive symptoms. In subsequent NLP analysis, ASR transcripts showed significantly higher GloVe semantic similarity and fewer sentences than manual transcripts, as well as weaker correlations between NLP metrics and symptom scores. We considered the potential impact of these differences in three real-world use cases of ASR: electronic health records, voice chatbots, and clinical decision support systems. Overall, we argue that assessing ASR performance requires looking beyond WER alone. In clinical settings, the potential impact of an ASR error depends not only on its rate but also on its type, meaning, and context. Our approach shows how to evaluate ASR in clinical research and offers guidance for future researchers and developers on key considerations for its implementation.
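For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between the ASR output and a reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that standard definition, not the study's own evaluation pipeline; the function name `word_error_rate`, the lowercase-and-split tokenization, and the example sentences are assumptions made purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Illustrative sketch only: tokenization here is a plain lowercase
    whitespace split, which is an assumption, not the paper's method.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Invented example sentences, not taken from the study's data.
reference = "i do not hear voices"
minor_error = "i do not hear voice"      # one substitution, meaning intact
critical_error = "i do now hear voices"  # one substitution, meaning reversed

wer_minor = word_error_rate(reference, minor_error)
wer_critical = word_error_rate(reference, critical_error)
```

Both hypotheses contain a single substitution in a five-word reference, so both yield a WER of 0.2, even though only the second changes the clinical meaning of the sentence. This is the kind of difference in error type and meaning that WER alone cannot capture.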

Authors

Sandra Anna Just

Brita Elvevåg

Ivan Nenchev

Anna-Lena Bröcker

Christiane Montag