BACKGROUND
Although effective mental health treatments exist, the ability to match individuals to optimal treatment options is poor, and timely assessment of response is difficult. One reason for these challenges is the lack of objective measurement of psychiatric symptoms and daily functioning. Sensors and active tasks enabled by smartphones provide a low-burden, low-cost, and scalable way to capture real-world patient data that are potentially clinically relevant and could thus augment clinical decision making, improve mental health outcomes, and move the field of mental health closer to measurement-based care.
OBJECTIVE
Our aim was to explore the feasibility of conducting a fully remote study on individuals with clinical depression using an Android-based smartphone app to collect subjective and objective measures that may be associated with severity of mood and mood-related symptoms. Goals of the pilot study were: (a) through user-centric design, develop an engaging user interface that would lead to high task adherence, (b) test the quality of collected data from passive sensors and adherence to active tasks (e.g., weekly PHQ-9), (c) start building clinically relevant behavioral measures (“features”) from passive sensors and active inputs, and (d) preliminarily explore connections between these features and depressive mood symptoms.
METHODS
A total of 600 participants were asked to download the study app to join this fully remote, observational, 12-week study. The app passively collected 20 sensor data streams (e.g., ambient audio level, location, inertial measurement units), and participants were asked to complete active tasks: daily mood and behavioral surveys, weekly voice diaries, and weekly PHQ-9 self-surveys as a validated measure of depression symptoms. Statistical analyses included: (a) univariate pairwise correlations between derived behavioral features (e.g., weekly minutes spent at home, pauses in voice diaries, average ambient audio volume level) and PHQ-9, and (b) an L1-penalized multivariate logistic regression model that used these behavioral features to predict depressed vs. non-depressed PHQ-9 scores (i.e., PHQ-9 dichotomized using 10 as a cutoff).
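As a minimal sketch of the two building blocks named above, the following Python shows a rank-based (Spearman) correlation between one behavioral feature and PHQ-9, and the dichotomization of PHQ-9 at the cutoff of 10. The function names, the example values, and the choice to treat a score of 10 or more as "depressed" are illustrative assumptions, not details taken from the study.

```python
from statistics import mean

PHQ9_CUTOFF = 10  # cutoff from the text; ">= 10 means depressed" is an assumption


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def is_depressed(phq9_total):
    """Dichotomize a PHQ-9 total score at the cutoff (assumed inclusive)."""
    return phq9_total >= PHQ9_CUTOFF


# Hypothetical participant-weeks (illustrative values, not study data).
minutes_at_home = [300, 420, 510, 640, 700, 900]
phq9_totals = [3, 5, 9, 12, 10, 18]

rho = spearman_rho(minutes_at_home, phq9_totals)
labels = [is_depressed(score) for score in phq9_totals]
```

In this sketch the binary labels would serve as the outcome for the logistic regression, while the raw PHQ-9 totals are used for the correlation analysis.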
RESULTS
A total of 415 individuals downloaded and logged into the app, with no reports of significant adverse events or unanticipated problems. Over the course of the 12-week study, these participants completed over 80% of the PHQ-9 assessments (the key clinical self-report outcome measure) and audio diaries. Applying data sufficiency rules for minimally necessary daily and weekly data resulted in 3,779 participant-weeks of data across 384 participants. On those data, using a subset of 34 behavioral features, we found that 12 features showed a significant (P ≤ 0.001, adjusted by the Benjamini-Hochberg procedure) Spearman correlation with weekly PHQ-9, including voice diary-derived word sentiment and ambient audio levels. Restricting the data to complete cases for the 34 behavioral features left 1,013 participant-weeks from 186 participants. The logistic regression model predicting depression status achieved a 10-fold cross-validated mean area under the curve (AUC) of 0.649.
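For readers unfamiliar with the two statistics reported above, the sketch below shows one common way to compute Benjamini-Hochberg adjusted P values and a rank-based AUC. It is a generic illustration under assumed function names and made-up inputs, not the study's analysis code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical inputs (illustrative values, not study data).
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)

true_labels = [False, False, True, True]
predicted_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auc(true_labels, predicted_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the cross-validated value of 0.649.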
CONCLUSIONS
This study provides strong proof of concept for smartphone-based assessment of depression outcomes. Behavioral features derived from passive sensors and active tasks show promising correlations with a validated clinical measure of depression (PHQ-9). Future work at larger scale is needed, which may permit derivation of more complex (e.g., non-linear) predictive models and better handling of missing data.