Do consumer sleep wearables actually capture how rested adolescents feel? A predictive analysis of 179 ABCD Study participants.
Consumer sleep wearables now sit on millions of adolescent wrists, marketed on the premise of objective, accurate sleep measurement. The empirical question is narrower and more difficult: do the metrics these devices report capture the dimension of sleep that matters most to people, the felt sense of being rested? My senior thesis tests this question against the most reliable everyday observers of adolescent sleep: the caregivers who live with these adolescents.
Data were drawn from the Adolescent Brain Cognitive Development (ABCD) Study, the largest longitudinal investigation of adolescent neurodevelopment in the United States. Each participant wore a Fitbit Charge 2 across the assessment window; for each participant, a caregiver completed a structured battery of sleep questions covering adequacy, restfulness, and disruption.
Five regression families were trained on 33 objective sleep parameters under 5-fold cross-validation, with predictive performance reported as R².
Bars left of zero indicate performance worse than predicting the sample mean.
The best-performing model explained less than 2% of the variance in caregiver-reported sleep adequacy — substantively similar to naively predicting the sample mean for every participant.
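To make the R² scale concrete, here is a minimal sketch of why a bar can fall left of zero. The ratings and predictions are hypothetical illustration values, not ABCD data, and `r_squared` is a helper written for this example rather than the thesis code.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out caregiver adequacy ratings (not real data).
ratings = [2.0, 3.0, 4.0, 3.0, 3.0]

# Baseline: predict the sample mean for every participant.
mean_baseline = [mean(ratings)] * len(ratings)

# A hypothetical model whose errors exceed the spread of the ratings.
noisy_model = [4.0, 2.0, 3.0, 4.0, 2.0]

print(r_squared(ratings, mean_baseline))  # 0.0
print(r_squared(ratings, noisy_model))    # -3.0
```

Predicting the mean scores exactly zero, so any model with larger squared error than that baseline scores below zero, and an R² under 0.02 sits barely above it.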
A discordance score was computed for each participant by subtracting the normalized caregiver adequacy rating from a standardized composite of objective sleep quality. Positive values indicate the wearable composite exceeded the caregiver's assessment.
In 143 of 179 participants, the device's composite landed above the caregiver's rating (mean discordance = 0.185, SD = 0.274): the wearable was systematically more optimistic about sleep than the person observing it.
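The discordance calculation can be sketched as follows. This is an illustration under stated assumptions: the text does not specify the normalization, so the sketch assumes the objective composite is already on a 0 to 1 scale and the caregiver rating is min-max rescaled from a hypothetical 1 to 5 scale. The function names and the participant values are invented for the example.

```python
def min_max(value, low, high):
    """Rescale a value from [low, high] onto [0, 1]."""
    return (value - low) / (high - low)


def discordance(objective_composite, caregiver_rating,
                rating_low=1, rating_high=5):
    """Objective composite minus normalized caregiver rating.

    Assumption: the composite is already on [0, 1]; the rating scale
    bounds (1 to 5) are hypothetical.
    Positive values mean the wearable was more optimistic.
    """
    return objective_composite - min_max(caregiver_rating,
                                         rating_low, rating_high)


# Hypothetical participants: (objective composite, caregiver rating).
participants = [(0.75, 3), (0.5, 4), (0.5, 2)]

scores = [discordance(obj, rating) for obj, rating in participants]
n_optimistic = sum(score > 0 for score in scores)

print(scores)        # [0.25, -0.25, 0.25]
print(n_optimistic)  # 2
```

Counting the positive scores in this way corresponds to the 143 of 179 tally reported above.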
For each Fitbit metric, I selected the single caregiver question it should correspond to: sleep efficiency with the caregiver's rating of efficiency, REM percentage with feeling refreshed, and so on across eight pairs. Running eight tests at once inflates false positives, so I applied the Benjamini-Hochberg (BH) correction; after it, none of the pairings remained significant. The one pair that reached the conventional significance threshold before correction pointed in the unexpected direction: adolescents who slept longer were rated as less rested, possibly because extra sleep compensates for poor sleep quality rather than indicating good sleep.
0 of 8 pairings significant after BH correction.
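For readers unfamiliar with the BH procedure, the following is a minimal sketch of the standard step-up rule. The eight p-values are hypothetical, chosen only to mirror the pattern described above (one below 0.05 before correction, none after). They are not the thesis results, and `benjamini_hochberg` is a helper written for this example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null is rejected.

    Step-up rule: sort p-values, find the largest rank k with
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for eight metric-question pairings.
p_values = [0.03, 0.20, 0.45, 0.08, 0.61, 0.33, 0.74, 0.12]

uncorrected = sum(p < 0.05 for p in p_values)
corrected = sum(benjamini_hochberg(p_values))

print(uncorrected)  # 1
print(corrected)    # 0
```

With eight tests, the smallest p-value must fall at or below 0.05 / 8 = 0.00625 unless larger p-values clear their own rank-scaled thresholds, which is why a single p = 0.03 does not survive.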
One predictor consistently explained when wearable and caregiver assessments diverged: social jetlag — the absolute difference between school-night and free-night sleep midpoints.
N = 179 · p = 0.0002 · adjusted for chronotype, demographics, and objective sleep quality
Adolescents with greater circadian misalignment received lower caregiver adequacy ratings even when their objective sleep architecture appeared intact — suggesting caregivers are sensitive to the rhythm and regularity of sleep, dimensions the wearable composite does not encode.
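The social jetlag definition given above (the absolute difference between school-night and free-night sleep midpoints) can be sketched as follows. The clock times are hypothetical and the function names are invented for this example. The sketch assumes one representative sleep episode per night type, whereas a real pipeline would presumably average midpoints over many nights.

```python
def sleep_midpoint(onset_hour, wake_hour):
    """Midpoint of a sleep episode in clock hours (0 to 24).

    Handles episodes that start before midnight and end after it.
    """
    duration = (wake_hour - onset_hour) % 24
    return (onset_hour + duration / 2) % 24


def social_jetlag(school_midpoint, free_midpoint):
    """Absolute midpoint difference in hours, measured the short way
    around the 24-hour clock."""
    diff = abs(free_midpoint - school_midpoint) % 24
    return min(diff, 24 - diff)


# Hypothetical adolescent: 22:30 to 06:30 on school nights,
# 00:30 to 10:30 on free nights.
school_mid = sleep_midpoint(22.5, 6.5)   # 2.5, i.e. 02:30
free_mid = sleep_midpoint(0.5, 10.5)     # 5.5, i.e. 05:30

print(social_jetlag(school_mid, free_mid))  # 3.0
```

In this example the free-night sleep is longer, yet the schedule has shifted by three hours; duration and architecture metrics alone would not register that shift.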
Subjective sleep adequacy, in this sample, is not reducible to physiological sleep architecture. It is jointly determined by sleep biology and the temporal context surrounding it — when a person sleeps, how regularly, how the week is shaped. Wearable devices instrument the first dimension well. They do not instrument the second.
The methodological implication is that consumer sleep scores answer a substantially narrower question than the one users typically ask when they consult them. The practical implication is that interventions built around consumer sleep technology, if they aim at felt sleep quality, will likely need to address circadian regularity and sleep timing alongside sleep architecture.