the little oblivion — on taking off the sleep ring

The other afternoon I fell asleep for about fourteen minutes. I know it was about fourteen minutes because I looked at the clock before and after, and because there is a soft, blank little hole in my memory where roughly ten of those minutes used to be — a nice little oblivion. My ring, meanwhile, had a different and very confident opinion about what my body had just been up to. We disagreed. After a few months of this, I have started to notice that when the ring and I disagree, I am usually right, and that it takes the ring a surprising amount of silicon to be wrong.

This is a piece about a small, beautifully made titanium ring that reads my pulse while I sleep; about what such a device can genuinely measure and what it merely guesses and prints in a confident font; and about the slightly larger question of whether outsourcing the question “how do you feel?” to a battery on your finger is good for a person. I built a small lab to find out. I read the validation papers. I logged what actually happened against what the ring claimed happened. Some of what I found is genuinely impressive. Some of it is the device telling me, in effect, that a joyful walk was a panic attack. Spoiler, gently telegraphed by the title: the ring is coming off. But I want to earn that ending rather than just assert it, so let us start with the physics.

1. What the ring actually is

Strip away the app, the scores, the gentle push-notification nudges, and the ring is three sensors. It shines green and infrared light into the blood vessels of your finger and watches how the reflection flickers with each heartbeat (photoplethysmography). It has a little thermometer against your skin. And it has an accelerometer that feels motion. That is the whole instrument: light bouncing off blood, skin temperature, and movement. Everything else — sleep stages, “readiness,” “daytime stress,” the dignified prose in the morning — is inferred from those three channels by software.

Crucially, there is no electroencephalogram. The instruments that actually define sleep — electrodes reading the brain’s electrical activity and the eyes’ movements — are not present and cannot be present on a finger. So when the ring says “you got 1 h 12 m of deep sleep,” it has not observed deep sleep. It has observed a low, flat heart rate and a still hand, and made an educated guess. Often a good guess! But a guess, downstream of physics, dressed as a measurement.

It is worth saying plainly, because the apps work hard to obscure it, that even “raw” ring data is already heavily cooked. The light sensor samples hundreds of times a second; none of that survives. What you can actually retrieve is a heart rate roughly every five minutes, one HRV number for the whole night, a daily temperature deviation, and motion sorted into a handful of classes per half-minute. The trustworthiness of a number depends entirely on where it sits in this chain:

measured   heart rate, skin temperature, motion        — close to fact
inferred   sleep stages, breathing rate, “stress”     — educated guesses
scored     readiness, sleep score, “engagement”     — opinions in a trench coat

Keep that ladder in mind. Almost every quarrel I had with the ring was really a quarrel about a number from the bottom two rungs being presented with the swagger of the top one.

2. Sleep it cannot see

Return to my fourteen-minute oblivion. To a device whose only witnesses are heart rate and motion, “asleep” and “lying still, awake, relaxed” are the same signal: not much movement, a lowish pulse. There is no channel that distinguishes them, because the channel that would — the brain — is not wired up. This is not an Oura flaw; it is a fact about fingers.

You can watch it fail. Over a few weeks I logged what I was really doing whenever the ring announced a nap. The pattern was almost comically consistent: the ring’s nap detector is driven by stillness, with heart rate barely a tiebreaker.

What I was actually doing	Truth	Pulse (bpm)	Ring’s verdict
Talking through a whole film on the sofa	wide awake	~57	“nap”
In the cinema, before the lights went down	wide awake	~67	“nap”
Lying still, trying and failing to fall asleep	awake, frustrated	low	“nap”
Actually asleep, the oblivion afternoon	asleep	~63	“nap” (correct, by luck)

The cinema row is the tell. My pulse there was about 67, not low at all, and the house lights were still up, yet a motionless hand was enough to trip “nap.” Stillness, not sleep, is what the ring is really detecting. So its reliable failure mode is any situation that is calm and still but emphatically awake: cinemas, theatres, meditation, a long dull meeting, lying in bed losing the fight with insomnia. The ring files them all under sleep.
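The failure mode is easy to reproduce with a toy rule. The sketch below is not the ring's algorithm, which is undisclosed; it is a hypothetical stillness-first detector in which pulse acts only as a loose ceiling. The function name, the thresholds, and the `still_minutes` values are all invented for illustration; only the pulse figures come from the table above.

```python
from dataclasses import dataclass

@dataclass
class Episode:
    label: str          # what was actually happening
    still_minutes: int  # minutes of near-zero hand motion (invented values)
    pulse_bpm: int      # approximate pulse, taken from the table above
    asleep: bool        # ground truth from the diary

def toy_nap_verdict(ep: Episode, min_still: int = 10, pulse_ceiling: int = 75) -> bool:
    """Hypothetical rule: enough stillness means 'nap' unless pulse is clearly high."""
    return ep.still_minutes >= min_still and ep.pulse_bpm <= pulse_ceiling

episodes = [
    Episode("film on the sofa", 90, 57, asleep=False),
    Episode("cinema, lights still up", 20, 67, asleep=False),
    Episode("oblivion afternoon", 14, 63, asleep=True),
]

# Episodes the toy rule calls a nap although the diary says awake.
false_positives = [ep.label for ep in episodes
                   if toy_nap_verdict(ep) and not ep.asleep]
print(false_positives)
```

Nothing in the rule's inputs separates the sofa from the oblivion afternoon, so any threshold that catches the real nap also catches the film.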

And the stages inside that sleep? The best independent validation study, which put the ring against laboratory polysomnography across some four hundred thousand thirty-second epochs, found that the ring’s “light sleep” bucket quietly absorbs about a third of what the lab scored as deep sleep and about a quarter of what it scored as REM.¹ Single-night stage percentages are off by enough that the tidy pie chart in the app is, at the resolution people actually read it, decorative. It looks precise. Precision and accuracy are not the same virtue.

3. Stress it cannot read — and a joyful walk filed as panic

Here is where the ring stops being merely limited and starts being a small daily gaslighter. The “daytime stress” metric measures the magnitude of sympathetic arousal — heart rate above your personal baseline, low motion, a warm skin — and then labels it stress. But your heart does not know why it is beating faster. Digestion, a flight of stairs, a strong coffee, the tail of a workout, a thrilling idea, and genuine dread all raise the same pulse. The ring sees the elevation and guesses the cause, and its guess is reliably “stress.” Across my logged days it ran, on average, about twenty minutes a day higher on “stress” than a careful reading of my own raw heart-rate trace could justify.

The funniest case — funny in the way that is also a little insulting — was a brisk, delighted walk through a crowded book fair. Pulse up around 120; mood, in my own contemporaneous notes, “tons of good adrenaline.” Unambiguous, knowable, positive excitement. The ring logged the afternoon as roughly two hours of high stress. A genuinely joyful hour, recorded as suffering, because the only thing the sensor can see — elevation — is exactly the thing that excitement and distress have in common.

There’s a final irony built into the metric. The one moment on that walk I would genuinely have called stressful was excluded from the stress count, because my heart and my legs were both working, so the algorithm filed it as exercise. The metric misses the real stress and invents fake stress, in the same afternoon, by construction.
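The double failure described above follows from the shape of the rule rather than from any tuning. Here is a minimal sketch, assuming a labeller that sees only heart rate and a moving/not-moving flag; the name `toy_label`, the baseline of 65 bpm, and the 15 bpm margin are hypothetical, and the real metric is not public.

```python
def toy_label(hr_bpm: int, moving: bool,
              baseline_bpm: int = 65, margin_bpm: int = 15) -> str:
    """Hypothetical labeller: it sees elevation and motion, never mood."""
    elevated = hr_bpm > baseline_bpm + margin_bpm
    if not elevated:
        return "calm"
    # Elevated pulse with motion is treated as exercise and kept out of the stress count.
    return "exercise" if moving else "stress"

# (pulse, moving, what the diary said); the diary column never reaches the rule.
minutes = [
    (120, False, "delighted, browsing a stall"),
    (120, False, "genuine dread"),
    (125, True,  "actually stressed, walking fast"),
]
labels = [toy_label(hr, moving) for hr, moving, _mood in minutes]
print(labels)
```

The first two minutes are indistinguishable to the rule, and the third, the only one the diary calls stressful, is filed as exercise.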

4. Trends it invents

The morning paragraph — the friendly auto-written coaching note — is the bottom rung of the ladder, and it shows. One morning mine informed me, gravely, that my HRV had “reduced steadily over the past two weeks, from around 20.7 to about 16.9,” and suggested I lighten my training. Two problems. First, I don’t train; I walk. Second, and worse, the trend was not real. Here are the actual sixteen nightly numbers it was describing:

Figure: sixteen nights of nightly HRV (solid line). Noisy, flat, median around 18–19, already recovered from a dip in the middle. The dashed line is the “steady decline from 20.7 to 16.9” the ring narrated: a lagging seven-day average chasing a dip I’d already climbed out of, drawn as if it were the present. A rear-view mirror, reported as the windscreen.
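The rear-view-mirror effect can be reproduced with a few lines. The sixteen values below are synthetic, chosen only to have a mid-series dip and a recovery; they are not the nightly numbers from the essay. The seven-night trailing window matches the text, and `trailing_mean` is a name introduced here.

```python
from statistics import mean

# Synthetic nightly HRV (ms): steady, a dip in the middle, then recovery.
nightly = [19, 20, 21, 22, 21, 20, 21, 19, 15, 14, 15, 16, 18, 19, 18, 19]

def trailing_mean(values, window=7):
    """Mean of the last `window` values, one result per night once the window is full."""
    return [mean(values[i - window + 1 : i + 1])
            for i in range(window - 1, len(values))]

smoothed = trailing_mean(nightly)
narrated_drop = smoothed[0] - smoothed[-1]   # what a "from X to Y" sentence reports
recent_raw = mean(nightly[-3:])              # what the last three nights actually show

print(round(smoothed[0], 1), round(smoothed[-1], 1), round(recent_raw, 1))
```

The smoothed series still carries the dip on the final night even though the raw values have already returned to their usual level, which is enough for a narrated "steady decline".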

The smoother the ring draws, the more it manufactures direction out of noise. And this is the general disease of the category. An independent review of fourteen of these “readiness / recovery / strain” scores across ten brands found not one with a disclosed algorithm or any external validation against a real outcome — and a structural tendency to double-count: a poor night lowers both your sleep score and your HRV, which are causally the same event, so the composite gets penalised twice and the number lurches harder than reality.² A single confident integer built from secretly redundant parts. It is astrology with a firmware update.

5. In fairness: what it does beautifully

I promised to earn the ending, and that means admitting the ring is, at its actual job, excellent. The nocturnal measurements — the ones near the top of the ladder — are real physics, measured well, and they catch things my own waking self cheerfully lies about. The best example is alcohol.

After two glasses of wine I slept, by my own morning report, “pretty well.” My average HRV for the night agreed: a perfectly normal 20-ish, indistinguishable from a clean night’s 21. But the average is a liar of omission. Plot the HRV across the night and the wine is written all over it:

Figure: wine night vs a clean night, HRV (rMSSD) across the hours after I fell asleep. Near-identical nightly averages (20 vs 21 ms); utterly different shapes. The wine night sits suppressed for the first five hours while my body works through the alcohol, then rebounds hard and late as it finally clears. The mean averages those two halves into a shrug.

The shape diagnoses it. Confirm with a control: a single light beer, finished early enough to clear before bed, leaves no fingerprint at all. The night tracks the clean one from start to finish:

Figure: a one-beer night that had cleared by bedtime vs the same clean reference. No early suppression, no late rebound; the two nights are the same night. The ring isn’t reacting to the idea of a drink; it’s reacting to alcohol still in the blood at lights-out.

I pre-registered that beer prediction before looking, which is the part that turns a nice plot into evidence. There is even a clean rule hiding in it: the within-night HRV trajectory is the smoke alarm — something perturbed the first half of the night — and the shape of the heart rate, plus a one-line log of what I did, is the diagnosis. Alcohol still clearing makes the pulse peak in the middle of the night; a late workout makes it run hot early and fade; a short, late night just runs hot throughout. (That last one is why I am not showing it off as a third chart: early-night HRV suppression is real but non-specific, and honesty about a signal includes honesty about what it can’t tell apart.)
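The "smoke alarm" half of that rule can be sketched as a comparison of early-night and late-night HRV. The hourly values below are synthetic, built so the two nights have nightly means of 20 and 21 ms like the ones described; the five-hour split and the name `late_minus_early` are assumptions made for this illustration.

```python
from statistics import mean

# Synthetic hourly rMSSD (ms) for the eight hours after sleep onset.
clean_night = [21, 21, 20, 22, 21, 21, 20, 22]
wine_night  = [14, 14, 15, 15, 16, 28, 29, 29]

def late_minus_early(hourly, split_hour=5):
    """Mean HRV after the split minus mean HRV before it; large values mean a late rebound."""
    return mean(hourly[split_hour:]) - mean(hourly[:split_hour])

for name, night in [("clean", clean_night), ("wine", wine_night)]:
    print(name, mean(night), round(late_minus_early(night), 1))
```

The nightly means differ by one millisecond while the early-to-late gap differs by more than ten. As the text notes, a large gap only says that something perturbed the first half of the night; it does not say what.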

This day-to-day swing in nightly HRV is itself a validated, behaviour-sensitive measure — its coefficient of variation over a week,

\( \mathrm{HRV\text{-}CV} \;=\; 100 \times \dfrac{\mathrm{SD}\,(\text{nightly rMSSD})}{\overline{\text{rMSSD}}}\,, \qquad \text{reliable only with } \ge 5 \text{ nights in the window} \)

and in a study of some twenty thousand people each extra daily drink pushes it up by about two and a half points.³ So the ring really can read my drinking off my sleeping heart. Which is exactly why the limitations sting: the instrument is good enough that its overconfident guesses borrow credibility from its excellent measurements. The same little screen shows you a real alcohol signature and a fictional stress trend in the same calm typeface, and never tells you which is which.
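The HRV-CV formula above translates directly into code. This is a minimal sketch: the use of the sample standard deviation and the handling of missing nights as `None` are assumptions, since the text specifies only "SD" and the five-night floor.

```python
from statistics import mean, stdev

def hrv_cv(nightly_rmssd, min_nights=5):
    """Coefficient of variation (%) of nightly rMSSD, or None below the reliability floor."""
    nights = [x for x in nightly_rmssd if x is not None]  # skip nights with no reading
    if len(nights) < min_nights:
        return None
    return 100 * stdev(nights) / mean(nights)

print(hrv_cv([18, 20, 22, 20, 20]))          # a week with five usable nights
print(hrv_cv([20, 21, None, 19]))            # too few nights: no number at all
```

Returning `None` instead of a number for short windows keeps an unreliable value from being plotted in the same font as a reliable one.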

6. The mirror: my own code had the same bug

Before this turns into a tidy morality tale about a corporation’s dishonest software and my honest skepticism, I owe you a confession. I wrote my own little nap detector — the supposedly trustworthy, first-principles one — and it had a bug that was a perfect mirror of the ring’s worst habit.

My detector looked for long, quiet, low-flat stretches of heart rate. The oblivion nap was exactly that — and my code missed it completely, reporting an unrelated quiet block from hours earlier instead. The reason was almost poetic. The nap ended at a gap in the data (the ring dropped a sample, then another, for seventeen minutes), and my function only “closed and scored” a quiet stretch when the next sample disqualified it. A stretch that ended at a dropout was silently thrown away. So:

A sensor dropout makes the ring hallucinate a nap; the same dropout made my detector delete a real one. Same blind spot, opposite sign.

I fixed mine in four lines. But the lesson outlived the fix: this isn’t a story about a bad brand and a clever critic. It’s a story about a class of instrument — pulse plus motion, no brain — running into the edges of what pulse plus motion can mean. My code and Oura’s code fell off the same cliff in opposite directions. That is humbling, and it is the most honest thing in this whole piece.
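The bug is a classic one in run detection, and a sketch makes it concrete. This is not the detector described above, only an illustration of the same class of mistake: a quiet run that is scored only when a later sample disqualifies it, and dropped if it ends at a gap or at the end of the data. The function name, thresholds, and sample values are all hypothetical.

```python
def quiet_runs(samples, max_hr=65, max_gap=10, min_len=10, flush_open_runs=True):
    """samples: (minute, hr_bpm) pairs sorted by minute. Returns (start, end) quiet runs.

    With flush_open_runs=False, a run that ends at a data gap or at the end of
    the data is silently discarded, which reproduces the bug.
    """
    runs = []
    start = prev = None
    for minute, hr in samples:
        hit_gap = prev is not None and minute - prev > max_gap
        disqualified = hr > max_hr
        if start is not None and (hit_gap or disqualified):
            # A disqualifying sample always closes the run; a gap closes it only when flushing.
            if ((disqualified and not hit_gap) or flush_open_runs) and prev - start >= min_len:
                runs.append((start, prev))
            start = None
        if not disqualified and start is None:
            start = minute
        prev = minute
    # The data can also end while a run is still open.
    if start is not None and flush_open_runs and prev - start >= min_len:
        runs.append((start, prev))
    return runs

# An earlier quiet block, a busy stretch, a nap, then a 17-minute dropout.
samples = ([(m, 60) for m in range(0, 25, 5)] + [(m, 80) for m in range(25, 60, 5)]
           + [(m, 62) for m in range(60, 80, 5)] + [(92, 78)])

print(quiet_runs(samples, flush_open_runs=False))  # buggy variant
print(quiet_runs(samples, flush_open_runs=True))   # fixed variant
```

The buggy variant reports only the earlier quiet block; the fixed one also recovers the run that ends at the dropout.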

7. Why I’m taking it off

So here is the ledger. The ring is a superb nocturnal heart-and-temperature instrument that makes educated guesses about sleep and confident fictions about stress, and stacks all three in one font. I could keep it and simply read it well — trust the top of the ladder, ignore the bottom. I’ve certainly built the tooling to. But I find I don’t want to, and the reason is not really about accuracy.

It’s that I don’t like what the daily number does to my attention. There is a particular kind of detachment that creeps in when you delegate “how did you sleep?” and “how do you feel?” to a battery. You wake up, and before your body gets a word in, you check. The number frames the day. On a morning I felt fine, a mediocre score could install a faint, suggestible tiredness; on a morning I felt rough, a good score could talk me out of resting. Either way the instrument’s opinion arrives before my own, and over months I could feel my own interoception — the quiet skill of just knowing how I am — getting a little lazier, a little more deferential. A device sold to increase self-knowledge was, in my n-of-1, gently eroding the thing self-knowledge is actually made of.

And there’s a context I’ll mention without dramatising it: I have a flavour of anaemia, which parks one or two of my numbers permanently outside the textbook bands. The ring doesn’t know that, can’t know that, and reads me against a population I am not. The fix is to read against my own baseline — which is just a long way of saying the authority was always going to have to be me. Once the authority is me, the ring is, at best, a sometimes-useful witness whose testimony I have to cross-examine every single morning. That is a lot of cross-examination to wear on one finger.
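Reading against a personal baseline rather than a population band is a small change in arithmetic. The sketch below uses placeholder numbers throughout: the band limits are not clinical reference values, the history is invented, and both function names are hypothetical.

```python
from statistics import mean, stdev

def outside_population_band(value, low, high):
    """True when a reading falls outside a fixed, one-size-fits-all band."""
    return not (low <= value <= high)

def personal_z(value, history):
    """How many of my own standard deviations today's value sits from my own mean."""
    return (value - mean(history)) / stdev(history)

history = [72, 74, 73, 75, 71, 73, 74]   # invented past readings for one person
today = 73

print(outside_population_band(today, low=50, high=70))  # flagged by the fixed band
print(round(personal_z(today, history), 2))             # unremarkable against own history
```

The same reading is an alarm under one reference and a non-event under the other, which is the whole argument for choosing the reference deliberately.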

I want to be fair to my future self, who is allowed to change his mind: there are real uses here. If I ever want to know, objectively, whether the wine is wrecking my nights, I will put it back on for two weeks and read the trajectory, not the score. As a scientific instrument, used deliberately, in bursts, for a specific question, it is genuinely good. As an ambient oracle, consulted every morning before I’ve had the dignity of forming my own impression, it is subtly bad for me. The mistake was wearing the scientific instrument as if it were a wiser, more honest version of my own attention.

It is not. It is light bouncing off blood. So, for now — and only for now — I’m taking it off, and going back to the older, lower-resolution, surprisingly reliable instrument of asking myself how I feel and waiting for the answer. This afternoon it told me I’d had a lovely fourteen-minute nap. The ring agreed, for once. But I didn’t need it to. I had the oblivion. That was the data.

Notes & sources

¹ Svensson et al. (2024). Validity and reliability of the Oura Ring Gen3 against ambulatory polysomnography, 96 participants, ~421,000 epochs. Sleep Medicine 115:251–263. Source for the light-sleep contamination of true deep/REM, and the small REM and efficiency biases.
² Doherty et al. (2025). An evaluation of composite health scores (readiness / recovery / strain) in consumer wearables, 14 scores across 10 brands. Translational Exercise Biomedicine 2(2):128–144. Source for zero disclosed algorithms, zero outcome validation, and the multicollinearity double-penalty.
³ Grosicki et al. (2026). Nightly HRV coefficient of variation as a behaviour-sensitive digital biomarker, ~21,600 people / ~2M nights. Am. J. Physiol. Heart Circ. Physiol. 330:H187–H199. Source for the ≥5-nights reliability floor and the ~+2.5-point-per-daily-drink effect; also: low variability is not automatically good.
On arousal without valence: Schachter & Singer (1962), cognitive-physiological determinants of emotion; Dutton & Aron (1974), misattribution of arousal; Jamieson and Brooks, on reappraising anxiety as excitement. — the science behind “one arousal, two labels.”