I’ve done enough “real”
VO2 max tests—the trips to the lab, the expensive
and cumbersome equipment, the brutally exhausting treadmill
protocol, and in one case, the puking in the corner afterwards—that
I’ve always been intrigued by GPS watches and heart-rate
monitors promising to estimate my VO2 max for me. These
days, tons of running watches will give you an estimate
of VO2 max, from the Garmin Forerunner 55 and Garmin Enduro
to the new Suunto 9. So could it possibly be that simple?
I’m clearly not the only one wondering,
because I noticed at least four presentations at the 2017
American College of Sports Medicine conference addressing
that very question. Overall, the results look better than
I might have expected, but there are some differences
between the various approaches taken by different watches.
What is VO2 max and how do you estimate it?
VO2 max is basically the definitive measurement of aerobic
fitness. It tells you the maximum rate at which you can
take oxygen from the air and deliver it through
the lungs into the bloodstream for use by your working
muscles. It’s an excellent measurement of current
health and predictor of future health. In fact, the American
Heart Association has argued that it should be classified
as a new “vital sign” to be assessed yearly
by your doctor.
As the AHA statement noted, there are
various ways of measuring or estimating VO2 max. The best
is to do it directly, by measuring the oxygen you consume
while you exercise to exhaustion. Next best is to estimate
it while exercising to exhaustion; for example, based
on the distance covered during a 12-minute run.
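
As a rough illustration of the 12-minute-run approach, here is a minimal sketch using one widely cited formula, Cooper's regression: VO2 max is roughly (distance in meters - 504.9) / 44.73. The article doesn't say which formula it has in mind, so treat the choice as an assumption; the function name is made up for this example.

```python
def cooper_vo2max(distance_m: float) -> float:
    """Estimate VO2 max (ml/kg/min) from the distance covered in a 12-minute run.

    Assumption: uses the commonly cited Cooper regression, which is not
    taken from the article itself.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return (distance_m - 504.9) / 44.73


if __name__ == "__main__":
    # A few example distances, in meters, for a 12-minute all-out run.
    for d in (2000, 2400, 2800, 3200):
        print(f"{d} m in 12 min -> about {cooper_vo2max(d):.1f} ml/kg/min")
```

With this formula, covering 2,400 meters in 12 minutes works out to a little over 42 ml/kg/min. It's still an estimate, and it still requires an all-out effort.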
There are also “sub-maximal”
exercise protocols that estimate VO2 max based on the
relationship between your heart rate and pace, without
forcing you to go all-out. This is what GPS watches like the Garmin
Forerunner 230, 235, and 630 do: they have you run for at least
10 minutes while simultaneously measuring your pace and heart rate,
then combine that with basic information like your age and sex.
(The 230 and 630 measure heart rate
with a chest strap, while the 235 uses a wrist sensor
integrated into the watch.)
Finally, there are estimates that don’t
involve exercise at all, but simply use information like
your age, resting heart rate, and typical activity levels.
Watches like the Polar V800 take this a step further by
measuring your heart-rate variability (the subtle variations
in the time between successive heart beats) for a few
minutes while you’re lying down.
Here’s what the data presented at
the ACSM conference found.
Garmin vs. Polar vs. lab tests for VO2 max
The most comprehensive study came from Bryan Smith and
his colleagues at Southern Illinois University Edwardsville.
They estimated VO2 max for 23 women and 26 men using the
Garmin 230, Garmin 235, and Polar V800, then compared
those results to gold-standard lab testing.
Typical VO2 max values in healthy college
students tend to be in the 40s or 50s (in units of milliliters
of oxygen per kilogram of body mass per minute). In those
units, here’s how much the various watch estimates
over- or underestimated VO2 max (an upward bar indicates
that the watch underestimated VO2 max):
There are some interesting
patterns there. The Garmin measurements seem to consistently
overestimate VO2 max, to a greater degree in men than
women, and to a greater degree with the wrist sensor (which
is a newer and less reliable way of monitoring heart rate)
than the chest strap.
The Polar measurements appear to be less
accurate—not surprisingly, given that they’re
estimating a characteristic of maximal exercise while
at rest. But the deviation seems to be completely different
in men and women. It’s hard to know whether this
is an artifact of the particular group of men and women
in this study (the men in the study had slightly higher
BMIs and also slightly higher VO2 max), or something more
systematic.
Before taking these results as gospel,
though, it’s worth checking what some of the other
studies found.
Chest strap vs. wrist sensor
Another analysis from the same group took a deeper head-to-head
look at the data from the two Garmin systems. Given that
the measurements were taken with both watches at the same time
on the same run, what explains the different VO2 max estimates?
The most likely culprit appears to be
the heart-rate measurements. Chest straps are considered
highly accurate, and the wrist sensors produced heart
rate values that were consistently lower than the chest
strap. That, in turn, meant that the wrist sensors overestimated
VO2 max, which makes sense: an artificially low heart rate
at a given pace looks like better fitness.
The conclusion, not surprisingly, is that
chest straps give you better data. The remaining question
is whether the wrist sensor gives you “good enough”
data, which depends on what you’re using it for.
A couple of other studies compared a single
device to lab measurements.
Garmin vs. lab tests
Rebecca Moore and her colleagues at Eastern Michigan,
led by grad student Andrew Pearson, used a treadmill test
and a Garmin Forerunner 235 (the one with the wrist sensor)
to measure VO2 max in 23 volunteers.
In this case, the average VO2 max in the
lab was 52.4 ml/kg/min, compared to 49.3 ml/kg/min with
the watch—so the watch with the wrist sensor underestimated
the lab value, which is the opposite of what the Southern
Illinois study found.
What explains the discrepancy? I have
no idea, but it suggests we should be cautious about drawing
definitive conclusions about either set of results. I
asked Moore and Pearson about the individual variation
in their data, and they said that the watch consistently
underestimated VO2 max, particularly for those with higher
values (above 50, a value generally found in sub-20:00
5K runners).
Polar vs. lab tests
Finally, Kent Johnson and Jenny Beadle of Lipscomb University
compared the values produced by Polar’s FT60 Fitness
Test (the one based on heart-rate variability while lying
down) with lab values in 31 subjects. In this case, the
average lab value was 44.9 ml/kg/min, and the Polar value
was 49.8 ml/kg/min.
This overestimate of about 11 percent,
or just under 5 ml/kg/min, is similar to what the Southern
Illinois group saw in men, but not in women. However,
the Lipscomb volunteers were 13 men and 18 women, so that
pattern of sex differences doesn’t seem consistent
between the studies.
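
To put the two single-device studies side by side, here is a small sketch that turns the group averages quoted above into absolute and percent errors. The function name and dictionary labels are just for illustration; the four numbers are the ones reported in the text.

```python
def estimate_error(lab: float, watch: float) -> tuple[float, float]:
    """Return (absolute error, percent error) of a watch estimate vs. the lab value.

    Positive values mean the watch overestimated VO2 max.
    """
    diff = watch - lab
    return diff, 100.0 * diff / lab


# Group averages quoted in the article, in ml/kg/min: (lab, watch).
studies = {
    "Eastern Michigan, Garmin 235": (52.4, 49.3),
    "Lipscomb, Polar Fitness Test": (44.9, 49.8),
}

for name, (lab, watch) in studies.items():
    diff, pct = estimate_error(lab, watch)
    print(f"{name}: {diff:+.1f} ml/kg/min ({pct:+.1f}%)")
```

The Garmin average comes out about 3.1 ml/kg/min (roughly 6 percent) below the lab value, while the Polar average is about 4.9 ml/kg/min (roughly 11 percent) above it. These are errors in group averages only; they say nothing about how far off any one runner's estimate was.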
What overall conclusions can we draw from these VO2 max studies?
First, there appears to be a general hierarchy along exactly
the lines that you would have guessed before seeing the
studies. An exercise-based test with a chest strap is
better than one with a wrist sensor, which in turn is
better than a resting test.
None of them are perfect matches for maximal
lab testing, but the chest strap data seems remarkably
good, with statistically insignificant overestimates of
0.8 and 1.2 ml/kg/min, on average, in women and men. That’s
a little bit more than 2 percent.
Second, given the inconsistencies between
different studies, we shouldn’t draw any final conclusions,
particularly about patterns like how men and women respond
differently. Still, taken as a whole, the studies suggest that
the Garmin methodology can give you a VO2 max estimate
within about 5 percent of your true value.
To be really useful from a practical perspective,
what we’d need to understand is how consistent the
measures are when repeated multiple times, and how much
individual variation there is. Knowing that the watches
are off by less than, say, 5 percent on average is nice—but
does that mean nearly everyone is off by between 3 and
5 percent, or are a few people right on while others are
10 percent off?
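
One way to get at that question, if you had paired lab and watch values for each runner, would be to look at the spread of the individual differences rather than just their mean. The sketch below computes the mean bias and approximate 95 percent limits of agreement, a Bland-Altman style summary. That choice of method is my suggestion, not something the studies reported, and the numbers are invented purely to show two groups with the same average error but very different individual errors.

```python
from statistics import mean, stdev


def agreement(lab, watch):
    """Return (mean bias, lower limit, upper limit) for paired estimates.

    Bias is watch minus lab, so positive means the watch reads high.
    The limits are bias +/- 1.96 standard deviations of the differences.
    """
    if len(lab) != len(watch) or len(lab) < 2:
        raise ValueError("need at least two paired values")
    diffs = [w - l for l, w in zip(lab, watch)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# HYPOTHETICAL values in ml/kg/min, made up for illustration only.
lab = [40.0, 45.0, 50.0, 55.0, 60.0]
consistent_watch = [42.0, 46.5, 52.0, 57.5, 62.0]  # everyone about 2 high
erratic_watch = [40.0, 51.0, 49.0, 61.0, 59.0]     # some spot on, some far off

for label, watch in (("consistent", consistent_watch), ("erratic", erratic_watch)):
    bias, low, high = agreement(lab, watch)
    print(f"{label}: bias {bias:+.1f}, limits of agreement {low:+.1f} to {high:+.1f}")
```

Both made-up watches overestimate by 2 ml/kg/min on average, but the limits of agreement are far wider for the second one. An average error alone can't distinguish those two cases.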
In the end, you pretty much get what you
pay for (in terms of money and effort). For most of us,
an estimate of VO2 max is interesting for curiosity’s
sake, and an error of a few percent is no big deal. If
you want a more accurate fitness marker, head to your
local exercise physiology lab... or, better yet, sign
up for a race.