Wednesday, July 01, 2009

The Sins of Sins (Testing FTP #2)

One of the most frequently referenced items on this good ol’ blog of mine (I can call it old, it’s in its fourth year – that’s officially old in interwebby speak) was an item I penned about ways to estimate your Functional Threshold Power (FTP - maximal quasi steady state average power one can sustain for about an hour).

That post was basically an expansion on the original information provided by Dr Andrew Coggan, publicly posted many years ago on the Wattage Forum and dubbed “The Seven Deadly Sins”.

Indeed since writing it, this one blog item has been viewed nearly 45,000 times.
Here is the link to the original post:
The Seven Deadly Sins

It was recently suggested to me (by Steve Palladino) that it might be worthwhile to pen a follow up to that post. One that explores some of the common mistakes people make when attempting to estimate their FTP. So here are a few thoughts on the subject.

As is often the case, none of this is particularly original. Most of these are just accumulated tidbits of information and knowledge, and it is by no means an exhaustive list. I may even have some of it wrong. You may have others worth adding, or corrections. By all means let me know; I'm happy to add them to the examples listed.

Before getting into the list – The Sins of Sins – I will say that estimating FTP is important and the reasons for that are outlined in my previously linked post. It's not important in a “curing cancer” kind of way, but getting it right to at least a reasonable level of accuracy is pretty darn handy as there are many other very useful facets of training and racing with power that rely on having a good FTP estimate.

One doesn’t need to be completely anal about it and testing really often is not typically necessary (a few times a year is usually enough – the appropriate frequency depends on individual circumstances). Nailing it down to the watt is not necessary either; the nearest five watts is typically more than sufficient.


The Sins of Sins – Top 10 (in no specific order):
SOS #1 – Not testing at all
SOS #2 – Not using an accurate power meter
SOS #3 – Using inconsistent methodologies
SOS #4 – Not replicating riding conditions in testing
SOS #5 – Ignoring signs that FTP has changed
SOS #6 – 95% of a 20-min mean maximal power = FTP
SOS #7 – Using NP from rides << 1-hour
SOS #8 – Inappropriate use of the CP model
SOS #9 – Not performing maximal efforts
SOS #10 – “I’ve got an NP buster!”


OK, let’s examine each in a little more detail...

Sin of Sins #1 – Not testing at all
OK, this might seem a bit redundant, but honestly there are people who think they can get away with no testing at all but still want to know what their FTP is. Or that testing is such an impost in the training / racing schedule that it is “harmful” to schedule it. Bollocks.

Given the adage “training is testing, testing is training” then really there’s no excuse for never doing an effort or two in order to nail down one’s FTP more tightly than a lame guess. Stop wondering and go and do it. Gee, I feel better already.

Of course, an experienced eye can often inspect the mass of an individual’s power meter data and probably come up with a reasonable SWAG. But far better to schedule a test and be certain.


Sin of Sins #2 – Not using an accurate power meter
(and/or not using a power meter at all)

This is also a pretty obvious sin of sins but it happens. If you are going to use a power meter, it makes a lot of sense to ensure you are collecting accurate data. Otherwise how are you going to be sure that changes in power output as reported are in fact representative of actual changes in performance?

Check your meter’s calibration and make sure you perform the appropriate torque zero / zero offset procedure so that the data can be considered reliable. Neither is hard to do nor time consuming.

And if you don’t have a power meter, sure, go time yourself up a long steep hill climb and make an estimate of power output using analyticcycling.com, but then what? Without a reliable means to collect power data at other times, then the primary benefits of knowing your FTP and all that flows from it are not accessible. So use the hill climb as a good fitness test but the power estimate is essentially for satisfying curiosity or bragging rights at the coffee shop.
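For the curious, here is a minimal sketch of the kind of calculation behind a hill climb power estimate. It is not the method used by analyticcycling.com, just the usual physics (gravity, rolling resistance and air drag on a steady climb in still air). The function name and every default value (rolling resistance, CdA, air density, drivetrain efficiency) are assumptions for illustration, so treat the result as a coffee shop number.

```python
import math

def climb_power_watts(total_mass_kg, climb_length_m, elevation_gain_m, time_s,
                      crr=0.005, cda_m2=0.35, air_density=1.2,
                      drivetrain_efficiency=0.975):
    """Rough average power for a steady climb in still air.

    All defaults are assumed, illustrative values, not measured ones.
    total_mass_kg is rider plus bike plus kit.
    """
    g = 9.81                                    # gravitational acceleration, m/s^2
    speed = climb_length_m / time_s             # average road speed, m/s
    grade = elevation_gain_m / climb_length_m   # sine of the slope angle
    gravity = total_mass_kg * g * grade * speed
    rolling = crr * total_mass_kg * g * math.sqrt(1 - grade ** 2) * speed
    aero = 0.5 * air_density * cda_m2 * speed ** 3
    # Divide by drivetrain efficiency to get power at the pedals.
    return (gravity + rolling + aero) / drivetrain_efficiency

# Hypothetical example: 80 kg all up, 5 km climb gaining 400 m in 20 minutes.
print(round(climb_power_watts(80, 5000, 400, 20 * 60)))
```

On a steep climb the gravity term dominates, which is why a long steep hill gives a less wobbly estimate than a flat road where the guessed CdA does most of the work.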


Sin of Sins #3 – Using inconsistent methodologies
This is pretty common. When you start out with a power meter, naturally you’ll want to work out the best, most reliable method for your particular circumstances. Everyone has different terrain to ride on, levels of traffic to contend with, opportunities to do a time trial, or time in which they can safely perform a test where they live, or can’t get outside for months on end, etc etc, so the sin(s) they choose to use as most appropriate to estimate FTP are different.

But once you have settled on a good method, then stick with it and replicate the same protocol each time. Reducing the number of variables that can influence the outcome makes the data, and what can be interpreted from it, more reliable.

Examples of consistency might include:
- Using the same venue
- Using the same number of light, recovery ride or rest day(s) before the test(s)
- Performing tests in the same order, with the same break in between
- Performing the tests on the same number of days apart (or always on the same day)
- Using the same equipment
- Looking for similar environmental conditions if possible
- Performing tests over the same distance/duration

Of course it is not always easy or practical to replicate everything, every time, but at least consider these factors when deciding on a test method. Some methods lend themselves to more consistent protocol than others. A time trial over the same course, or undertaking a Maximal Aerobic Power test are examples of those which enable consistency without too much thinking involved.


Sin of Sins #4 – Not replicating riding conditions in testing
This might not be as bad as it can seem at first but it makes sense to at least use a test method using the bike/equipment/terrain/location/bike position etc that comprises the majority of your riding at that stage of your training/season.

This is especially the case when there is likely to be a significant difference in the performance (power) using the test method versus what you would ordinarily be able to produce. For example, if you only ride indoors occasionally and know you struggle to generate the same power as you typically do outdoors, then don’t use the indoor trainer to test FTP.


Sin of Sins #5 – Ignoring signs that FTP has changed
“I had a two hour group run today and my Intensity Factor was 1.07”.
Provided you are not falling for SOS #1 or SOS #2, then be on the lookout for signs that FTP may indeed have shifted significantly. There are a number of them and they include:

- Actual performance not consistent with current FTP estimate, such as AP/NP from a 40km TT that is significantly different from FTP

- An Intensity Factor (IF) > 1.05 for any ride or section of a ride of about an hour

- Regular long intervals at/near FTP becoming “easy(ish)”

- Perceived exertion for rides not consistent with intended level (e.g. a tempo power ride feels more like an endurance ride)

- A steeper than typically sustainable medium term rise in Chronic Training Load, e.g. your CTL has apparently risen at a much higher rate than you would normally expect to sustain without getting ill/niggles/overly fatigued (e.g. > 8 TSS/day/week but maybe less for some)

Now these are signs that FTP may need retesting but are not necessarily good tests in themselves. So ignore them at your peril but don’t jump to inappropriate conclusions or immediately adjust FTP. Gather some additional evidence.
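Two of the signs above are simple arithmetic, so here is a rough sketch of how one might flag them. It assumes the usual definition of Intensity Factor (NP divided by FTP). The function names, the 55-minute reading of "about an hour" and the way CTL ramp is averaged per week are my own assumptions. A flag here means "gather more evidence", not "change your FTP".

```python
def intensity_factor(normalized_power_w, ftp_w):
    """IF is Normalised Power divided by the current FTP estimate."""
    return normalized_power_w / ftp_w

def ftp_retest_flags(np_w, duration_min, ftp_w, ctl_start, ctl_end, weeks,
                     if_limit=1.05, min_duration_min=55, ramp_limit=8.0):
    """Return a list of reasons to consider retesting FTP (possibly empty).

    min_duration_min is an assumed reading of "about an hour";
    ramp_limit is in TSS/day per week and may be lower for some riders.
    """
    flags = []
    ride_if = intensity_factor(np_w, ftp_w)
    if duration_min >= min_duration_min and ride_if > if_limit:
        flags.append(f"IF {ride_if:.2f} over {duration_min} min")
    ctl_ramp = (ctl_end - ctl_start) / weeks
    if ctl_ramp > ramp_limit:
        flags.append(f"CTL ramp {ctl_ramp:.1f} TSS/day/week")
    return flags

# Hypothetical rider: two hour ride at NP 300 W against an FTP set at 280 W,
# with CTL apparently up from 60 to 100 over four weeks.
print(ftp_retest_flags(300, 120, 280, ctl_start=60, ctl_end=100, weeks=4))
```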


Sin of Sins #6 – 95% of a 20-min mean maximal power = FTP
Well, this method of establishing FTP isn’t one of the listed Seven Deadly Sins in the first place, but it has become such a commonly referred to/utilised method (mainly due to its publication in the excellent book, Training and Racing with a Power Meter) that it gets its own SOS number.

Firstly, the main issue with this common Sin of Sins is that the ratio between 20-min power (or other similar shorter TT duration power) and FTP is not the same for everybody, and neither does the ratio remain static for an individual. One should recognise that, due to several factors, not least of which are the contribution of anaerobic capacity and the exact protocol used (e.g. performing a pre-ride blowout effort), the ratio is likely to be within a range, and where someone is within that range is anyone’s guess.

So, FTP might be anywhere in the range of, say 90% to 98% of 20-min max average power. Personally, my FTP has been at both 92% and 96% of my then 20-min max average power. So, by all means use 95% of 20-min max power as a starting point but remember it may well be out by some margin and it would be wise to use an additional or alternative method to validate your FTP estimate.
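To put some numbers on that spread, here is a small sketch using the 90% to 98% range and the 95% starting point mentioned above. The function name and the 300 W example are made up for illustration.

```python
def ftp_range_from_20min(p20_w, low=0.90, high=0.98, starting_point=0.95):
    """Plausible FTP range implied by a 20-min max average power.

    The fractions are the illustrative range from the text, not fixed constants.
    """
    return {
        "low": p20_w * low,
        "starting_point": p20_w * starting_point,
        "high": p20_w * high,
    }

# Hypothetical rider with a 20-min max average power of 300 W.
estimate = ftp_range_from_20min(300)
print({name: round(watts) for name, watts in estimate.items()})
```

For that rider the 95% figure is 285 W, but the range runs from 270 W to 294 W. That is a much wider window than the nearest five watts, which is why a second method to validate the number is worth the effort.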


Sin of Sins #7 – Using NP from rides << 1-hour
“My 20-min max NP from that crit was 378 watts, so is my FTP 95% of that, i.e. 359 watts?”

Er, no.

Apart from falling for SOS #6, the efficacy of the Normalised Power algorithm in providing a “normalised iso-power equivalent” begins to drop somewhat as the duration shortens to substantially less than one hour. 20-minutes is in that grey zone. 30-minutes ain’t too shabby but I think anything less than 40-50 minutes is stretching the envelope a bit much for a reliable number from which to make an estimate of FTP.
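For reference, here is a minimal sketch of the commonly published form of the Normalised Power calculation: a 30-second rolling average of power, raised to the fourth power, averaged, then the fourth root taken. It assumes one power sample per second with no gaps, and the function name is mine. It is an illustration rather than any particular software's implementation.

```python
def normalized_power(power_w, window_s=30):
    """Normalised Power from 1 Hz power samples (assumed gap-free)."""
    if len(power_w) < window_s:
        raise ValueError("need at least one full rolling window of data")
    # 30-second rolling average, updated incrementally.
    window_sum = sum(power_w[:window_s])
    rolling = [window_sum / window_s]
    for i in range(window_s, len(power_w)):
        window_sum += power_w[i] - power_w[i - window_s]
        rolling.append(window_sum / window_s)
    # Mean of the fourth powers, then the fourth root.
    mean_fourth = sum(r ** 4 for r in rolling) / len(rolling)
    return mean_fourth ** 0.25

# Hypothetical 20-minute "crit": alternating 60 s at 400 W and 60 s at 100 W.
crit = ([400] * 60 + [100] * 60) * 10
average_power = sum(crit) / len(crit)
print(round(average_power), round(normalized_power(crit)))
```

A surgy effort like this pushes NP well above average power, while a perfectly steady effort gives NP equal to average power.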


Sin of Sins #8 – Inappropriate use of the CP model
The Critical Power (CP) model is a useful way to estimate FTP. See my previously linked item on the Seven Deadly Sins to find out a bit more on how it works.

The calculation of CP is sensitive to both the way data is collected and the data chosen to input into the model. So ignoring reasons for these sensitivities can introduce unwanted errors. Common SOS#8 mistakes are:

- Using data from inappropriate test durations. Ideally you will want data from within a range of durations – typically tests should be at least 3 minutes and no longer than 30 minutes duration. Tests from very short (e.g. 1-minute) or long durations (e.g. 60-min) tend to skew the calculations somewhat. Besides, if you have a 60-min test, then CP is somewhat redundant.

- Using data from test durations that are too close to each other, e.g. 3-min and 6-min. It is far better to use one test of ~ 3-6 min and one of ~ 20-30-min. Can also include another from a duration in between but two really good points with sufficient spread between them is all that's really needed.

- Using multiple data points which include unreliable data, such as a test that was not truly a maximal effort for the duration or was tainted due to the protocol/method used to collect the data. Far better to have two very good data points than four data points with one or two suspect numbers.

- Not using the same test durations each time. E.g. using a 6-min and a 20-min test and next time using a 3-min and 28-min test. Pick your sample durations and stick with them, within reason. This is not as easy as it seems, since if you are doing a 5-min test, how hard do you go? It can be easier to pick a power level you expect to maintain for the duration and go ’til you blow. But if it becomes a significantly different duration, it may affect the outcome.

- Using a different protocol to collect the data. Principles of SOS #3 apply. If you perform both, say a 5-min and a 25-min test on the same day, then next time do it the same way and in the same order. If you perform the tests on different days, then be consistent about that protocol.

- Similarly, avoid cherry picking mean maximal power data from different rides, e.g. a local TT and last week’s crit and then next time a Level 4 training effort and the hillclimb during the local world’s bunch ride.

- Selecting non-contemporaneous data. Now that’s a big word. What I mean is, you don’t select your best 5-min power from three months ago and combine it with a 25-min test from last week. The data must be from the same time period (I suggest the limit for data collection be approximately one ATL time constant or around 7-10 days)

- Using Normalised Power. Don't. Use Average Power.

- Not weighing yourself or using the wrong body mass for the model (note that this doesn't affect CP calculations; it's just that some versions of the model also quote or calculate CP in W/kg terms).

Note that the CP value calculated by the model is typically a better estimate of FTP than the 60-min power predicted by the model. The 60-min power prediction is usually a bit higher than the CP value.
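As a rough sketch of the arithmetic, assuming the standard two-parameter form of the model (total work = CP × time + a fixed anaerobic work capacity, usually written W'), two good test results are enough to solve for CP. The function names and the test numbers are hypothetical. The duration check simply enforces the 3 to 30 minute guideline above.

```python
def critical_power_two_point(t_short_s, p_short_w, t_long_s, p_long_w):
    """Solve work = CP * t + W' from two maximal tests (average power, not NP)."""
    if not 180 <= t_short_s < t_long_s <= 1800:
        raise ValueError("use two tests between 3 and 30 minutes, shorter one first")
    work_short = p_short_w * t_short_s   # joules
    work_long = p_long_w * t_long_s      # joules
    cp = (work_long - work_short) / (t_long_s - t_short_s)
    w_prime = work_short - cp * t_short_s
    return cp, w_prime

def predicted_power(cp, w_prime, duration_s):
    """Model prediction for a given duration: P = CP + W' / t."""
    return cp + w_prime / duration_s

# Hypothetical tests from the same week: 5 min at 380 W and 25 min at 300 W.
cp, w_prime = critical_power_two_point(5 * 60, 380, 25 * 60, 300)
print(f"CP {cp:.0f} W, W' {w_prime / 1000:.0f} kJ, "
      f"60-min prediction {predicted_power(cp, w_prime, 3600):.0f} W")
```

With these made-up numbers CP comes out at 280 W and the 60-min prediction at about 288 W, which shows why the prediction sits a bit above CP: the model always adds W'/t on top.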

Note added June 2013:
The Golden Cheetah power meter analysis software has a built in feature that uses the principles of the critical power model to provide a CP estimate based on your power meter files. I am not exactly sure of the means by which GC's implementation derives its estimate, but I suspect it is susceptible to the problem of cherry picking data, using inconsistent data, and possibly not including data from efforts of sufficient duration as mentioned above.

As a result, use of the CP model implemented in this manner routinely overestimates FTP. Initial data as assessed by Dr Coggan indicates a typical overestimation of around 5%. This presumes there is sufficient actual data with maximal efforts across various durations.


Sin of Sins #9 – Not performing maximal efforts
Testing performance requires one to go to the limit, otherwise one can never know where that limit is. There is some sub-maximal testing one can do, such as determining lactate threshold in the lab but for the purposes of using a power meter to ascertain FTP, then one does need to lay it all on the line.

Of course it goes without saying that one should be sufficiently fit and healthy to perform maximal effort testing. Undergoing testing while health concerns exist may well end up being the biggest mistake of all!


Sin of Sins #10 – "I’ve got an NP buster!"
No you don’t*.
It is 99.99% likely that:
(i) your FTP is underestimated, or
(ii) the duration you are referring to is not about an hour, or
(iii) your power meter data is suspect – reference SOS #2.

* OK it is possible, just highly improbable and some substantive evidence is required before making such a declaration and joining this rare club.

Finally, there’s not much point in taking your track bike to the local velodrome, doing a whole bunch of anaerobic efforts while tooling around the infield in between efforts, racking up some weirdo NP number due to all the breaks and then seeking to use it as a guide to FTP. The test needs to be realistic for the purpose. This is a variant of SOS #4.

I’d expand some more on this, like “what the %&%$ is an NP buster?” and “I do so have an NP buster” but perhaps I’ll save that for another day.


OK, that’s enough for today. It was a bit long but hopefully it can help you to avoid some of the more common pitfalls when attempting to estimate your FTP. It's not all that hard.

Good luck and safe riding!
