Those who followed our coverage of the Tour de France will recall the in-depth discussion around the analysis of the climb of the Verbier, which was ignited when a French scientist, Antoine Vayer, estimated Alberto Contador's power output to be around 445W, and then projected that Contador would have to have a VO2max of about 99ml/kg/min in order to achieve that performance. I must just add that given that Contador was about a minute ahead of maybe 8 or 9 riders that day, pretty much all of them would have had to have impossibly high VO2max values, not just him.
This projection is made on the basis that a rider producing a power output X is actually consuming energy Y (depending on efficiency), and we can calculate roughly how much oxygen is required to produce that energy. It requires a few assumptions, of course, but is an important principle, which is what I'd like to pick up on in today's post. (Just as an aside, Frederic Portoleau, a colleague of Vayer's, has since taken the time to compare Vayer's method of estimating power output to the actual, measured data from Nicki Sorensen's SRM. The result? The actual SRM value was 357W, Vayer's estimation was 365W, which is only about 2% higher. So the response by many of dismissing Vayer's estimations (and hence the implications) as ludicrous seems premature. It was high, and certain other assumptions need to be looked at to confirm the validity of the predicted VO2max, but it seems the estimation is not "ridiculous" as was suggested.)
Performance analysis in the fight against doping - two categories
This kind of analysis forms part of what is now being proposed as a potential arrow in the quiver in the battle against doping. The premise is pretty simple: Physiology sets the limits for performance, and for how performance changes over time. That is, every single performance is underpinned by a set of measurable physiological determinants, and so there are two categories of performance analysis that can be used to "flag" suspicious performances:
1. Detecting performances lacking physiological "credibility"
2. Historical analysis to detect the rate of performance change in individuals
To give you an absurd example of the first category - if you measure my VO2max as 65 ml/kg/min, and my oxygen consumption as 60ml/kg/min while running at a speed of 6 minutes per mile (3:43/km), then there is absolutely zero chance that I can run competitively against world class athletes who race at 3:00/km for a marathon. Why not? My physiology is inferior, and for me to run 3:00/km will push me well above a maximum exercise intensity, and I will be unable to hold that intensity for the required 2 hours.
If I were to achieve this performance, say three months later, it could be flagged as lacking credibility, and you would have to ask how it was possible. The answer is that either:
- I have discovered some other way to improve performance beyond what your physiological measurements predicted (that is, doping), or;
- Three months of training have seen me improve my ability to the point where my VO2max is now 80ml/kg/min, and I'm using 60ml/kg/min running at 3:10/km. If you put me in the lab again, I'd produce these numbers and you would say my performance is credible, apart from the fact that I've achieved such enormous gains in so short a time. This kind of improvement, in such a short time, would be a strong indicator of doping (note that it's not a guarantee)
Performance tracking - rates of change
The other part is to examine how performance changes over time. Once again, the premise is the same - physiology sets the limits for how rapidly people improve. That is not to say that everyone should improve at the same rate - please, don't read this and shout out "discrimination" against those "outliers" who produce brilliant performances, seemingly from nowhere. That happens, yes, but if you are sensible about how you track performance over time, then you can work out ways to minimize the chance of these once-off athletes affecting your result.
For example, you might look at the best performance, AND the average of the next ten or twenty performances. By taking an average, you are trying to manage the impact of one individual on your ability to interpret the data. If the same trend exists for the top 20, then you have a much stronger reason to suggest that something else is in play.
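To make that concrete, here is a minimal sketch of how a "best versus top-N average" profile could be computed per year. It is not the method used in the Schumacher paper; the function name `yearly_profile`, the `(year, mark)` input format and the choice to include the best mark in the average are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def yearly_profile(results, top_n=20, higher_is_better=True):
    """Return {year: (best mark, mean of the top_n marks)}.

    results          -- iterable of (year, mark) pairs (hypothetical format)
    top_n            -- how many of the year's best marks to average
    higher_is_better -- True for throws and jumps, False for running times
    """
    by_year = defaultdict(list)
    for year, mark in results:
        by_year[year].append(mark)

    profile = {}
    for year, marks in sorted(by_year.items()):
        # Rank the season's marks from best to worst.
        ranked = sorted(marks, reverse=higher_is_better)
        top = ranked[:top_n]
        # Assumption: the best mark is included in the average.
        profile[year] = (top[0], mean(top))
    return profile
```

Plotting the two values for each year gives the red and blue lines described in the graphs. A single outlier moves the first value sharply but barely shifts the second, which is the reason for tracking both.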
But, let's not speak in metaphors here - below are three graphs that do exactly this, and then you'll see how the principle might be applied. These three are redrawn from a paper by Prof Yorck Olaf Schumacher, one of the leading anti-doping experts in sport, and a man who has worked extensively with Olympic athletes, and now anti-doping agencies. The paper, "Performance Profiling: A Role for Sport Science in the Fight Against Doping?" was published earlier this year in the International Journal of Sports Physiology and Performance. If you'd like a copy, as always, just let us know!
Women's discus - introduction of out-of-competition testing
First of all, take a look at the best performance (red line) and the average of the top 20 performances (blue line) in the women's discus event since 1960.
It should be immediately obvious that between 1960 and the late-1980s, the event was in a state of "lift-off". Not only was the best performance improving almost every year, but the average of the best 20 performances was going the same way. Then, in 1988, out-of-competition doping controls were introduced, and so the use of steroids may have declined thereafter, explaining why the event today is on par with where it was in the early 1980s - it has gone backwards, performance wise, and many will say it is now closer to where it should be physiologically. This graph gives you a striking illustration of how doping, and its partial removal (presumably) affect the "limitations" to performance.
Men's distance running
Next, look at the best time and average of the best 20 times for the men's 5,000m and 10,000m events:
I don't think I have to point out the striking change in performance, particularly in the 5,000m event, after the commercial introduction of EPO in about 1990. I'm particularly interested in how the average of the top 20 times each year changes, because the red line, which represents the best performance, and thus only one athlete, might be misleading. But the blue line, that average, very definitely heads downwards, after a period where it had begun to level off. For the top 20 athletes to all improve in a season is suggestive of a systemic change, possibly in training, possibly nutrition, possibly equipment (imagine what swimming's graphs will look like one day!), possibly increased exposure of athletes. Or, quite possibly, doping, and the co-incidental timing of EPO becoming commercially available and this drop-off is quite difficult to ignore.
NOT proof of doping, but a flag for intelligent testing
Is this an indication of EPO use among elite distance runners? We don't know. It could be. But there are many other reasons that may explain why the records fell suddenly. This is the challenge with performance analysis. Please read this before sending in the hate mail and criticizing my cynicism, because I must emphasize that this kind of analysis does NOT prove doping! As Schumacher states in the paper, there are many other factors that could explain why performance suddenly improves, so one must be careful not to infer doping without acknowledging a wide range of potential contributing factors.
A limit to performance? Cycling may be an easier ask...
Therefore, this graph, or any other, does not constitute proof that athletes doped. What it does do is help us to understand performance better - is it possible that we can draw a dotted line on the graph to indicate where performance ends and doping MIGHT begin? Probably not (at least for now), but that is where this is headed. For cycling, I believe it is easier, and when you look at the climbing power outputs of Tour de France champions (shown again below), and then ask what the implications of riding at 6 W/kg are for the physiology, then I believe it is feasible to say that riding at a relative power output above about 6 W/kg for longer than 30 minutes raises doubts over physiological credibility (particularly when this is repeated day after day). This cycling case is intriguing, and warrants a post all of its own, which I will do when there is more time, perhaps after the IAAF World Champs.
The practical use of this information
The other application of this historical profiling is to highlight how certain athletes can be identified on the basis of performances that stand out, and then more intelligent testing can be done to confirm or refute the notion that doping is involved. One of the problems in the above graphs is that they represent a combination of many athletes each year, and the appropriate use of this kind of testing requires that an individual be tracked from year to year. I will say, for example, that in defence of the men's 5000m runners, that many young African athletes arrive in Europe for the very first time, having never set foot out of their village, and run sub-13 minutes. You'd have a difficult time convincing me that these young athletes (often juniors) are doping to break 13 minutes, and therefore I would propose that they are capable of 12:50 or faster.
Also, I know many of the coaches and scientists who work with the athletes out of Nairobi, and I do believe in their integrity and approach to doping - they are adamantly against it. So there are clean athletes among those "20 best performances each year", I have no doubt. But equally, I'm sure there are some "dirty" performances as well - only testing will prove which is which, and that is why intelligent testing is required, and performance analysis might help us understand what we are seeing a little better.
To conclude - intelligent testing is the aim
I have little doubt that the most emotive retort to this argument that "exceptional performances" should be targeted on the basis that they may lack physiological credibility is that we are too cynical and don't "believe" - this approach was once famously used on the Champs Elysees to criticize those who had doubted a champion's credibility. We "don't believe in dreams", would be the charge...Unfortunately, it is partly true, and the climate within most sports almost compels us to react with suspicion when great performances are noted. In the words of Bengt Kayser, "open your ears and eyes and think" when it comes to doping and doping controls!
However, those who are clean would welcome this approach, because as long as it is done sensibly, it vindicates them and then everyone is a winner. It does not mean every great athlete is a doper, or should be targeted simply because they perform exceptionally, but rather that analysing great performances gives us every opportunity to test sensibly, and that benefits everyone.
I leave you with a quote, straight out of the paper in IJSPP by Prof Schumacher:
"A new approach could involve monitoring the rate of improvements in competition performance of an athlete from an early age, in combination with monitoring of blood values or steroid profiles once an appropriate level of competition is reached. Although sudden increases of performance can be induced by many reasons other than doping (improved training strategies, nutrition, growth in young athletes, etc), such observations are nevertheless worthwhile to trigger target testing of the athlete. In connection with data from blood and/or urine profiling, such “performance profiling” might improve the identification of suspicious athletes’ behaviors. In a similar context, mathematical analyses of winning patterns of gamblers are used with success to identify cheaters in casinos."
We'll be discussing this over the next week for sure, beginning with our interview tomorrow, as promised, so join us then!
Ross
I've always felt that statistical analysis was a weapon against doping. Long before the wider world became aware of the huge doping saga, particularly that of the nineteen eighties, statisticians could have pointed to athletes on the basis of circumstantial evidence. Only a few did. But the problem was (and is), of course, that statisticians, like us viewers, form part of the whole scene. But we are talking about athletics here, where we measure time, distance and height. We correct for wind and altitude. At least in top events like world championships, we have almost identical circumstances. So what a treasure! It's not always easy, as the case of Usain Bolt proves. I'm still not convinced after his relatively blurring year.
Where to look if you still have the sense that something isn't right? Personally I think the hinge could be his extraordinary sprint stamina. Certainly that was the case with the late Florence Griffith-Joyner. But again, this is speculation (not the fact of sprint stamina, but how ...) and you wind up pretty soon in circumstantial matters. How could Pamela Jelimo also run such 800m times when she could hardly put up a decent long sprint? That's why, IMO, it's so extremely important that we saw two faces of Jelimo in 2008 and 2009. Did she or did she not expose herself, or will she in the future? But in cycling it will be more difficult. Yes, if we analyzed all the SRM files of the last two decades. How should one approach the 9.572 200m sprint of Sireau, when he lost quite distinctly last Saturday to Chris Hoy again, who in his turn stunned the world in Beijing? Yes, these are confusing years, but interesting. Sometimes you wind up forgetting to watch the exploits themselves.
"and then ask what the implications of riding at 6 W/kg are for the physiology, then I believe it is feasible to say that riding at a relative power output above about 6 W/kg for longer than 30 minutes raises doubts over physiological credibility (particularly when this is repeated day after day)."
Hmmmm. I'm not so sure about that conclusion, at least not for all elite cyclists. 6W/kg for ~ 30-min (repeated)? Sure. e.g. a rider with a VO2 Max of 83ml/kg/min, with an efficiency of 23% riding at 90% of VO2 Max will do 6W/kg. None of those numbers are outside the realms of natural plausibility for athletes of that class.
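The arithmetic behind that example can be sketched as follows. One assumption is mine and not stated in the comment: each millilitre of oxygen consumed releases roughly 20.9 J of energy (about 5 kcal per litre). The function and constant names are hypothetical.

```python
# Assumption: ~20.9 J of metabolic energy per ml of O2 (about 5 kcal per litre).
JOULES_PER_ML_O2 = 20.9


def sustainable_power_w_per_kg(vo2max, fraction, efficiency):
    """Estimate external power (W/kg) from VO2max (ml/kg/min), the fraction
    of VO2max that can be sustained, and gross efficiency (0..1)."""
    vo2 = vo2max * fraction                              # ml/kg/min in use
    metabolic_w_per_kg = vo2 * JOULES_PER_ML_O2 / 60.0   # J/kg/min -> W/kg
    return metabolic_w_per_kg * efficiency               # share reaching the pedals


# The rider from the comment: VO2max 83, riding at 90%, 23% efficient.
print(round(sustainable_power_w_per_kg(83, 0.90, 0.23), 2))  # 5.98
```

Under that assumption the numbers in the comment do come out at just under 6 W/kg. The result scales linearly with each of the three inputs, so the whole argument turns on which values of sustainable fraction and efficiency are realistic.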
As for power meter files (which is probably a bit OT), well one better have an excellent understanding of the technology, as power meter data can readily be wrong if the technology is not used correctly or data poorly interpreted if one is not well versed with the nature of the data collection/recording tools/methods.
Think I'm wrong - that the Pros know what they're doing with the power meters? Then ask Garmin (& Saris), whose riders (e.g. Wiggins) had power meter data posted from the Monaco ITT that made no sense. Then the power meter supplier had to admit they had the wrong setting enabled on the team's equipment after it was pointed out. Oops. You'd think they'd have realised that before posting it.
Do their mechanics calibrate the meters? Do the riders check the torque zero?
All one would end up doing is introducing "doping" to power meter data.
Of course it's a little ironic that the early adopters of power meters in the Pro peloton were Lemond and Riis.
@alex
Hi Alex,
I entirely agree on the importance of calibration issues with the power meters. But I think the point made in the post is that some kind of performance measure might help to evaluate the credibility of performances. As you know better than most of us, there are pretty accurate models to predict power output, especially in cycling, which puts certain findings into perspective. It's not about the exact wattage, it's about a certain range. And if you see 6 W/kg for 30 minutes, it raises certain doubts.
In a great practical statistics course I attended, we learned about an interesting paradox in statistics. We learned it in terms of sampling many dilute water samples in search of a contaminated one, but it applies to ALL testing.
B1=Tested individual is using a banned substance.
B2=Tested individual is not using a banned substance.
A=Positive test of randomly selected individual from a target population.
Assume:
P{B1}=0.01 (1 in 100)
P{B2}=0.99
P{A|B1}=0.95, meaning the probability of a positive test result, given that the randomly selected individual is a doper, is 0.95, which would give 5% false negatives.
P{A|B2}=0.01, or 1% false positives.
Then P{B2|A} (the probability of innocence given a positive result) is
$$P\{B_2 \mid A\} = \frac{P\{A \mid B_2\}\,P\{B_2\}}{P\{A \mid B_1\}\,P\{B_1\} + P\{A \mid B_2\}\,P\{B_2\}}$$
Plugging in our sample numbers above, you get 0.51, meaning 51% of positive results are false positives! Trouble. 1 doper in 1000 gives a false positive share of 0.91. 1 doper in 10 gives a false positive share of 0.087. Still trouble if you don't want big court battles with the almost 1 in 10 people you get a positive test on.
Obviously, this is overly simplistic and actually pretty obvious, but what it illustrates is that profiling works! There is a reason we test the elite winners and not the losers. The population of winners (or at least elite contenders) has a much larger P{B1}. If you are going to do random testing, you really should do some targeting, or "smart" testing to avoid the use of the dirty profiling word.
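For anyone who wants to check those figures, the calculation is a direct application of Bayes' theorem and can be sketched in a few lines. The function name and parameter names are hypothetical; the test characteristics (95% detection, 1% false positives) are the illustrative values from the comment, not real anti-doping test data.

```python
def prob_clean_given_positive(prevalence, sensitivity=0.95, false_positive_rate=0.01):
    """P{B2|A}: probability that an athlete is clean given a positive test.

    prevalence          -- P{B1}, share of the tested population that dopes
    sensitivity         -- P{A|B1}, chance a doper tests positive
    false_positive_rate -- P{A|B2}, chance a clean athlete tests positive
    """
    p_doper = prevalence
    p_clean = 1.0 - prevalence
    true_positives = sensitivity * p_doper
    false_positives = false_positive_rate * p_clean
    return false_positives / (true_positives + false_positives)


# The three prevalences mentioned in the comment.
for prevalence in (0.001, 0.01, 0.1):
    print(prevalence, round(prob_clean_given_positive(prevalence), 3))
# 0.001 0.913
# 0.01 0.51
# 0.1 0.087
```

The only thing that changes between the three rows is the prevalence in the tested population, which is exactly the quantity that targeted testing tries to raise.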
Simple stats with some implications that back up the use of profiling... heartless! Oh well.
Hi Frans
Thanks, as always, for the comments. I know you've only recently started posting, but you always bring some valuable insight, so I wanted to say thanks for taking the time to read and then comment, not just here, but always!
I'm not sure if I commented on your last post regarding Semenya and Jelimo, but in case I didn't, I agree 100% with you regarding those two. It will be interesting to see what happens in the next few weeks - it is difficult to see how they would not do the testing, but obviously gender issues are very, very sensitive! Time will tell...
Regarding this one, you are correct, profiling each athlete would produce interesting data to inform subsequent tests. The problem is all these other factors - I just look at my own performances, as mediocre as they are, and it strikes me that I have huge variations because of injury, lack of motivation, illness and so on! So I suppose the same would apply to anyone, which is why this performance profiling is a guide to intelligent testing, not the proof by itself!
Regards
Ross
Hi Alex
Thanks as always for the comment. I think it's certainly vital to standardize how power output is measured. It's almost a given that it would have to be done properly, otherwise as you say, things will rapidly become ridiculous. And I can definitely believe you regarding how the pro teams would either err or deliberately change settings, but I don't see that as an insurmountable problem. The principle is still correct.
And also, as someone posted after you, the measurements are just a guide for testing. In my opinion, if you have proper settings and functioning equipment (which is really a given), and you make all the necessary CONSERVATIVE assumptions, then you can still derive tremendous value because when you find that the physiology is still "impossible" despite all these conservative assumptions, you have a very serious flag.
I am planning a proper post on this in due course - after the IAAF World Champs.
Then lastly, I agree that 20 minutes at 6W/kg is possible, but only for 20 minutes. If you go up to 45 minutes, and you then say that elite guys ride at 90% of VO2max, then you have a problem. Unless you disagree that cyclists are able to ride at 90% of max for 35 minutes. I think it's closer to 85%, which changes the implications. Because then your guy with a VO2max of 83ml/kg/min has to have an efficiency of around 26%, or his VO2max must be 90ml/kg/min. I don't think those are possible, especially when you consider the inverse relationship between VO2max and efficiency (Lucia et al.) In fact, I don't know if I've ever seen anything above 23% and a VO2max over 80ml/kg/min at the same time (though I might be wrong - I can go back and look at this again).
But I reckon 6W/kg is the absolute max for 20 minutes, beyond this, you start to become suspicious. And when you look at guys riding at maybe 6.2W to 6.5W for 40 minutes, then you have a serious problem with credibility.
What do you think is the sustainable intensity for that duration of 40 minutes? I believe 90% is too high, though I'll concede that a lot depends on the protocol you use to measure the VO2max in the first place. For example, I've had experts say to me that 85% is too high, others say 90%.
Your thoughts?
Ross
Hi Energetich20
Thanks for that! Strangest thing, I'd just read a book explaining much the same thing about testing and the chances of a false positive, in the context of stats paradoxes.
So, I've actually just sent an email through to a colleague with the anti-doping panels asking for some insights, so that I can do a post on this in the future, explaining the sensitivity of the tests and so forth!
So when I do that, I'll come back to your illustration and its implications!
Thanks!
Ross
What do you do with a performance like Bob Beamon's long jump in Mexico City? It seems that any huge leap (no pun intended) in performance could be due to legitimate factors (altitude, adrenaline, good day) and that cheaters will claim that their extraordinary performance is due to such factors. So where does that leave us with identifying "super performances" as cheaters?
I think the conclusion that the estimated data for Contador is correct, because the estimate matched Sorensen's reported actuals, overstates the accuracy: Sorensen is a 70kg rider, thus nearly exactly matching the assumption of the model. Contador isn't; his power is therefore substantially lower.
It's not that I doubt the attempt to use the data and make projections, it's the overclaiming of accuracy and relevance that I find problematic. The "normalized" 70/78kg readings have been willfully misapplied by many people to make cases that may be overstated.
Similarly, the graphs of data presented in the current article suffer from baseline/start truncation, with a start chosen at 1989 that can lead to misrepresented conclusions.
It would perhaps be more appropriate to start back a decade earlier, and include a linear regression line of the performance trends. For example, from the data shown, the best and average running times look as though by 2000 we've continued a progression from the early data, with outliers during the worst of the EPO era.
TBV
I'm not sure that the graphs in your article between the 5,000 and 10,000m track events and the cyclists power output/kg should be included in the same post without significant caveats. However, I'm a cardiologist and not a sports physiologist, so I may be talking out of my a__.
First, the track results are directly measured results (the analogy in the medical literature would be a "hard endpoint" such as mortality -- it certainly isn't hard to adjudicate that!) and are not estimates of performance as in the cyclist results. Second, if my assumptions are correct, the track events in the graph are stadium-based and not road events, and occur in an unchanging ovoid competition area, whereas the cycling results are from a venue that changes from day to day and from year to year. The climbs upon which the estimates are based have varying gradients and differing atmospheric conditions, and they may occur at a different point in the race from year to year. Finally, in the cycling graph itself, are we comparing apples to oranges? Armstrong's 2004 numbers are from an ITT of an isolated Alpe d'Huez and not numbers produced on a climb at the middle or end of a long stage, as I'm assuming (perhaps wrongly) the other riders' estimates are. It would be fairer to make the graph of performances on the exact same stage year to year, but alas this is not possible.
Hi there guys, I am new to your blog and have been following it with interest over the Tour de France, and have found the doping and physiological performance issue fascinating. With reference to your cycling power graph, I was interested to see Pantani's figure as the highest up until 2001. I recalled a figure from Matt Rendell's book "The Death of Marco Pantani" regarding his Hb and Hct and looked it up. On page 172 is a copy of his FBC from 1995, smack in the middle of the "EPO epoch" as it were. His Hb was 20.8 and his Hct an eye-watering 60! Given that in a Danish study (a touch ironic) EPO was found to improve performance by between 15-20%, we could take 18% off his figure, which would make it a much more believable 5.412 W/kg.
We know now that the modern cyclist cannot get away with Hcts above 50, and even if they approach that figure on a sustained basis, the biological passport, looking at trend and reticulocyte activity, will "bust" them. What then are modern cyclists doing, and how are these astonishing figures becoming apparent?
PS am a sports doc and GP in the UK but I went to Wits. I am a rower by trade and a cyclist by necessity and yes, we used to beat UCT (mostly!) Really enjoy your blog, keep up the good work
John
You say, "However, those who are clean would welcome this approach."
This would be true only if they trust the testing and criteria used for those tests. Why should we?
Hi, Ross:
You are starting your discussion (which is a main point for this article) with the following statement:
"so there are two categories of performance analysis that can be used to "flag" suspicious performances:
1. Detecting performances lacking physiological "credibility"
2. Historical analysis to detect the rate of performance change in individuals"
I have to agree that your point about the ability of sports science methods to serve as performance predictors is very valid. However, in order to do so we (exercise science at large) MUST step out of "measurement" mode. After all, our science is about ways to assist athletes and coaches to achieve maximum results and performance. Yes, we need to understand the subject. However, just measuring it is not enough. The important component often missing is the training strategy which leads to the studied condition, or, in other words, the optimum adaptation progression of the functions responsible for energy supply. The rate of progression at its maximum is pretty much constant (at least in our studies) and therefore, knowing the strategy, you can pretty much predict and monitor performance, since it should stay within an individual corridor (assuming that one can control adaptation of the leading functions at the level of the individual adaptation threshold).
In your article you are discussing and comparing the performance of athletes with different preparedness structures and different training strategy histories. The absolute values we measure are transitional and only valid as long as they are compared to the individual progression curve and its limits. Understanding such limits and drivers of performance over time may offer a solution to the stated problem.
The last point: power meters measure EXTERNAL POWER only. By definition the values are hard to compare, because the same wattage can be achieved through different methods (technique and cadence; the Concept 2 rowing ergometer is a perfect example of that, and the Concept 2 is a much more precise and calibrated instrument than a cycling power meter).
Just a couple of thoughts, for whatever they're worth... :-)
G'day guys
Plenty of nourishing food for thought here. I would just like to point out that for athletes returning to competition a very large improvement in measured VO2 max in three months is actually quite easy as the example below illustrates.
For example, imagine a good runner who stopped running when they left college. At college the runner’s body mass was 60kg, and their absolute measured VO2 while running on the treadmill was 4500ml/min, giving a specific VO2 max of 75 ml/kg/min (4500ml/min ÷ 60kg).
Now imagine that, after leaving college, the runner keeps running three or so times a week for half an hour or so (so maintains basic fitness), but too much alcohol and too few miles take their toll. So, over the next five years the runner packs on 20 kilos to give them a body mass of 80kgs. (Happened to me).
Five years later then, the ex–competitive runner’s absolute measured VO2 while running on the treadmill has dropped from 4500ml/min to 4000ml/min giving a specific VO2 max of 50 ml/kg/min (4000ml/min ÷ 80kg).
Horrified, the ex-competitor decides to return to competition.
If he gives up the booze and the late nights, trains hard over the next three months and gets his absolute VO2 max back up to 4500ml/min and his body mass back down to 60 kg, then he will achieve a 50% improvement in his specific VO2 max over that period (from 50 ml/kg/min back to 75 ml/kg/min).
PS Should have explained that, in my example, I assume that the ex-competitor's absolute VO2 measured on the treadmill dropped by roughly 10% over his five years of low grade exercise after leaving college, hence from 4500ml/min down to about 4000ml/min.
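The returning-runner example is simple division, but writing it out shows where the gain comes from. This is a sketch using only the figures given in the comment; the function name is hypothetical, and the "weight loss alone" line is an extra illustration that the commenter did not calculate.

```python
def specific_vo2max(absolute_ml_per_min, body_mass_kg):
    """Relative VO2max in ml/kg/min from absolute VO2max and body mass."""
    return absolute_ml_per_min / body_mass_kg


college = specific_vo2max(4500, 60)     # 75.0 ml/kg/min at college
detrained = specific_vo2max(4000, 80)   # 50.0 ml/kg/min five years later
regained = specific_vo2max(4500, 60)    # 75.0 ml/kg/min after three months

improvement = (regained - detrained) / detrained
print(f"{improvement:.0%}")             # 50%

# Extra illustration: losing the 20 kg with no change in absolute VO2max.
weight_loss_alone = specific_vo2max(4000, 60)
print(round(weight_loss_alone, 1))      # 66.7
```

On these numbers most of the 50% jump comes from the change in body mass, so a rate-of-change flag based on relative VO2max would need to take body mass history into account.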
Hi Ken
Good question. Here's my attempt at an answer: That record would have to be met with suspicion given the current climate of the sport. In 1968, it probably didn't warrant the suspicion, today it does. If anyone doesn't think that, they're kind of wilfully naive.
However, suspicion does NOT constitute condemnation or judgement of a doped record - rather, in the current model, it is a flag for further testing. If that testing does NOT produce a positive finding, then you have to say that the record is clean. The point is, great performances are not dismissed as doped records, they are flags for subsequent testing, that's all.
On the note of long jump, it's very difficult to see an athlete doping for a benefit in long jump. For speed, yes, but a guy like Beamon did not even feature in the top few sprinters in his country - he was different to a guy like Carl Lewis, or even Joyner-Kersee. Therefore, my gut reaction is that long jump, and high jump, probably, are not events where doping is going to cause massive gains in performance.
And then finally, a once-off performance like that, when achieved in one, single moment, is probably less likely doped than sustained high level performance above a "physiological" limit. In other words, the fact that Beamon was so freakish is actually a suggestion that it's not doped. To me, the real flag is when an athlete can sustain a level that raises doubt - that's the suspicious performer - and you find them in cycling more than anywhere else, though it's also easier to measure in cycling.
Thanks!
Ross
Hi Prospero
No, you have a point, but I still think the comparison from year to year is valid. What you are looking at in that cycling graph is an average power output on each of the final climbs of that year's Tour. So it is an average made up of 4 to 6 climbs for each data point.
You're quite right that there are variations - in weather, in road, in stage, in tactics, but by averaging, you still handle much of that variability and produce a number that still has great meaning for the purposes of this kind of analysis. The debate goes over whether the number is valid - there are thoughts that the model used to calculate these numbers overestimates the power by between 2 and 5%. So you might need to lower all numbers by about 0.2W/kg.
But the point is, there is a physiological limit to what is possible on the bike - in order to ride at 6.5W/kg for that length of time, a cyclist must have a VO2max of over 95ml/kg/min, AND have a very high efficiency, and this simply does not occur. Therefore, the only conclusion is that IF guys are riding at that level, they are doing so beyond physiological means.
At least, that's the principle, I'm sure it's subject to many criticisms!
But to sum up, you're right, the direct comparison doesn't work, but taking an average and then applying conservative assumptions makes the data very useful. I want to do a detailed post on this in the future, to clarify a lot of it, so I'll leave it for now!
Thanks!
Ross
Hi John
I've read that book, it's terrific. You're right, the state of the sport up to about 2005 was really bad. I do think it is improving; we'll be running an interview with one of the guys involved in doping control later today where that point is made a little more.
However, as to what they are doing now, I don't know. I've no doubt it's not clean yet. It's better, but not fixed, so to speak. I don't believe the figures from this year's Tour are credible, though many will criticize that assertion as cynical and pessimistic. I have it on authority that there are signs of doping, but they don't quite know what, just yet. I'm always reminded that guys like Bernard Kohl and Danilo di Luca got away with doping for two years under the biological passport system, and eventually got nabbed, possibly when they made mistakes, who knows?
Time will tell, maybe more will be caught.
Ross
To anonymous
Yes, you're right, but I also said in the post that one has to be sensible and clear in how you attempt to do this, because it can be a very effective means to understand the problems of doping and how testing might be done better.
So I don't see this as an insurmountable problem. If you directly measure power output on the bike during the Tour, and make sure that the setup of those power meters is properly done, then you'll have very reliable data which have very profound implications for physiology. If your power meter shows that you produce 6.6W/kg for 40 minutes, then you're doing something that is almost physiologically impossible - either your VO2max is 100ml/kg/min, or your efficiency is 27% or higher.
Either way, we can test for that, and clear your name, assuming you as the rider have nothing to hide. It's quite simple to set up a protocol that is "conservative" and gives every benefit of the doubt to the athlete. That's all that's required.
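The "either/or" in that statement can be sketched as a trade-off between VO2max and efficiency. The Python below is only an illustration: the oxygen energy equivalent (20.9 J/ml) and the 85% sustained fraction are assumptions, not values given in the comment.

```python
# Assumed constant: roughly 20.9 J of energy per ml of oxygen consumed.
JOULES_PER_ML_O2 = 20.9


def required_efficiency(power_w_per_kg, vo2max, fraction_of_vo2max):
    """Gross efficiency needed to produce a power at a given VO2max."""
    # Oxygen actually being used at the sustained intensity (ml/kg/min).
    vo2 = vo2max * fraction_of_vo2max
    # Metabolic power available from that oxygen uptake (J/s per kg).
    metabolic_power = vo2 * JOULES_PER_ML_O2 / 60
    return power_w_per_kg / metabolic_power


# 6.6 W/kg for 40 minutes, assumed to be ridden at 85% of VO2max
for vo2max in (80, 85, 90, 95, 100):
    eff = required_efficiency(6.6, vo2max, 0.85)
    print(vo2max, f"{eff:.1%}")
```

With these assumed inputs, a VO2max of 100 needs an efficiency of roughly 22%, while a VO2max in the low 80s needs roughly 27%, which is the shape of the trade-off described above.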
Ross
Hi Colenso
Yes, you're quite right. Couple of things though - I think in the context of doing this kind of performance "tracking", you'd be testing elite athletes, and so they would all be in your "sample" when already trained. For example, you'd test a rider at the start of the season, say in April, and then again in July. But more than this, you would first monitor performance and then verify using the tests.
For example, a guy rides a climb lasting 40 minutes at 6.5W/kg. That means his VO2max is either 100ml/kg/min, or his efficiency is up near 27%. Both are highly suspect, so he might be flagged for testing, because there is something in the physiology that you would pick up in a controlled environment.
Then also, your example is fair, but the weight loss is unrealistic - you have the guy losing 20kg in three months...? I'm not sure that's possible, at least healthily...
But, in principle, you're right. I don't think it would happen that fast though - maybe 12 months.
Ross
The post states: "But the blue line, that average, very definitely heads downwards, after a period where it had begun to level off. For the top 20 athletes to all improve in a season is suggestive of a systemic change." I challenge both statements. First of all, I don't see any marked difference in the rate of change of the average times, and I would like to see some kind of statistical test used to defend the statement that the slope change is significant. Second of all, having the average of 20 athletes improve is not at all the same as having the 20 top athletes all improve -- which would certainly be suspicious. Some improved, some got worse, and on average times improved. In fact the comparison is more invalid than that because the 20 one season are different than the 20 the next. Perhaps none of the 20 improved but a new lot of improved athletes came along.
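One simple way to approach the statistical test asked for here is to fit separate straight lines before and after a chosen break year and compare their slopes. The Python sketch below is an assumption about how that could be done, not the method used in the post; the function names are hypothetical, the break year has to be chosen in advance, and each segment needs at least three data points with some scatter.

```python
import math


def slope_and_se(xs, ys):
    """Least-squares slope of ys on xs and its standard error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se


def slope_change_statistic(years, times, break_year):
    """Difference in slope (after minus before) divided by its standard error."""
    before = [(x, y) for x, y in zip(years, times) if x < break_year]
    after = [(x, y) for x, y in zip(years, times) if x >= break_year]
    b1, se1 = slope_and_se(*zip(*before))
    b2, se2 = slope_and_se(*zip(*after))
    return (b2 - b1) / math.sqrt(se1 ** 2 + se2 ** 2)
```

A statistic far from zero (as a rough guide, beyond about 2 in absolute value) would suggest the slope really changed at the break. With only a handful of yearly averages the test has little power, and it says nothing about whether the same athletes are in the top 20 each year.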
@asher
Hi Asher,
I agree with the limitations you outlined in your comments, but how do you explain that the "biased" selection that you mentioned as a potential reason for the pattern coincides so strikingly with the commercial introduction of EPO, and levels off with the introduction of the EPO urine test?!
Hi to the last two
On that point, I don't think anyone would dispute that there are some assumptions. In fact, I even pointed out the same issue in the post - I said that these numbers represented "a combination of many athletes each year, and the appropriate use of this kind of testing requires that an individual be tracked from year to year."
So it's acknowledged.
The anonymous poster is right though - there is still something there, a flag that raises suspicions, which is what this post was about.
Regarding the statistical test - I did point out that the 10,000m chart showed less of a drop than the 5,000m one, so I acknowledge (again) the same point you're making in your comment - it was in the post already.
So I agree with the anonymous poster that there is a striking pattern - yes, the stats would prove interesting, but I don't think the principle relies on it in this case.
Ross
Ross, thanks for the reply and the lively discussion. As I've posted earlier, I'm an interventional cardiologist and not trained in the sports sciences. I would've e-mailed you directly if your address was available, but I just wanted to ask if you had a reading "curriculum" for cycling performance. For my own education, I'd like to read the original studies upon which the "ceiling" of cycling performance was determined. Do you have any recommendations? A bibliography would be good, as I'm at an academic medical center and can access the full articles easily. Thanks.
Prospero
Hi Prospero
No problem - thank YOU for your contribution and posting to the lively discussion!
There are a few good studies on cycling, particularly recently. I don't know of any looking at the "ceiling" of performance - as I said, that's something I'm trying to look at now and I'm working with a scientist from France on a paper that I'd like to submit. I will probably post a few ideas on the site in a little while.
But for the best cycling studies, check out an author called Lucia. He was the guy with the Spanish Banesto team, did some great work on the physiological requirements of the Tour de France about four or five years ago. There is one called "Inverse relationships between VO2max and cycling efficiency" that is very informative. The reference is Med Sci Sports Exerc. 2002 Dec;34(12):2079-84.
Then more recently, Yorck Schumacher has been involved in some awesome studies measuring power output during the Tour and Giro. In fact, in my very latest article (click "home"), I've actually got an interview with Yorck, and right at the bottom, I have included links to the abstracts of some of his studies, and they'd make the best starting point, I believe.
Each one is applied, and it will have references to other papers, so it should be a good start to the "curriculum".
I hope that helps you get to some good reading!
Ross
Thanks for your considered reply to my post, Ross and Jonathan.
I take your point about the rapid weight loss in my example. I agree that on the face of it, the runner shedding his excess 20kg in three months might appear to be unhealthy, given current advice about weight loss to the general population. When I advise others on this topic, I advise them to limit their average weight loss to 0.5kg to 1kg per week, to allow their body to adjust. But that’s for lifelong fatties. For the ex-athlete, by contrast, it's quite surprising just what abstention from alcohol and the reintroduction of regular weekly competition will rapidly achieve. For example, coming out of the rugby season, from the end of May to the end of August (the UK track season), the only time of the year in the UK that I ran seriously, my “rugby body mass” would drop easily from about 75kg to my “runner’s body mass” of 62.5kg - and in those days I didn’t drink at any time of the year.
Years later, as a much older runner, merely from giving up the booze and increasing my weekly run frequency from three to five, my BM dropped from its high point of 80kg back down to 65kg in two to three months - even without my returning to serious competition.
So perhaps for those of us who are "natural athletes" (the so-called mesomorphs amongst us?), rapid readjustment of our body mass is relatively easy and not as unhealthy as it might be for the rest of the population.
Dear Ross,
You constantly argue that riding at 90% VO2max is physiologically impossible. Please allow me to be highly sceptical because there is ample scientific evidence about this value.
For example in cross-country skiing, Rusko (2003) states: "During 15, 30, and 50 km ski-races the fractional utilization of VO2max is 95, 90, and 85%, respectively." At the top level for men, these races last ~35, 75 and 120 minutes respectively. It would thus be possible to ski at 90-95% VO2max for ~30-90 min.
In running, Billat et al. (2003) reported a similar range, with Kenyan women marathon runners running at ~85% vVO2max, whereas male 10K runners run 10km races at 93-95% vVO2max. This is vVO2max, which is probably lower than %VO2max. Again, this suggests that top athletes can run at 90-95% VO2max for ~30-90 minutes.
In cycling, the review paper "The science of cycling" (Faria et al., 2005) states that successful professional cyclists have an LT2 of ~90% VO2max, LT2 being defined as the exercise intensity eliciting a lactate concentration of 4mmol/L, which could be an underestimation of the intensity that can be maintained for less than an hour, depending on the individual. In the tables from this same paper, one can see that values range up to 93.5% VO2max. Therefore, an exceptional individual such as L.A., who does not have an off-the-chart VO2max, must have an off-the-chart fractional utilization of VO2max, and a value of 93-94% would not be surprising, and even expected... These numbers would change your calculations by quite a bit.
Hi Felix
There is as much evidence that self-paced exercise lasting 40 minutes is done at less than 90%. In fact, most suggest 85%, and for exercise lasting 20 minutes, the relative intensity is about 90%. Note the difference is that you're basing intensity on a theoretical construct of LT2 (for cycling), not actual measurements, which does introduce more variability. And I would suggest a big part of the variance is the method used to measure the VO2max. A great deal depends on the protocol, and the same can be said for LT1 and LT2. So I acknowledge that evidence, but there is as much going the other way, which should also be acknowledged. In fact, this highlights one of the problems with this method.
Regardless, even at 93% of VO2max, the ability to ride at 6.2W/kg is still limited, unless the athlete has a VO2max above 85 AND an efficiency above 24%. That combination, as far as I know, has never been documented, since VO2max is generally inversely related to efficiency (on this note, there is also little agreement on efficiency - the Spanish research produces DE values of 26%, most produce GE values of 22 to 23%).
Armstrong's highest measured efficiency (if you believe the controversial Coyle figures) is 23%, with a highest VO2max of 81 ml/kg/min.
The highest sustainable power output for this physiology, even at 93%, is 6.2W/kg. Mathematical modelling of Armstrong's time-trial up Alpe d'Huez in 2004 estimated his power output at 6.9W/kg. That would require him to ride at 100% of VO2max. Is any cyclist that exceptional? Either that, or the cyclist's VO2max must be in the mid-90s.
There is data from thousands of cyclists, many elite and World champion level, that fails to show that they ride above 5.9W/kg for more than 20 minutes. Much of it has been published. And yes, lab based testing is likely to underestimate race performance, but somehow you still have to find 10%. All I am saying is that it is highly suspicious, and lacks physiological credibility. It's not proof, but it really does raise a flag.
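The reasoning in the last few paragraphs can be sketched in Python as below. The oxygen energy equivalent of 20.9 J/ml is an assumption (the comment does not say which value was used), so the output lands near, but not exactly on, the 6.2W/kg quoted.

```python
# Assumed constant: roughly 20.9 J of energy per ml of oxygen consumed.
JOULES_PER_ML_O2 = 20.9


def sustainable_power(vo2max, gross_efficiency, fraction_of_vo2max):
    """Mechanical power (W/kg) from VO2max, efficiency and sustained fraction."""
    vo2 = vo2max * fraction_of_vo2max                 # ml/kg/min in use
    metabolic_power = vo2 * JOULES_PER_ML_O2 / 60     # J/s per kg
    return metabolic_power * gross_efficiency


def required_fraction(power_w_per_kg, vo2max, gross_efficiency):
    """Fraction of VO2max needed to produce a given power."""
    vo2_needed = power_w_per_kg / gross_efficiency * 60 / JOULES_PER_ML_O2
    return vo2_needed / vo2max


# VO2max 81 ml/kg/min and 23% efficiency, as quoted above, ridden at 93%
print(round(sustainable_power(81, 0.23, 0.93), 2))
# Fraction of VO2max needed for the modelled 6.9 W/kg
print(round(required_fraction(6.9, 81, 0.23), 2))
```

Under this assumption the first figure is roughly 6.0W/kg and the second is a little over 1.0, that is, slightly more than 100% of VO2max, which is consistent with the argument being made.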
Cheers!
Ross
The Welsh girl, Becky James, 17 years and 257 days old (in one photo on the BCF website she even looks pubescent, no offence of course), rises to the occasion. From 200m in 11.82s at the Revolution Meeting in January to an 11.093s new Junior World Record today, 8 months later. From 36.677s at the same Revolution Meeting to a 500m TT in 35.286s in July in Minsk and a 500m TT in 35.784s for a silver medal last Tuesday.
That's why I wrote the next comment on the Cycling Weekly website.
Modern Track Cycling needs a thorough performance analysis in order to cope with certain rising issues. The avalanche of progress of certain riders in recent years is, IMO, getting out of proportion. I have lost my unconditional belief in outstanding performances.
I'm curious if they accept this comment.
In a certain way it's rather sad that one has to think this after watching some sports quite closely for quite a few decades.
On Cycling Weekly I didn't want to name the riders, but in particular I meant British Track Cycling (quite vague but so be it), Taylor Phinney, Kévin Sireau and Bauge at Moscow recently, and of course Becky James.
Don't Andrew's comments and your response to them apply equally to the problems of drawing conclusions based on what's shown on the power to weight TDF chart? Even if there are several climbs analysed together for most of these plots, there are still a lot of unknowns (the cyclists' actual weight at the time of each climb, wind, etc which you discuss) where assumptions have to be made. Especially if you want to draw conclusions from the single climb (Armstrong Alpe d'Huez) shown. Possibly your follow-up post will have aggregated climb data from tours since 2001, too. Maybe showing error bars to account for the likely variation in results based on assumptions made for unknowns would put the data in perspective.
"Andrew said...
…. The thrust of this article seems to be that hard conclusions about the result of V's calculation cannot be drawn because there are many variables that may affect the result….
30 JULY 2009 12:18 AM
Ross Tucker and Jonathan Dugas said...
... The uncertainty is the problem, and until this can be reduced through direct measurement and known parameters, it will always be a grey area!...
30 JULY 2009 11:08 AM "
Dear Felix
Maybe the athletes on whom the Rusko and Billat publications were based were doping?
Hi Rupert
Yes, you're partly right. The average of four to six climbs, which is represented in that graph, will definitely reduce the error though. The main source of that error is that the power estimations don't take into account the wind speed and drafting. For obvious reasons, you'd overestimate power when the wind is from behind, and underestimate it when from the front.
Now, on a single climb, I can see that a prevailing wind direction may exist. I'm not convinced, because most climbs have a few switchbacks and enough changes of direction that the effect may be negated anyway, but a prevailing wind is reasonable. Taking 6 climbs, it's very difficult to see how overall, "in the wash", so to speak, that would not become less of a factor. Sometimes it would be a headwind, others a tailwind.
It's an imperfect science, of course, and you're 100% right, the variability should be shown. Showing error bars is unlikely to sort it out, though, because you'd be guessing as to their size (what error is it? 5%, 10%?).
However, there's no doubt that the average of 6 climbs is likely to be closer to the actual value than a single climb, particularly when you look at trends over 10 years. It's pretty clear the average climbing power is 6.3W/kg or higher, and I don't think this is physiologically credible.
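The averaging argument can be illustrated with a short Python sketch. It assumes the wind-related error on each climb is independent and of similar size, which is a simplification, and the per-climb error fed into it is hypothetical since, as noted above, its true size is unknown.

```python
import math


def random_error_of_average(single_climb_error, n_climbs):
    """Random error left after averaging n climbs, assuming the per-climb
    errors are independent and similar in size."""
    return single_climb_error / math.sqrt(n_climbs)


def corrected_average(estimates_w_per_kg, model_overestimate=0.0):
    """Average several climb estimates, then remove an assumed systematic
    overestimate (e.g. 0.03 for 3%), which averaging alone cannot remove."""
    mean = sum(estimates_w_per_kg) / len(estimates_w_per_kg)
    return mean / (1 + model_overestimate)


# A hypothetical 0.3 W/kg random error on one climb, averaged over 6 climbs
print(round(random_error_of_average(0.3, 6), 2))
```

The point of splitting the two functions is that headwinds and tailwinds tend to cancel with more climbs, whereas a model that consistently overestimates power does so on every climb and has to be corrected separately.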
Ross
To the last anonymous poster:
Touché! You're 100% right, they may well have been! You have discovered a perfect circular argument!
Regards
Ross
Out of date (obviously) but this is the result of the qualifying from the UK 2003 Junior sprint champs:
1st Mark Cavendish 11.387s (21-05-85) 18+
2nd Matthew Crampton 11.448s (23-05-86) 17+
3rd Richard Morton 11.679s
4th Neil Cooper 11.855s
5th Jason Kenny 11.985s (22-03-88) 15+
Most of the time such comparisons aren't appropriate, but doesn't Becky James' sprint of 11.093s, although at the famous Krylatskoje velodrome, look quite remarkable?!
Dear Ross,
I do respect your competence in your field of study, but as a scientist I find it annoying that you reply to my comment, in which I cite three well-respected researchers, with general statements without proper citations. I could understand that LT2 is an artifact of modeling, but vVO2max is surely not, and between the papers I cited (amongst others) the values are consistent. Are you implying that Rusko, Billat and Faria are out of track???
I get your point about 23% efficiency and VO2max at 93%, but again, where does that calculation come from? I reread the Coyle story and can't find the proper citation. My guess is that the formulae you use abundantly come from a line passing through a cloud of data points with a correlation coefficient of 0.6-0.75. Hence, lots of uncertainties.
I understand that you are running a blog, not a scientific journal. I also believe that the use of performance enhancing drugs is a shame and must be fought. However, I stand firm that rigour is required when implying that an individual uses drugs. As Mr. Schumacher replied to one comment on the next subject, there are athletes out there who win clean and they deserve our greatest respect. Surely these individuals will have extraordinary capacities in order to win against dopers!
I didn't read the last posts about my comments before posting the last one. Well, if they were all doped, it would indeed be quite a circular argument! I think this is at the edge of paranoia, however. In Faria et al's study, he used professional cyclists, and I could buy that most of them were doped. You will never make me believe that Kenyans are all doped, however. The top runners are always different from year to year and it is just impossible that there is efficient (in the sense of never having been detected) systematic doping in such a poor country... As for Rusko, he studied skiers from a country where systematic doping is known to have taken place, so it would be possible, but he presents tables about juniors as well, which makes it less likely. The bottom line is that if we can't rely on studies about top-level athletes because they are suspected of doping, then we have no idea of the physiological characteristics of champions. One thing is for sure: they have to be much higher than the characteristics of "well-trained endurance athletes"... And using the latter as a proof of doping is also quite a circular argument...
Hi Felix
Thanks for the reply, and your balanced view. Firstly, I do apologize for not citing sources better, I should have been clearer in where my position came from. Blaming time will seem trite, but I genuinely responded to that comment in a two minute window before I had to rush off to a meeting, so I wrote as I thought. Apologies.
To address that: I'm not implying they're out of track, but I am saying that my experience, and knowledge of testing and working with elite and sub-elite cyclists is that the sustainable power output is between 85% and 90% VO2max. That is from experience, and from that of colleagues who have done literally thousands of tests and trials on athletes. I realise that "unpublished observations" do little to "prove" anything (this data must be published), however, I'll dig around and find this evidence for you now that the weekend is here and I have some time to search properly.
To comment on the study I know best, the Billat one, if you look at the figures, the vVO2max is actually very slow - 22.7km/h and 21.6km/h for the two groups. We've done studies in our laboratory where athletes with 10km best times of 29:30 reach speeds of 22.5km/h. Billat's subjects were much faster (28:50 for one group). That difference is a function of the test used, because Billat used 3 minute stages, ours is a 1 minute stage and so the ramp is greater, leading to a higher peak speed. The high peak velocities come from a PhD study currently being written up; the protocol is in Noakes, 1990.
The point is, vVO2max is very variable, depending on the method used. And so the calculated value of 95% may be a function of an underestimate of the velocity at VO2max. Also, in their method, they define vVO2max as "the lowest running speed maintained for more than one minute that elicited VO2max". I'm not sure exactly what the implication of this is for their value - would the vVO2max be greater if this definition was simply the average velocity or the highest velocity? I suspect so, and then the relative intensity falls a lot - point is, the measurement of a vVO2max is variable.
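To show how sensitive the relative-intensity figure is to the measured vVO2max, here is a small Python sketch. It uses the 29:30 and 22.5km/h values mentioned above; the lower 21.5km/h value is purely hypothetical, chosen to show the effect of a protocol that yields a slower peak speed.

```python
def race_speed_kmh(distance_km, minutes, seconds=0):
    """Average race speed in km/h from distance and finishing time."""
    hours = (minutes + seconds / 60) / 60
    return distance_km / hours


def fraction_of_vvo2max(race_kmh, vvo2max_kmh):
    """Race speed expressed as a fraction of the velocity at VO2max."""
    return race_kmh / vvo2max_kmh


pace = race_speed_kmh(10, 29, 30)   # a 29:30 10km runner
# Same race pace against two different measured peak speeds
for vvo2max in (22.5, 21.5):        # 21.5 is a hypothetical value
    print(vvo2max, f"{fraction_of_vvo2max(pace, vvo2max):.1%}")
```

With these inputs the same 10km pace works out to roughly 90% of the higher peak speed and roughly 95% of the lower one, so a 1km/h difference in the measured vVO2max is enough to move an athlete across the range being debated.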
Note also that none of these studies have actually measured VO2 during the performances - they have extrapolated based on speeds and performances during max tests. This makes the timing of testing vitally important, and I have read the paper again, but still find no mention of the timing of the 10km trials in relation to the VO2max tests. I can't even figure out whether Billat has made the subjects do a VO2max test AND 10km trial, or just the max test. If it's just the max test, as it seems to be, then there is a major problem, because those 10km best times are measured during peak season, whereas Billat's VO2max and vVO2max are measured at the start of the season in April. The result of this is that the 10km running speed will be much higher relative to vVO2max than it should be. I'd bet that these same athletes, tested 3 months later, would have VO2max values 2 to 4% higher, and vVO2max values 5% higher. I'm not 100% sure about Billat's comparison, then.
Finally, just to note, I appreciate that you respect the science, and also that this is not a journal. That doesn't excuse a lack of scientific stringency (for which I have apologized for leaving out citations), and I agree that we must be rigorous. However, I also believe I am being, based on my experience and that of colleagues - I am not one to give compulsive and reckless opinions. As I said, now that the weekend is here, I hope to be able to dig up those citations...
Ross
AAAhhh! The Devil's always in the details! Which you cannot know if you are not a specialist in that field of study. Thank you very much for this in-depth insight. Sorry for being a pain. I've heard so (too) many scientists in my own field of study just stating things they take as facts, but that are just their perception of what has been really published. Consequently, I am very slow and prudent before accepting one's view. And hell, this is your fault. You decided to do such an interesting blog:)
Hi Felix
Please don't apologize! Your comments are exactly the ones we love to get, and you're just the person we enjoy having as a reader, because you bring a perspective we lack, as a result of your own experiences. This discussion around relative intensities has been enormously valuable for me as well!
Like it or not, every scientist interprets information according to their "model", so it's all perception, to some extent. But like I said, as long as it can be debated, then knowledge advances.
So I don't know which area of science you're from, but your "slow and prudent" nature serves you well, it's valuable, so again, don't apologize! For all we know, those athletes do race at 93% of VO2max - it's never been measured, only extrapolated, but hopefully the extrapolation is as accurate as possible! That's where debate helps!
Thanks for reading!
Regards
Ross
Very interesting blog and discussion. I am not an athlete or scientist, just a fan of the Tour each year and Track and Field as well. I admit that I am totally inspired by Lance and started watching for him. But having said that, I have often wondered, how is it possible that he does it so well, so often, so "off the charts" at times? I hope it is skill, will and outstanding physical, natural performance alone. I believe in testing and find the doping issue intriguing. Just finished reading Game of Shadows, about BALCO (steroid scandal) and all that business. The performance, the science and even the "trickery" is what's most interesting to me. And I love when a cheat is revealed with certainty....but then I wonder about testing methods, measures, validity and all that..... OMG, it's got me hooked! Great blog and discussions. I'll keep reading.
LauriM
Hi there, I know it's not directly related to the original post but it's still relevant - I just saw Cycling News and Bernard Kohl's latest. He says his manager bribed lab workers to test samples repeatedly so they could work out a micro-dosing regime that would test negative - this makes sense and it wouldn't surprise me if lab workers had been bribed to do a whole lot more!! This is where performance analysis and the biological passport need to be used in conjunction with each other. If performances appear to be out of this world then they generally are - target said rider and you will eventually get him - witness Di Luca.
John