KOM Informatics: KPower

A Heartrate Based Algorithm For Estimating Power

I see this a lot in my Strava feed: people with one bike (usually a road bike) with a power meter, and multiple bikes (cross bikes, gravel bikes, mountain bikes) without. IMO, to the extent they ride off-road without a power meter, training load is pretty severely underreported. When I look at some of these same cyclists in the context of group road rides I see, IMO, the opposite problem: the estimated average power seems to be too high for the effort.

A couple of the early testers of the KOM Informatics system fit this multi-bike profile and have requested a better solution. Although I don't own any off-road bikes (yet), I still have the problem when my power-meter-equipped Madone is in the shop and I have to rely on my old Madone with no power meter.

So I decided to create an algorithm for KOM Informatics that would predict power from heartrate. It works by using the rides you have uploaded that contain both power and heartrate data (source rides) as a baseline for making predictions. It decides which rides to use for the baseline by matching statistics about the ride the prediction is for (target ride) with statistics about the candidate source rides. In general, then, the more source rides you have uploaded to KOM Informatics, the better the predictions will be.
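To make the idea of matching a target ride against source rides concrete, here is a minimal sketch. It is not the actual KPower implementation: the article does not say which statistics are matched or how the prediction is derived from the baseline, so the `RideSummary` fields, the `distance` measure, the choice of `k`, and the watts-per-beat step are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class RideSummary:
    # Hypothetical summary statistics; the article does not list the real ones.
    avg_bpm: float
    duration_min: float
    weighted_avg_power: Optional[float] = None  # None when the ride has no power data


def distance(source: RideSummary, target: RideSummary) -> float:
    # Assumed similarity measure: relative gaps in average heartrate and duration.
    return (abs(source.avg_bpm - target.avg_bpm) / target.avg_bpm
            + abs(source.duration_min - target.duration_min) / target.duration_min)


def predict_wap(target: RideSummary, uploaded: List[RideSummary], k: int = 5) -> float:
    # Source rides are the uploads that carry power as well as heartrate.
    sources = [r for r in uploaded if r.weighted_avg_power is not None]
    # Baseline: the k source rides whose statistics best match the target ride.
    baseline = sorted(sources, key=lambda r: distance(r, target))[:k]
    # Assumed prediction step: median watts per beat in the baseline,
    # scaled by the target ride's average heartrate.
    watts_per_beat = median(r.weighted_avg_power / r.avg_bpm for r in baseline)
    return watts_per_beat * target.avg_bpm


if __name__ == "__main__":
    uploads = [RideSummary(140, 120, 210), RideSummary(150, 60, 240), RideSummary(120, 90, 150)]
    print(predict_wap(RideSummary(145, 65), uploads, k=1))
```

With more source rides uploaded, a matching step like this has more candidates to choose from, which is one plausible reason predictions would improve as the library grows.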

To evaluate how well the algorithm was working, I built a form that allows selection of a target ride which, in this case, does have both power and heartrate. The form calculates the predicted power (which I am calling KPower) and plots it on a graph against the actual power recorded for the ride. Viewing the graph shows whether interval efforts in the original power meter ride were captured by the estimated watts from the KPower algorithm. The code was later extended to record summary information about each comparison in a database table for later analysis. Here's an example graph:

[Figure: Lambertville Intervals, KPower predicted power plotted against actual power]

Using this method I recorded summary information for 40 target rides for each of 2 different cyclists. Each cyclist had 20 outdoor and 20 trainer rides in their set of test data. The rides were the most recent uploads in each category (Outdoor vs Trainer). Other than this, no criteria were used in selecting target rides. Cyclist 1 had 1286 eligible source rides; Cyclist 2 had 144. I did this a number of times, tweaking the parameters of the algorithm to find out which version yielded the best results. Those best results are directly below.

KPower Accuracy: All Available Source Rides

| Cyclist | Outdoor (O) or Trainer (T) | Average Difference Weighted Average Power Actual vs Predicted (W) | Standard Deviation Actual vs Predicted (W) | Accuracy % |
| --- | --- | --- | --- | --- |
| 1 | O | -7.2 | 7.5 | 95.88 |
| 1 | T | -0.7 | 8.1 | 96.81 |
| 2 | O | 3.6 | 16.5 | 92.57 |
| 2 | T | -15.45 | 25.9 | 88.01 |

Ideally both the [Average Difference Weighted Average Power Actual vs Predicted] and the [Standard Deviation Actual vs Predicted] would be relatively low for both cyclists and both ride categories. A low average difference alone isn't enough. If a set consists of 2 rides, both with a Weighted Average Power (WAP) of 200 W, the predictions could be 100 W (100 watts low) for the first ride and 300 W (100 watts high) for the second, and the average difference would still be 0. Standard deviation measures the amount of dispersion or variation among a set of values, so it exposes this kind of offsetting error. Going by this standard, the trainer rides for Cyclist 2 were considerably off in terms of both average and standard deviation. I did a little digging to find out why.
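The two-ride example can be worked through in a few lines. This is only an illustration of why both columns matter: the sign convention (predicted minus actual) and the use of the population standard deviation are assumptions, since the article does not state which were used for the tables.

```python
from statistics import mean, pstdev, stdev

actual_wap = [200, 200]     # watts, both rides
predicted_wap = [100, 300]  # 100 W low, then 100 W high

# Assumed sign convention: predicted minus actual.
differences = [p - a for p, a in zip(predicted_wap, actual_wap)]

avg_difference = mean(differences)       # the errors cancel out
population_spread = pstdev(differences)  # dispersion over exactly these rides
sample_spread = stdev(differences)       # dispersion with the n-1 correction

print(avg_difference, population_spread, round(sample_spread, 2))
```

The average difference comes out to 0 even though each prediction missed by 100 W, while either form of the standard deviation is at least 100 W and flags the problem.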

It turns out that 9 of Cyclist 2's 20 trainer rides were under 30 minutes in duration, and another 5 were under 1 hour. (I found that accuracy decreased for shorter rides: r(78) = -0.4642, p < .05, statistically significant.) 12 of those trainer rides were part of multiple-ride days and served purposes like ramp testing and cooldowns from intense interval sessions. The ramp test rides were preceded by much easier rides serving as a warmup, and were characterized by a much higher cardiac efficiency (Weighted Average Power / Average BPM) than the source rides. The cooldown rides from interval sessions had the opposite problem, a much lower cardiac efficiency than the source rides, because the cyclist started the cooldown with an extremely elevated heartrate while pedaling easy watts.
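Cardiac efficiency as defined above is a simple ratio, sketched here with made-up numbers. The wattages and heartrates are invented for illustration and are not taken from Cyclist 2's rides; only the formula (Weighted Average Power / Average BPM) comes from the article.

```python
def cardiac_efficiency(weighted_avg_power: float, avg_bpm: float) -> float:
    """Watts produced per heartbeat-per-minute: Weighted Average Power / Average BPM."""
    if avg_bpm <= 0:
        raise ValueError("average BPM must be positive")
    return weighted_avg_power / avg_bpm


# Invented illustrative rides (not data from the article).
typical_source_ride = cardiac_efficiency(200, 145)    # ordinary ride in the baseline
ramp_test_after_warmup = cardiac_efficiency(250, 150)  # high watts relative to heartrate
cooldown_after_intervals = cardiac_efficiency(100, 140)  # easy watts, heartrate still elevated

print(round(typical_source_ride, 2),
      round(ramp_test_after_warmup, 2),
      round(cooldown_after_intervals, 2))
```

A heartrate-based prediction built from rides like the first one would tend to come in low for the second and high for the third, which matches the direction of the problems described above.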

Since pretty much everyone has power of one kind or another on trainer rides nowadays, I think the trainer results should be given much less weight in judging how well the KPower algorithm works. I included them in the interests of completeness and transparency, and also in the hope that they might reveal something interesting about how the algorithm works in the real world. One lesson I learned from Cyclist 2's trainer rides is that any time the algorithm loses context because a day's riding activity is divided into multiple rides, accuracy suffers. The same type of situation can occur outdoors as well: low cardiac efficiency because of nerves in a warmup ride before a race, or low cardiac efficiency on a cooldown ride after a race.

The issues surrounding Cyclist 2's trainer rides are examples of a more general class of issue known as a confounding variable. A confounding variable is one that is not accounted for in a prediction like the KPower prediction, but that can act as an external influence and change the outcome of the prediction. A number of these confounding variables can be involved in KPower predictions.

To mitigate the effects of confounding variables in KPower calculations, we'll educate users about the situations where they arise, and provide a mechanism in the software to adjust KPower wattage prior to uploading a ride. This mechanism wasn't used for any of the tests referred to in this article; in other words, the adjustment was 0.

[Figure: KPowerAdjustment]

KPower: Results Differences Between Cyclist 1 & 2

Sharp-eyed readers may have caught the differences in accuracy between Cyclist 1's and Cyclist 2's results. Perhaps some of the difference can be attributed to the difference in the number of source rides. But I did a little digging and found that accuracy decreased for shorter rides (r(78) = -0.4642, p < .05, statistically significant). Cyclist 2's average test ride was 58:20 vs 2:06:49 for Cyclist 1. It may be that the presence of confounding variables in the shorter rides accounts for this effect rather than ride duration itself.

I also calculated a Watts Choppiness Index for each of the test rides. This is a measure of the degree of watts fluctuation from moment to moment. A ride that is mostly easy, but is punctuated by many short anaerobic efforts, will have a high Watts Choppiness Index. I found that a higher Watts Choppiness Index was a moderately strong predictor of a less accurate prediction (r(78) = 0.4191, p < .05, statistically significant). Cyclist 1's average Watts Choppiness Index was 64.24 vs 78.133 for Cyclist 2. The takeaway here is that KPower predictions for watts-choppier rides will be a little less accurate than those for smoother rides.
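For readers who want to check the significance claims, the notation r(78) means a Pearson correlation with 78 degrees of freedom, i.e. 80 rides (40 per cyclist). A correlation can be converted to a t statistic with the standard formula t = r * sqrt(df / (1 - r^2)) and compared against a critical value. The sketch below does that; the critical value of roughly 1.991 for a two-tailed test at p = .05 with 78 degrees of freedom is taken from standard t tables, and the function name is just illustrative.

```python
import math

# Approximate two-tailed critical value of Student's t at p = .05, df = 78.
T_CRITICAL_05_DF78 = 1.991


def t_from_correlation(r: float, df: int) -> float:
    """Convert a Pearson correlation to its t statistic: t = r * sqrt(df / (1 - r^2))."""
    return r * math.sqrt(df / (1.0 - r * r))


def is_significant(r: float, df: int = 78, critical: float = T_CRITICAL_05_DF78) -> bool:
    return abs(t_from_correlation(r, df)) > critical


for label, r in [("ride duration", -0.4642), ("Watts Choppiness Index", 0.4191)]:
    t = t_from_correlation(r, 78)
    print(f"{label}: r = {r}, t = {t:.2f}, significant = {is_significant(r)}")
```

Both reported correlations give t statistics of roughly 4 to 5 in absolute value, well past the threshold, which is consistent with the p < .05 statements above.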

KPower: Effect Of Only 20 Source Rides

People new to the KOM Informatics system won't have the benefit of many source rides in the system before they start uploading KPower rides. So we investigated the effect of limiting each cyclist to only 20 source rides before making predictions for the same target test sets. Results are as follows:

KPower Accuracy: Limit Of 20 Source Rides

| Cyclist | Outdoor (O) or Trainer (T) | Average Difference Weighted Average Power Actual vs Predicted (W) | Standard Deviation Actual vs Predicted (W) | Accuracy % |
| --- | --- | --- | --- | --- |
| 1 | O | -8.8 | 9.8 | 94.95 |
| 1 | T | 5.05 | 9.5 | 95.82 |
| 2 | O | 6.7 | 23.6 | 90.19 |
| 2 | T | -7.45 | 24.1 | 89.75 |

Compared to the results in the KPower Accuracy: All Available Source Rides table above, there was less than 1% difference in average accuracy (All Available: 93.32%, Limit 20: 92.67%). When a ride gets uploaded without power, there will be a validation check for at least 20 eligible source rides before calculating KPower; this should be enough to ensure good accuracy.
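A validation check of this kind could look something like the sketch below. It is a guess at the shape of the rule, not the production code: the dictionary keys `has_power` and `has_heartrate` and the function names are hypothetical, and "eligible" is taken to mean a ride that contains both power and heartrate data, as source rides were defined earlier.

```python
MIN_SOURCE_RIDES = 20  # threshold stated in the article


def is_source_ride(ride: dict) -> bool:
    # Hypothetical ride record: eligible only with both power and heartrate data.
    return bool(ride.get("has_power")) and bool(ride.get("has_heartrate"))


def can_calculate_kpower(uploaded_rides: list) -> bool:
    """Return True when enough eligible source rides exist to attempt a prediction."""
    eligible = sum(1 for ride in uploaded_rides if is_source_ride(ride))
    return eligible >= MIN_SOURCE_RIDES


if __name__ == "__main__":
    rides = [{"has_power": True, "has_heartrate": True}] * 19
    print(can_calculate_kpower(rides))  # one ride short of the threshold
    rides = rides + [{"has_power": True, "has_heartrate": True}]
    print(can_calculate_kpower(rides))  # threshold met
```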

KPower: Comparison To Strava's Estimated Power

I got curious about how well the KPower predictions would stack up against Strava's estimated power. The selection of rides for this comparison wasn't random, because I wanted a blend of certain types of rides. My opinion was that the Strava formula tended to underestimate power for rides which involved off-road effort, and overestimate power for road group rides. So I included 1 mountain bike ride and 2 road group rides (1 of these was a road race). One ride, Battenkill, included mixed terrain and mixed group/solo efforts. The remaining 6 rides were solo road rides where, IMO, Strava estimated power does pretty well. I picked one of this latter group (Readington Loop) because it involved a number of Z6 and Z7 efforts (WattChoppinessIndex: 72.26) and KPower was 21 watts off on it, so I was particularly curious about how Strava's formula would handle it.

I used a third-party tool called FitFileRepairTool both to strip the power and to advance all of the timestamps on each of the rides. The latter step is necessary to avoid getting flagged for submitting a duplicate ride in both systems. Then I just uploaded these rides to both systems; the power-based rides were already there for comparison. The results are as follows:

| Ride Title | Weighted Avg Power, Power Meter (KOM Informatics) | Weighted Avg Power, KPower | Accuracy %, KPower | Weighted Avg Power, Power Meter (Strava) | Estimated Avg Power, Strava | Accuracy %, Strava Estimated |
| --- | --- | --- | --- | --- | --- | --- |
| CRCofA 4 Lap Race "A" Group | 253 | 256 | 98.82 | 243 | 300 | 81.00 |
| Lambertville Tempo | 222 | 225 | 98.66 | 215 | 209 | 97.20 |
| Neighborhood Stroll | 190 | 169 | 88.94 | 183 | 180 | 98.36 |
| Tour Of The Battenkill (Men's Cat 5) | 244 | 235 | 96.31 | 235 | 240 | 97.91 |
| PFW "A" Ride | 215 | 215 | 100.00 | 206 | 242 | 85.12 |
| Flat Z2 (Remembering How To Balance & Pedal) | 180 | 178 | 98.88 | 173 | 180 | 96.11 |
| Z2, Z3 Canal + Woodfern | 211 | 214 | 98.59 | 207 | 212 | 97.64 |
| Canal - Hillsborough Loop (1x8, 1x4 VO2Max) | 227 | 215 | 94.71 | 220 | 207 | 94.09 |
| Chimney Rock MTB | 153 | 171 | 89.40 | 138 | 106 | 76.81 |
| Readington Loop (Took it easy and cut it short after brake issue) | 208 | 187 | 89.90 | 192 | 148 | 77.08 |

KPower was more accurate than Strava estimated power for 8 of these 10 rides. Overall, KPower had a 95.421% accuracy rate vs 90.132% for Strava estimated power. Both algorithms performed well on the 6 solo road rides: KPower 94.94% accuracy, Strava estimated 93.41% accuracy. For the 4 rides which involved group efforts and/or mixed surface conditions, KPower achieved 96.13% accuracy and Strava estimated achieved 85.21%.
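The summary figures can be reproduced from the per-ride accuracy columns of the comparison table. The snippet below is just a convenience for checking the arithmetic; the grouping flag marks the 6 solo road rides, with the race, the group ride, Battenkill and the mountain bike ride making up the other 4. Ride titles are shortened for brevity.

```python
from statistics import mean

# (ride, KPower accuracy %, Strava estimated accuracy %, solo road ride?)
RIDES = [
    ("CRCofA 4 Lap Race", 98.82, 81.00, False),
    ("Lambertville Tempo", 98.66, 97.20, True),
    ("Neighborhood Stroll", 88.94, 98.36, True),
    ("Tour Of The Battenkill", 96.31, 97.91, False),
    ("PFW A Ride", 100.00, 85.12, False),
    ("Flat Z2", 98.88, 96.11, True),
    ("Z2, Z3 Canal + Woodfern", 98.59, 97.64, True),
    ("Canal - Hillsborough Loop", 94.71, 94.09, True),
    ("Chimney Rock MTB", 89.40, 76.81, False),
    ("Readington Loop", 89.90, 77.08, True),
]


def average_accuracy(rides):
    """Return (mean KPower accuracy, mean Strava estimated accuracy)."""
    return mean(r[1] for r in rides), mean(r[2] for r in rides)


overall = average_accuracy(RIDES)
solo_road = average_accuracy([r for r in RIDES if r[3]])
group_or_mixed = average_accuracy([r for r in RIDES if not r[3]])
kpower_wins = sum(1 for r in RIDES if r[1] > r[2])

for label, (kpower, strava) in [("overall", overall), ("solo road", solo_road),
                                ("group/mixed", group_or_mixed)]:
    print(f"{label}: KPower {kpower:.3f}%, Strava estimated {strava:.3f}%")
print("rides where KPower was more accurate:", kpower_wins)
```

The solo road KPower mean works out to about 94.947%, which matches the 94.94% quoted above if the figure was truncated rather than rounded.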

Takeaways

Version 1 of the KPower algorithm achieved its design goals. The KPower heartrate-based algorithm can accurately predict power (95.421% accuracy rate across the 10 comparison rides) on everything from smooth roads to technical off-road terrain, for both group and solo rides. It was more accurate than Strava's estimated power algorithm on the rides selected. KPower predictions for watts-choppier rides will likely be a little less accurate than those for smoother rides, although this is likely an issue for any estimated power algorithm. The effects of confounding variables, which might otherwise rob any heartrate-based power prediction of accuracy, are partially mitigated through a select list which lets the user adjust the wattage the KPower algorithm emits.
