The use of data analytics to improve running form and reduce risk factors in middle to long distance runners

Files

Vermeulen_Data_2018.pdf (49.88 MB)

Date

2018

Authors

Vermeulen, Euodia

Publisher

University of Pretoria

Abstract

Fitness trackers equipped with accelerometers and global positioning systems are becoming more popular among the running community. These devices allow runners across the spectrum of athletic abilities to monitor their running metrics and track their performance throughout their chosen routes. The size of the data sets and the frequency at which they are generated place the tracking data from these devices into the realm of big data. There are calls from research fields focused on human locomotion during running to capitalise on the data from fitness trackers, in order to evaluate athletes in the real world and outside of the sometimes unrealistic laboratory or clinical settings. Unfortunately, the real world adds noise to the data and the signal from the data becomes obscured. This dissertation explored the large tracking data sets from runners' running watches to evaluate the extent of the noise and the possibilities to extract the signal from the data. Data are cleaned, and parametric as well as non-parametric regression models are fitted to the data to find interactions and aggregation methods that present the athlete with a picture of his/her running form. These models may provide an athlete with a better understanding of their own capabilities, which will help them improve their running form and reduce risk factors attributed to poor form. Results from the interaction models between running surface, cadence and pace suggest that the running surface does have an effect on cadence and running pace. However, the distribution of pace per cadence level is extensive and skew in either direction, with the adjusted R² values for the fitted models ranging between a weak 0.155 and a moderate 0.752 for the four case studies. The spread for road gradients (i.e. slopes) per cadence level is large and also skew in either direction. The adjusted R² values for the interaction models for slope, cadence and pace range between 0.268 and 0.681.
The data visualisations for graded running are able to show the pattern of the data to a limited extent. The aggregated distribution curves for cadence and pace serve as an extension of the interaction between running surface, cadence and pace. Although all the distribution curve models had an R² value very close to 1, the generalised additive model outperformed the shape constrained model, with lower AIC scores, in fitting a smoothed line that represents the overall performance of the athlete. The shape constrained models failed to pick up segmented improvements in the running metrics, whereas the generalised additive models did pick up the changes in the slope of the curves where the athlete's performance improved. The data from fitness trackers seem to hold potential to extend sport science research in running; however, the data may not always be a true representation of reality. This may be due to its varying veracity and slow algorithm responses to changes in performance.

Description

Dissertation (MEng)--University of Pretoria, 2018.

Keywords

UCTD

Citation

Vermeulen, E 2018, The use of data analytics to improve running form and reduce risk factors in middle to long distance runners, MEng Dissertation, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/71044>

URI

http://hdl.handle.net/2263/71044

Collections

Theses and Dissertations (University of Pretoria)
Theses and Dissertations (Industrial and Systems Engineering)
