Abstract:
There is a need to estimate the potential financial, epidemiological and societal impact that diseases, and their treatment, can have on society. Markov processes are often used to model diseases in order to estimate these quantities of interest, and they have an advantage over standard survival analysis techniques in that multiple events can be studied simultaneously. The theory of Markov processes is well established for processes whose parameters are known, but less of the literature has focussed on estimating the transition parameters. This dissertation investigates and implements maximum likelihood estimators for Markov processes based on longitudinal data. The methods are described for four observation schemes: processes for which all transitions are recorded exactly, processes whose state is recorded at equidistant time points, processes whose state is recorded at irregular time points, and processes that are each observed at possibly different irregular time points. Methods for handling right censoring and for estimating the effect of covariates on the parameters are described. The estimation methods are implemented by simulating Markov processes and estimating the parameters from the simulated data, so that the accuracy of the estimators can be investigated. We show that the estimators can provide accurate estimates of state prevalence if the process is stationary, even with relatively small sample sizes. Furthermore, we indicate that the estimators are not accurate in estimating the effect of covariates on the parameters unless state transitions are recorded exactly. The methods are discussed with reference to the msm package for R, which is freely available and a popular tool for estimating and implementing Markov processes in disease modelling.
Methods are mentioned for the treatment of aggregate data, for diseases where the state of a patient is not known with complete certainty at every observation, and for diseases where patient interaction plays a role.
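
As an illustration of the simulate-then-estimate approach in the simplest observation scheme (all transitions recorded exactly), the following is a minimal Python sketch and not the dissertation's implementation or the msm package's API. It assumes a time-homogeneous process, for which the standard maximum likelihood estimator of the intensity from state i to state j is the number of observed i-to-j transitions divided by the total time spent in state i. The function names, the three-state illness-death intensity matrix and the follow-up time are all hypothetical choices made for the example; follow-up ending at `t_end` plays the role of right censoring.

```python
import random


def simulate_path(Q, start, t_end, rng):
    """Simulate one fully observed path of a time-homogeneous Markov process.

    Q[i][j] is the (assumed) transition intensity from state i to state j;
    diagonal entries are ignored. Returns a list of (time, state) pairs,
    starting with (0.0, start). Observation stops at t_end.
    """
    path = [(0.0, start)]
    t, state = 0.0, start
    while True:
        rates = [q if j != state else 0.0 for j, q in enumerate(Q[state])]
        total = sum(rates)
        if total == 0.0:  # absorbing state: no further transitions
            break
        t += rng.expovariate(total)  # exponential sojourn time
        if t >= t_end:  # follow-up ends before the next transition
            break
        state = rng.choices(range(len(Q)), weights=rates)[0]
        path.append((t, state))
    return path


def estimate_intensities(paths, n_states, t_end):
    """MLE for exactly observed paths: transitions i->j / time spent in i."""
    counts = [[0] * n_states for _ in range(n_states)]
    time_in = [0.0] * n_states
    for path in paths:
        for (t0, s0), (t1, s1) in zip(path, path[1:]):
            time_in[s0] += t1 - t0
            counts[s0][s1] += 1
        t_last, s_last = path[-1]
        time_in[s_last] += t_end - t_last  # time at risk up to end of follow-up
    return [
        [
            counts[i][j] / time_in[i] if i != j and time_in[i] > 0 else 0.0
            for j in range(n_states)
        ]
        for i in range(n_states)
    ]


# Hypothetical illness-death model: 0 = healthy, 1 = ill, 2 = dead (absorbing).
TRUE_Q = [
    [0.0, 0.20, 0.05],
    [0.10, 0.0, 0.30],
    [0.0, 0.0, 0.0],
]
T_END = 10.0

rng = random.Random(2024)  # fixed seed so the run is reproducible
paths = [simulate_path(TRUE_Q, 0, T_END, rng) for _ in range(2000)]
estimated_Q = estimate_intensities(paths, 3, T_END)

for row in estimated_Q:
    print([round(q, 3) for q in row])
```

Comparing `estimated_Q` with `TRUE_Q` over repeated simulations, or over different numbers of simulated paths, gives a simple way to examine estimator accuracy in this setting. The other observation schemes described above require likelihoods based on transition probabilities rather than this closed-form estimator.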