The focus of predictive modeling or predictive analytics is to use statistical techniques to predict outcomes and/or the results of an intervention or observation for patients that are conditional on a specific set of measurements taken on the patients prior to the outcomes occurring. Statistical methods to estimate these models include using such techniques as Bayesian methods; data mining methods, such as machine learning; and classical statistical models of regression such as logistic (for binary outcomes), linear (for continuous outcomes), and survival (Cox proportional hazards) for time-to-event outcomes.
A Bayesian approach incorporates a prior estimate that the outcome of interest is true, which is made prior to data collection, and then this prior probability is updated to reflect the information provided by the data. In principle, data mining uses specific algorithms to identify patterns in data sets and allows a researcher to make predictions about outcomes.
Regression models describe the relations between 2 or more variables where the primary difference among methods concerns the form of the outcome variable, whether it is measured as a binary variable (i.e., success/failure), continuous measure (i.e., pain score at 6 months postop), or time to event (i.e., time to surgical revision). The outcome variable is the variable of interest, and the predictor variable(s) are used to predict outcomes. The predictor variable is also referred to as the independent variable and is assumed to be something the researcher can modify in order to see its impact on the outcome (i.e., using one of several possible surgical approaches).
Survival analysis investigates the time until an event occurs. This can be an event such as failure of a medical device or death. It allows the inclusion of censored data, meaning that not all patients need to have the event (i.e., die) prior to the study's completion.