The HBCP Prediction Tool

The HBCP Prediction Tool uses semantically enhanced machine learning to predict smoking cessation rates in scenarios specified by users. The machine learning draws on our annotations of around 500 RCTs of smoking cessation interventions, covering around 1,000 intervention descriptions.

We believe that this tool is the first of its kind in behavioural science: it uses machine learning on a dataset of study reports annotated with an ontology to predict the likely outcomes of behavioural interventions in novel scenarios, combining information about the intervention content and delivery, the target population, the setting and features of the studies.

As an initial prototype, it provides a proof of concept that this approach can work, giving researchers, policy-makers and intervention designers important insights into intervention effectiveness and drawing on a much richer database of information than is found in traditional systematic reviews.

Some of the key limitations of the prediction tool to date are:

  1. The data only includes intervention features that were documented in the published papers. We know that these descriptions are incomplete and this can compromise the accuracy of prediction.
  2. Because the features coded were not specifically randomised on, there will likely be statistical confounding between features, so that we cannot guarantee that the apparent impact of a particular intervention component is not attributable to some other component that it happens to be associated with.
  3. Because the data come exclusively from randomised trials and reporting of trials is often selective, we cannot assume that the trials represent all the information that has been collected. For example, if negative findings have been selectively excluded from the body of published literature, that will lead to an overestimate of the impact of some intervention components.
  4. In smoking cessation trials, people lost to follow-up are typically regarded as having gone back to smoking. The validity of this assumption will differ across study populations and methods. This means, for example, that studies that typically involve less engagement with the participants (e.g. trials of digital interventions) will have lower reported success rates because of higher loss to follow-up, as illustrated in the sketch below.
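
To make the fourth limitation concrete, here is a minimal sketch (not part of the tool itself; the numbers are purely illustrative and not taken from the HBCP dataset) of how counting participants lost to follow-up as smokers lowers the reported cessation rate as loss to follow-up increases.

```python
# Illustrative only: the "missing = smoking" convention in smoking cessation trials.
# All figures below are hypothetical.

def reported_quit_rate(n_randomised, n_lost, quit_rate_among_followed_up):
    """Quit rate when everyone lost to follow-up is counted as still smoking."""
    n_followed_up = n_randomised - n_lost
    n_quit = n_followed_up * quit_rate_among_followed_up
    # Denominator includes participants lost to follow-up, who count as smokers.
    return n_quit / n_randomised

# Two hypothetical trials with the same 30% quit rate among participants reached,
# but different loss to follow-up (e.g. a face-to-face vs a digital intervention).
print(reported_quit_rate(1000, 100, 0.30))  # 10% lost -> 0.27 reported
print(reported_quit_rate(1000, 400, 0.30))  # 40% lost -> 0.18 reported
```

Under this assumption the second, higher-attrition trial appears less effective even though the quit rate among participants who were actually reached is identical.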

You can find more details of the prediction tool and its development and evaluation here.

And you can find a 10-minute video giving instructions on using the prediction tool here.

Special thanks to Anna Kleinau for upgrading the user interface.