Using Machine Learning to Predict Cardiovascular Disease

Provide the citation and attach a pdf of the article.

Weng, S., Reps, J., Kai, J., Garibaldi, J., & Qureshi, N. (2017). Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS ONE, 12(4), e0174944.

https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0174944&type=printable

What is the abstract of the article?

Background

Current approaches to predict cardiovascular risk fail to identify many people who would benefit from preventive treatment, while others receive unnecessary intervention. Machine-learning offers opportunity to improve accuracy by exploiting complex interactions between risk factors. We assessed whether machine-learning can improve cardiovascular risk prediction.

Methods

Prospective cohort study using routine clinical data of 378,256 patients from UK family practices, free from cardiovascular disease at outset. Four machine-learning algorithms (random forest, logistic regression, gradient boosting machines, neural networks) were compared to an established algorithm (American College of Cardiology guidelines) to predict first cardiovascular event over 10-years. Predictive accuracy was assessed by area under the ‘receiver operating curve’ (AUC); and sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) to predict 7.5% cardiovascular risk (threshold for initiating statins).

Findings

24,970 incident cardiovascular events (6.6%) occurred. Compared to the established risk prediction algorithm (AUC 0.728, 95% CI 0.723-0.735), machine-learning algorithms improved prediction: random forest +1.7% (AUC 0.745, 95% CI 0.739-0.750), logistic regression +3.2% (AUC 0.760, 95% CI 0.755-0.766), gradient boosting +3.3% (AUC 0.761, 95% CI 0.755-0.766), neural networks +3.6% (AUC 0.764, 95% CI 0.759-0.769). The highest achieving (neural networks) algorithm predicted 4,998/7,404 cases (sensitivity 67.5%, PPV 18.4%) and 53,458/75,585 non-cases (specificity 70.7%, NPV 95.7%), correctly predicting 355 (+7.6%) more patients who developed cardiovascular disease compared to the established algorithm.

Conclusions

Machine-learning significantly improves accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment, while avoiding unnecessary treatment of others.

Was the study experimental or non-experimental? Explain, tell us what made that clear.

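Non-experimental. The abstract describes it as a prospective cohort study using routine clinical data, so the researchers analyzed patients' existing records rather than assigning any treatment or intervention. They gave the same input to four different machine learning models and to the established algorithm based on the American College of Cardiology guidelines. They then calculated the results for each model and compared the success of the machine learning models to the success of the established algorithm.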

Was the research qualitative or quantitative? Again, explain.

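Quantitative. The study judged the success of four machine learning models and one established algorithm at predicting whether each participant in the study would develop cardiovascular disease based on their medical data. Each model's success was measured numerically, using the area under the receiver operating curve (AUC) along with sensitivity, specificity, positive predictive value, and negative predictive value.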

What was the population studied? Why do you say that?

The article states that the participants were all “patients from UK family practices, free from cardiovascular disease at outset.”

What sample was used for this study? Explain.

378,256 patients were selected. There were several restrictions that kept patients from being used in the study. “Individuals with a previous history of CVD, lipid disorders which are inherited, prescribed lipid lowering drugs, or outside the specified age range prior to or on the baseline date were excluded from the analysis.” In addition, patients were required to have “complete data for the eight core baseline variables (gender, age, smoking status, systolic blood pressure, blood pressure treatment, total cholesterol, HDL cholesterol, and diabetes).”

What was the method of measurement?

The models were tasked with predicting whether each participant had a greater than 7.5% risk of a first cardiovascular event within 10 years, which is the threshold for initiating statins. 75% of the participants were put into the training group and 25% were put into the validation group. The data from the training group was used to train the machine learning models, and the data from the validation group was used to test how often the predictions were correct, including the percentage of predictions that were false positives and false negatives. Because these measures are counts and proportions with a true zero, the ratio measurement scale was used.
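The split and the 7.5% cut-off described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `split_cohort` and `flag_high_risk`, the seed, and the example risk values are made up here and are not taken from the article, which does not publish its code in this form.

```python
import random


def split_cohort(patient_ids, train_fraction=0.75, seed=0):
    """Randomly split patient IDs into a training and a validation group.

    Hypothetical helper: the 75/25 ratio comes from the article, the
    seed and the shuffling method are assumptions for illustration.
    """
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def flag_high_risk(predicted_risks, threshold=0.075):
    """Mark each predicted 10-year risk that is greater than the threshold."""
    return [risk > threshold for risk in predicted_risks]


# Toy example with 1,000 made-up patient IDs (not the study data).
train_ids, validation_ids = split_cohort(range(1000))

# Made-up predicted risks: 2%, exactly 7.5%, and 10%.
flags = flag_high_risk([0.02, 0.075, 0.10])
```

With these toy inputs, only the 10% prediction is flagged, because the cut-off is "greater than 7.5%" rather than "7.5% or more".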

What was the method of analysis?

The researchers analyzed the prevalence of cardiovascular disease for participants based on several characteristics, such as their ethnic background, where they lived, and what health conditions they had. They also compared the c-statistic (another name for the AUC) of each machine learning model's predictions with that of the established algorithm.
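One way to read the c-statistic is as the chance that a randomly chosen patient who developed cardiovascular disease was given a higher predicted risk than a randomly chosen patient who did not. The sketch below computes it that way by comparing every case with every non-case. The function name `c_statistic` and the four toy patients are assumptions for illustration; the article does not describe its calculation at this level of detail.

```python
def c_statistic(predicted_risks, outcomes):
    """Pairwise c-statistic (AUC) for binary outcomes.

    outcomes uses 1 for a patient who had an event and 0 otherwise.
    Ties in predicted risk count as half a concordant pair.
    """
    cases = [r for r, y in zip(predicted_risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(predicted_risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_risk in cases:
        for non_case_risk in non_cases:
            if case_risk > non_case_risk:
                concordant += 1.0
            elif case_risk == non_case_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Made-up example: two non-cases followed by two cases.
toy_risks = [0.10, 0.40, 0.35, 0.80]
toy_outcomes = [0, 0, 1, 1]
toy_auc = c_statistic(toy_risks, toy_outcomes)  # 3 of 4 pairs are ranked correctly
```

A value of 0.5 would mean the model ranks patients no better than chance, and 1.0 would mean every case was ranked above every non-case. The pairwise loop is easy to follow but slow, so for a cohort this large a library routine would be the practical choice.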

What was the conclusion of the study?

All four machine learning models predicted cardiovascular disease more accurately than the established algorithm. The model using neural networks performed the best, with an AUC of 0.764 compared to 0.728 for the established algorithm, a gain of 3.6 percentage points. It also correctly identified 355 (7.6%) more patients who went on to develop cardiovascular disease.
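The sensitivity, specificity, PPV, and NPV quoted in the abstract for the neural network can be re-derived from the counts it reports (4,998 of 7,404 cases and 53,458 of 75,585 non-cases). The sketch below does that arithmetic. The function name `screening_metrics` is an assumption for illustration, and the false positive and false negative counts are worked out by subtraction rather than quoted from the article.

```python
def screening_metrics(true_pos, total_cases, true_neg, total_non_cases):
    """Derive the four screening measures from the abstract's counts."""
    false_neg = total_cases - true_pos      # cases the model missed
    false_pos = total_non_cases - true_neg  # non-cases flagged as high risk
    return {
        "sensitivity": true_pos / total_cases,
        "specificity": true_neg / total_non_cases,
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
    }


# Counts for the neural network, taken from the abstract.
nn = screening_metrics(4998, 7404, 53458, 75585)
for name, value in nn.items():
    print(f"{name}: {value:.1%}")
# sensitivity: 67.5%
# specificity: 70.7%
# ppv: 18.4%
# npv: 95.7%
```

The low PPV shows that most patients flagged as high risk did not have an event, which is expected when only a small share of the cohort develops the disease.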

Why is this study useful to you? Explain in detail.

I am interested in working with machine learning. This study shows that machine learning can perform better than algorithms that were developed by experts in their fields. Results like this will cause machine learning to be implemented in new areas where it was once believed it wouldn’t be useful.

What would be the next logical step in extending this study?

They could study patients from places outside of the UK, study a greater number of patients, create new models, and continue the study for a period longer than 10 years.

4 thoughts on “Using Machine Learning to Predict Cardiovascular Disease”

Dalton Miller says:

March 7, 2022 at 12:49 pm

Hi Lucas,

I think you chose a great article to review. Using technology in the medical field is a huge advantage that we should continue to build on. This study has taught me a lot about how machine learning and algorithms can be beneficial for finding diseases, compared to the old way. Seeing that this study showed that the machine learning algorithms made a higher percentage of correct predictions, it is imperative that the medical field continues to improve upon machine learning and other AI.

Dain Grimes says:

March 7, 2022 at 1:17 pm

This article was very similar to mine. I feel like when I evaluated mine I was split between qualitative and quantitative. My particular article related to quantitative in a similar fashion of analyzing numbers and data to make better decisions. However, I considered mine to be more qualitative, because it had incorporated new methods of data collection to manage databases. While they both relate to numbers, I can’t help but also look at the significance of a more qualitative approach. Either way, glad to see more health technology being used to increase the field of health.

Amanda Rowen says:

March 7, 2022 at 1:29 pm

This study is very interesting. I was not aware of what exactly machine learning was until reading this article. I would agree this could improve healthcare and how we treat patients, especially those with chronic disease processes or a predisposition to them. This field of study, I am sure, will just continue to grow and advance. It does seem that it will take significant time for growth, since it would be difficult to determine effectiveness without studying the group over an extended amount of time.

klburks says:

March 7, 2022 at 3:15 pm

Hi Lucas! This was a fascinating article. Cardiovascular disease runs in my family, so as I get older it is something I am always wary of. I do have high cholesterol already at age 37 and I am taking a lipid-reducing medication. I was not familiar with machine learning, but I think it is amazing that this technology exists.

s22 Research Methods in Informatics

INF 405 Spring 2022