Despite the "information rich" nature of the healthcare business, not all data are mined to reveal hidden patterns or knowledge that would enable sound decision-making. There is a growing need for data mining to discover new information from different datasets and apply it in the healthcare system, helping healthcare professionals make decisions when predicting the diagnosis and treatment of diseases such as cardiovascular diseases (CVDs). Cardiovascular diseases are currently estimated to account for 30% of global mortality each year. Predicting cardiovascular disease at the right moment is challenging for cardiac consultants. The medical industry can benefit from classification and prediction models that aid the effective use of medical data. This capstone project aims to improve CVD prediction using a publicly available database on cardiovascular disease, the Kaggle Heart Disease database.

Following the steps of the CRISP-DM methodology, several machine learning classification models for cardiovascular disease diagnosis will be used, based on input attributes such as cholesterol, gender, blood pressure (BP), fasting blood sugar (FBS), recorded ECG changes, and other related attributes. Four data mining classification techniques will be used: (1) Logistic Regression, (2) Decision Tree, (3) Random Forest, and (4) Support Vector Machine.
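
As an illustration only, the comparison of the four classifiers named above might be sketched as follows with scikit-learn. This is not the project's actual code: the data are a synthetic stand-in generated with `make_classification` (the Kaggle Heart Disease file is not loaded here), and the feature count, sample size, train/test split, and hyperparameters are all assumptions chosen for the sketch.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 300 records, 13 numeric attributes,
# and a binary target meaning "disease present / absent".
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Hold out 25% of the records for evaluation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four techniques named in the abstract. Scaling is applied to the
# two models that are sensitive to feature magnitude.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
}

# Fit each model and record its accuracy on the held-out records.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
```

With real patient data, accuracy alone would likely not be enough; metrics such as recall for the disease class may matter more, since a missed diagnosis is usually costlier than a false alarm.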

In conclusion, our capstone demonstrates the potential of data analysis and machine learning techniques to predict heart disease. This approach could provide a more accurate and non-invasive method for detecting heart disease in individuals, which could lead to earlier treatment and better patient outcomes.

Publication Date


Document Type

Master's Project

Student Type


Degree Name

Professional Studies (MS)

Department, Program, or Center

Graduate Programs & Research (Dubai)


Sanjay Modak

Advisor/Committee Member

Ehsan Warriach


RIT Dubai