Heart Disease Data Set
Creators:
Janosi, Andras; Steinbrunn, William; Pfisterer, Matthias; Detrano, Robert
Publication Date:
1988
Data Category:
Dataset Description:
The Heart Disease database is widely used in the medical research community, particularly for studies of cardiovascular conditions. It combines data from four databases: the Cleveland Clinic Foundation; the Hungarian Institute of Cardiology in Budapest; the V.A. Medical Center in Long Beach, California; and the University Hospital in Zurich, Switzerland. Each patient record contains 76 attributes, but most research has focused on a subset of 14 key attributes used to diagnose the presence of heart disease. The dataset is small: each database contains at most a few hundred records (the Cleveland database, for example, includes 303 instances), so it can be analyzed without significant storage resources. The data were collected over several years, primarily during the 1980s.
Each patient record in the dataset includes the following 14 attributes commonly used in research:
- Age: Age of the patient in years.
- Sex: Sex of the patient (1 = male; 0 = female).
- Chest Pain Type (cp): Categorical variable indicating the type of chest pain experienced, with values ranging from 0 to 3 in zero-based encodings (1 to 4 in the original data files).
- Resting Blood Pressure (trestbps): Resting blood pressure in mm Hg upon hospital admission.
- Serum Cholesterol (chol): Serum cholesterol level in mg/dl.
- Fasting Blood Sugar (fbs): Binary variable indicating if fasting blood sugar is greater than 120 mg/dl (1 = true; 0 = false).
- Resting Electrocardiographic Results (restecg): Categorical variable with values 0 to 2 indicating ECG results.
- Maximum Heart Rate Achieved (thalach): Maximum heart rate achieved during exercise.
- Exercise-Induced Angina (exang): Binary variable indicating if exercise-induced angina occurred (1 = yes; 0 = no).
- ST Depression (oldpeak): ST depression induced by exercise relative to rest.
- Slope of the Peak Exercise ST Segment (slope): Categorical variable with values 0 to 2 in zero-based encodings (1 to 3 in the original data files).
- Number of Major Vessels Colored by Fluoroscopy (ca): Integer value ranging from 0 to 3.
- Thal (thal): Categorical variable for the thallium stress test result (3 = normal; 6 = fixed defect; 7 = reversible defect). Some sources label this attribute "thalassemia".
- Diagnosis of Heart Disease (target): Integer value ranging from 0 to 4, where 0 indicates absence of heart disease and 1 to 4 indicate its presence, with the value reflecting severity.
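As a rough illustration of how the 14 attributes above might be read and the 0 to 4 diagnosis collapsed into a presence/absence label, here is a minimal Python sketch. It rests on several assumptions not stated in this record: the rows are headerless and comma-separated, the columns appear in the order listed above, and a "?" field marks a missing value. The names COLUMNS, parse_records, and has_disease are hypothetical, and the sample rows are synthetic placeholders, not real patient records.

```python
import csv
import io

# Assumption: columns appear in the order of the attribute list above.
COLUMNS = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal", "target",
]


def parse_records(text):
    """Parse headerless comma-separated rows into dicts of floats.

    Assumption: a "?" field marks a missing value; it is stored as None.
    """
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        values = [None if field.strip() == "?" else float(field) for field in row]
        records.append(dict(zip(COLUMNS, values)))
    return records


def has_disease(record):
    """Collapse the 0-4 target into a binary label (0 = absent, 1-4 = present).

    Assumes the target field itself is not missing.
    """
    return record["target"] > 0


# Synthetic rows for illustration only; these are not real patient records.
SAMPLE = """\
54,1,2,130,240,0,1,150,0,1.2,1,0,3,0
61,0,3,140,260,1,0,120,1,2.5,2,?,7,2
"""

records = parse_records(SAMPLE)
labels = [has_disease(r) for r in records]
print(labels)
```

Storing missing fields as None rather than a sentinel number keeps them from being mistaken for valid category codes such as 0. Whether to drop or impute such rows is left to the analysis.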
Variables:
Details: