WEEK 6- Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using a standard Heart Disease Data Set.
Theory
A Bayesian network is a directed acyclic graph in which each edge
corresponds to a conditional dependency, and each node corresponds to a unique
random variable.
A Bayesian network consists of two major parts: a directed acyclic graph and a set of conditional probability distributions.
· The directed acyclic graph is a set of random variables represented by nodes.
· The conditional probability distribution of a node (random variable) is defined for every possible outcome of the preceding causal node(s).
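Taken together, the two parts define a joint distribution over all the variables. As a short worked statement of this standard property (the symbols X_i for the random variables and Pa(X_i) for the parent nodes of X_i are notation introduced here, not taken from the text above):

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```

A node with no parents contributes its plain prior P(X_i) to the product.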
For illustration, consider the following example. Suppose we attempt to turn on our computer, but the computer does not start (observation/evidence). We would like to know which of the possible causes of computer failure is more likely. In this simplified illustration, we assume only two possible causes of this misfortune: electricity failure and computer malfunction.
The corresponding directed acyclic graph is depicted in the figure below.
The goal is to calculate the posterior conditional probability distribution of each of the possible unobserved causes given the observed evidence, i.e. P [Cause | Evidence].
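As a rough sketch of this calculation, the posterior of each cause can be obtained by enumerating the joint distribution and normalising by the probability of the evidence. The structure assumed here (both causes pointing at the single observation "computer does not start") follows the description above, but every probability value below is a made-up illustrative number, not something given in the text.

```python
from itertools import product

# Hypothetical priors, chosen only for illustration.
P_ELECTRICITY_FAILURE = 0.1
P_COMPUTER_MALFUNCTION = 0.2

# Hypothetical CPT: P(computer does not start | electricity failure, malfunction)
P_NO_START = {
    (False, False): 0.0,
    (False, True): 0.5,
    (True, False): 1.0,
    (True, True): 1.0,
}


def joint(elec, malf):
    """P(elec, malf, computer does not start) via the network factorisation."""
    p_e = P_ELECTRICITY_FAILURE if elec else 1 - P_ELECTRICITY_FAILURE
    p_m = P_COMPUTER_MALFUNCTION if malf else 1 - P_COMPUTER_MALFUNCTION
    return p_e * p_m * P_NO_START[(elec, malf)]


def posteriors():
    """Return (P(electricity failure | evidence), P(malfunction | evidence))."""
    # Probability of the evidence: sum the joint over all cause combinations.
    evidence = sum(joint(e, m) for e, m in product([False, True], repeat=2))
    p_elec = sum(joint(True, m) for m in (False, True)) / evidence
    p_malf = sum(joint(e, True) for e in (False, True)) / evidence
    return p_elec, p_malf


p_elec, p_malf = posteriors()
print(f"P(electricity failure | no start)  = {p_elec:.3f}")  # 0.526
print(f"P(computer malfunction | no start) = {p_malf:.3f}")  # 0.579
```

With these assumed numbers the computer malfunction comes out as slightly more likely than the electricity failure; different priors or a different conditional table would change that ranking.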
Data Set:
Title: Heart Disease Databases
The Cleveland database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to this date. The "Heartdisease" field refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4.
Class distribution (number of instances per Heartdisease value):
Database | 0 | 1 | 2 | 3 | 4 | Total
Cleveland | 164 | 55 | 36 | 35 | 13 | 303
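These counts can be turned into empirical class priors, which is the kind of marginal a Bayesian network would hold for a node without parents. The snippet below is a small sketch using only the Cleveland counts listed above; the variable names are chosen here for illustration.

```python
# Cleveland class counts, keyed by Heartdisease value (0 = no presence).
CLEVELAND_COUNTS = {0: 164, 1: 55, 2: 36, 3: 35, 4: 13}

total = sum(CLEVELAND_COUNTS.values())
# Empirical prior P(Heartdisease = level) as a relative frequency.
priors = {level: count / total for level, count in CLEVELAND_COUNTS.items()}
# Probability of any degree of heart disease (levels 1 to 4).
p_any_disease = 1 - priors[0]

print(total)                    # 303
print(round(priors[0], 3))      # 0.541
print(round(p_any_disease, 3))  # 0.459
```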
Attribute Information:
1. age: age in years
2. sex: sex (1 = male; 0 = female)
3. cp: chest pain type
· Value 1: typical angina
· Value 2: atypical angina
· Value 3: non-anginal pain
· Value 4: asymptomatic
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital)
5. chol: serum cholesterol in mg/dl
6. fbs: (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
7. restecg: resting electrocardiographic results
· Value 0: normal
· Value 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV)
· Value 2: showing probable or definite left ventricular hypertrophy by Estes' criteria
8. thalach: maximum heart rate achieved
9. exang: exercise induced angina (1 = yes; 0 = no)
10. oldpeak: ST depression induced by exercise relative to rest
11. slope: the slope of the peak exercise ST segment
· Value 1: upsloping
· Value 2: flat
· Value 3: downsloping
12. ca: number of major vessels (0-3) colored by fluoroscopy
13. thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
14. Heartdisease: diagnosis of heart disease (angiographic disease status). It is integer valued from 0 (no presence) to 4.
Some instances from the dataset:
age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | Heartdisease
63 | 1 | 1 | 145 | 233 | 1 | 2 | 150 | 0 | 2.3 | 3 | 0 | 6 | 0
67 | 1 | 4 | 160 | 286 | 0 | 2 | 108 | 1 | 1.5 | 2 | 3 | 3 | 2
67 | 1 | 4 | 120 | 229 | 0 | 2 | 129 | 1 | 2.6 | 2 | 2 | 7 | 1
41 | 0 | 2 | 130 | 204 | 0 | 2 | 172 | 0 | 1.4 | 1 | 0 | 3 | 0
62 | 0 | 4 | 140 | 268 | 0 | 2 | 160 | 0 | 3.6 | 3 | 2 | 3 | 3
60 | 1 | 4 | 130 | 206 | 0 | 2 | 132 | 1 | 2.4 | 2 | 2 | 7 | 4
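To show how a conditional probability table for such a network could be filled in from data, here is a minimal sketch that estimates P(heart disease | exang) by counting over the six sample rows above. Two things are assumptions made for this illustration rather than part of the dataset description: the edge exang -> Heartdisease, and collapsing Heartdisease to "present" (value > 0) versus "absent" (value 0). The function name `estimate_cpt` is likewise hypothetical. Six rows are far too few for a meaningful estimate; the full 303-row Cleveland data would be used in practice.

```python
from collections import Counter

COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "Heartdisease"]

# The six sample instances listed above.
ROWS = [
    (63, 1, 1, 145, 233, 1, 2, 150, 0, 2.3, 3, 0, 6, 0),
    (67, 1, 4, 160, 286, 0, 2, 108, 1, 1.5, 2, 3, 3, 2),
    (67, 1, 4, 120, 229, 0, 2, 129, 1, 2.6, 2, 2, 7, 1),
    (41, 0, 2, 130, 204, 0, 2, 172, 0, 1.4, 1, 0, 3, 0),
    (62, 0, 4, 140, 268, 0, 2, 160, 0, 3.6, 3, 2, 3, 3),
    (60, 1, 4, 130, 206, 0, 2, 132, 1, 2.4, 2, 2, 7, 4),
]


def estimate_cpt(rows, parent, child="Heartdisease"):
    """Maximum-likelihood estimate of P(child > 0 | parent) by counting."""
    p_idx, c_idx = COLUMNS.index(parent), COLUMNS.index(child)
    totals, positives = Counter(), Counter()
    for row in rows:
        totals[row[p_idx]] += 1
        if row[c_idx] > 0:  # assumption: any value above 0 counts as disease
            positives[row[p_idx]] += 1
    return {value: positives[value] / totals[value] for value in totals}


cpt = estimate_cpt(ROWS, "exang")
for value in sorted(cpt):
    print(f"P(heart disease | exang={value}) = {cpt[value]:.3f}")
```

On these six rows, all three patients with exercise induced angina have some degree of heart disease, against one of the three without it.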