PUBH 6033/8033
PUBH 6033—Week 10 Assignment 2
Identifying Risks and Hazards—Part 2
(Rubric included)
Instructions
For this Assignment, review this week’s Learning Resources. You will again use the SPSS output generated in this week’s Assignment 1. This output is based on the asbestos.sav dataset, which relates to the incidence of lung cancer among those exposed to asbestos and those not exposed. Refer back to that output and then provide your response to all items in this worksheet.
Submit this Application Assignment by Day 7.
———————————————————————————————————————
Note: Scores are to be entered by instructor
1. From the SPSS output generated in this week’s Assignment 1, copy only your odds ratio analysis (“Risk Estimate”) portion of the output and paste it below:
Answer: ____ / 10 points
2. The p-value associated with a chi-square test only suggests whether or not the results are statistically significant. Why is it important to also look at the odds ratio?
Answer: ____ / 10 points
3. What does the odds ratio value in the SPSS output tell you, specifically, about lung cancer and exposure to asbestos?
Answer: ____ / 10 points
4. Based on your answer above, would you say there is a strong association between asbestos exposure and lung cancer?
Answer: ____ / 10 points
5. From the SPSS output generated in this week’s Assignment 1, copy only your “asbestos * lung cancer Crosstabulation” portion of the output and paste it below:
Answer: ____ / 10 points
6. Using the formula provided in this week’s Learning Resources, and the data in the crosstabulation output, calculate the odds ratio and show your work in your answer, below.
Answer: ____ / 10 points
Section below (scoring) is to be completed by Instructor
Scoring Rubric for Week 10 Assignment 2
Identifying Risks and Hazards—Part 2(60 points)
Timeliness  Fully Met  Partially Met/Good  Partially Met/Fair  Not Met 
Timeliness Indicators  Posted by the deadline  Posted 1 day late  Posted 2–4 days late  Posted 5 or more days late 
Grade impact  No impact  10% reduction in overall assignment score  20% reduction in overall assignment score  0 points 
Initial Score (60 possible points):
Timeliness Factor (late points deducted):
Total Score (60 possible points):
Instructor comments:
Week 10: Step-by-Step Guide for calculating odds ratios and risk ratios
This Step-by-Step Guide demonstrates how to calculate odds ratios, risk ratios, cumulative incidence, incidence density, and prevalence.
Odds Ratio (OR):
The most common use of probability or odds in Public Health is the odds ratio. This is calculated from a simple 2 x 2 table set up as follows:
Exposure to variable of interest  Existence of Disease designated as case (yes disease) or control (no)  
Case  Control  
Yes exposed  a  b  All Exposed (a + b) 
No not exposed  c  d  All Not exposed (c + d) 
Total  All Cases (a + c)  All Controls (b + d)  Total sample 
We can take this table and fill in the values from page 29 in our text:
Exposure to Tobacco Smoke  Existence of Cancer case (yes disease) or control (no)  
Case  Control  
Yes exposed (smoker)  40  29  All Exposed (69) 
No not exposed  10  21  All Not exposed (31) 
Total  All Cases (50)  All Controls (50)  Total sample 
You can observe several things immediately about this 2 x 2 table. It is divided into an equal number of cases and controls and so is based on a case-control study design (you will learn more about this in Epidemiology).
The odds ratio estimates relative risk when you only have a sample to work with, as in a case-control study. The formula for the odds ratio is the odds of disease in the exposed divided by the odds of disease in the nonexposed. Using the letters from the table this is:
(a/b) / (c/d), or with the numbers: (40/29)/(10/21) = 1.379/0.476 = 2.897
A common shortcut to this calculation is multiplying (a x d) and then dividing this by (b x c): (40 x 21)/(29 x 10) = 840/290 = 2.897.
The interpretation is that the odds of cancer among smokers in this sample are about 2.9 times the odds among nonsmokers.
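The odds ratio arithmetic above can be checked with a short script. This is only a sketch in Python, which is not the course's SPSS workflow. The function names are made up for illustration, and the cell values are the ones from the tobacco smoke table.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio as (odds of disease in exposed) / (odds in nonexposed)."""
    return (a / b) / (c / d)


def odds_ratio_cross_product(a, b, c, d):
    """Same odds ratio using the cross-product shortcut (a x d) / (b x c)."""
    return (a * d) / (b * c)


# Cells of the 2 x 2 table: a, b = exposed cases/controls; c, d = nonexposed.
a, b, c, d = 40, 29, 10, 21

or_direct = odds_ratio(a, b, c, d)
or_shortcut = odds_ratio_cross_product(a, b, c, d)

print(round(or_direct, 3))    # 2.897
print(round(or_shortcut, 3))  # 2.897
```

Both forms give the same value because (a/b)/(c/d) rearranges algebraically to (a x d)/(b x c).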
Risk Ratio or Relative Risk (RR):
Risk ratios also use 2 x 2 tables but they are mostly based on prospective studies and so the cases and controls are not evenly divided. They provide a more accurate assessment of the relative risks of disease based on exposure. The 2 x 2 table is set up as above but the calculation for RR is different from that for OR.
The formula for relative risk is based on a comparison of the risk to the exposed to the risk to the unexposed. The risk to the exposed is Number exposed who are ill/Number exposed. The risk to the unexposed is Number unexposed who are ill/Number unexposed.
So RR = Risk in exposed/Risk in unexposed. Using the 2 x 2 table this is [a/(a + b)]/[c/(c + d)]. When we use the numbers from our table it is: (40/69)/(10/31) = 0.580/0.323 = 1.80
Note that this is a much lower estimate than the odds ratio (2.9), and that it assumes all the people in the prospective study were observed for the same length of time.
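The relative risk calculation can be sketched the same way. The Python below is illustrative only. The function name is an assumption, and the cell values are reused from the tobacco smoke table purely to reproduce the arithmetic.

```python
def relative_risk(a, b, c, d):
    """Relative risk: [a / (a + b)] / [c / (c + d)]."""
    risk_exposed = a / (a + b)      # risk among the exposed
    risk_unexposed = c / (c + d)    # risk among the unexposed
    return risk_exposed / risk_unexposed


# Same 2 x 2 cells as the odds ratio example.
rr = relative_risk(40, 29, 10, 21)
print(round(rr, 2))  # 1.8
```

The only difference from the odds ratio is the denominator: each risk divides by everyone in the exposure group (a + b or c + d) rather than by the controls alone.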
Here is a quick comparison of Odds Ratio and Relative Risk.
Odds Ratio (OR)  Relative Risk (RR) 
Case control studies  Cohort studies 
Focus is on the exposure of interest  Focus is on disease occurrence 
Best when disease is rare and exposure is more common  Best when exposure of interest is rare and disease more common 
Using 2×2 table: (a x d)/(b x c)  Using 2×2 table: [a/(a + b)]/[c/(c + d)] 
Cumulative Incidence (CI), Incidence Density (ID), and Prevalence (P):
There is often confusion over these so you need to use care with the definitions.
Incidence versus Prevalence:
To understand the difference between incidence and prevalence you need to distinguish a new case from an existing one. When you consider the prevalence of something you are taking a snapshot of the current situation. You are not looking at when the disease occurred or how long it has been present. You are simply dividing the number of people with a given disease or characteristic by the total number of people you are observing at that time.
Prevalence = number with trait / total number
For example, if you wanted to know the prevalence of brown eyes in a room full of people you would count the number of people with brown eyes and divide this number by the number of people in the room. With incidence you are only looking for new instances of that characteristic. It is unlikely that anyone will develop brown eyes while you are observing the group so for this example we will use diabetes. Whereas prevalence is the number with diabetes compared to the total at any given time, the incidence is the number who develop diabetes compared to the total while under observation. This introduces the element of time.
Cumulative Incidence and Incidence Density (aka Density Incidence):
The difference between these is how the time is handled. In cumulative incidence a group of people is observed during a set period of time and the number of people who develop a characteristic is divided by the number of people available to develop that characteristic. Cumulative Incidence = number of new cases of a disease or trait/ total number of people at risk for the disease or trait. The denominator excludes people who are not at risk for developing the trait such as those who already have it or who are immune. The time period for observation is most often a year. This is often expressed as the future risk of developing a disease. An example is the prediction that there will be 200 new cases of diabetes diagnosed in a given town in 2011. This is based on the actual number of new cases observed in 2010 adjusted for changes in the size of the town’s population since then.
In Incidence Density, the individuals in the groups are not all observed for the same length of time and so the numerator remains the number of new cases but the denominator includes not just the number of people at risk but the time they have been observed.
Incidence Density = number of new cases of a disease or trait / (total number of people at risk for the disease or trait multiplied by the amount of time they were each observed). The denominator is expressed as person-time and is often measured in years, but each individual contributes a different amount. This is often used when looking at people born at different times. Consider a study conducted in 2000 that looks at the medical history of 100 people since their birth. In each case, they either developed the disease (cancer) over their lifetime or they did not. Each individual in the study was born in a different year, so each individual contributes a different amount to the denominator. You are probably very familiar with life insurance mortality tables, which use this type of calculation to determine the risk of death (lifespan) to an individual assuming they have already reached a certain age.
Some sample calculations of Prevalence, Cumulative Incidence, and Incidence Density:
You have the following data:
There is a group of people you have been observing for 2 years to see who develops diabetes. The group started with 200 people, but in the 2 years, several of them have moved away, died of another disease, or been diagnosed with diabetes. At the end of the study there were 160 people left in your group. In Year 1, 10 of the people in the group developed diabetes, while in Year 2, 12 people developed diabetes. Assume that there were no cases of diabetes in the group at the beginning of the 2 years of observation and that all people were at risk of developing the disease.
To calculate the prevalence of diabetes at the end of Year 1, you would divide 10 by 200.
P_{(end of Year 1)} = 10/200 = 5%, assuming there were no losses from the group in that first year
At the end of Year 2, the prevalence would be (10 (from Year 1) + 12 (from Year 2))/160
P_{(end of Year 2)} = 22/160 = 13.75%
To calculate the Cumulative Incidence in Year 1, you would again look at the cases from the first year. You will note that the new cases are equal to the total cases after the first year of observation, as there were no cases when you started the observation.
CI_{(end of Year 1)} = 10 new cases/200 total at risk in Year 1 = 5%. So in this situation CI and P are the same.
In Year 2, however, you are only looking at new cases divided by the number at risk at the start of Year 2. Since 10 people were diagnosed by the start of Year 2, the total number still at risk of developing the disease is 200 – 10 = 190.
CI_{(end of Year 2)} = 12 new cases/190 total at risk in Year 2 = 6.3%. Note that the CI does not consider changes in the number of people at risk due to dropouts but does consider the reduced number due to those already diagnosed.
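The prevalence and cumulative incidence figures above can be reproduced with a few lines of Python. This is a sketch only. The variable names are made up for illustration, and the counts are the ones given in the diabetes example.

```python
# Counts from the diabetes example (variable names are illustrative).
initial_group = 200        # people at risk at the start of Year 1
new_cases_year1 = 10
new_cases_year2 = 12
remaining_end_year2 = 160  # people still in the group at the end of the study

# Prevalence: everyone with the disease / everyone observed at that time.
prevalence_year1 = new_cases_year1 / initial_group
prevalence_year2 = (new_cases_year1 + new_cases_year2) / remaining_end_year2

# Cumulative incidence: new cases / people at risk at the start of the period.
ci_year1 = new_cases_year1 / initial_group
at_risk_start_year2 = initial_group - new_cases_year1  # 190
ci_year2 = new_cases_year2 / at_risk_start_year2

print(f"{prevalence_year1:.2%}, {prevalence_year2:.2%}")  # 5.00%, 13.75%
print(f"{ci_year1:.1%}, {ci_year2:.1%}")                  # 5.0%, 6.3%
```

The numerator for prevalence keeps growing with every existing case, while the numerator for cumulative incidence resets to the new cases in each period.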
The most challenging, but also the most accurate, calculation is the Incidence Density. For this you need to add up all of the person-years contributing to the calculation of risk. In Year 1, the only loss to follow-up was the 10 people diagnosed, so 190 people each contributed 1 person-year for a total of 190. The 10 people diagnosed were diagnosed at different times during the year, so each contributes a fraction of a year to the total. The easiest way to do this is with a table:
Number diagnosed  When diagnosed  Multiplier based on time of diagnosis  Total Person-Years 
4  1st quarter  .25  1 
2  2nd quarter  .50  1 
4  3rd quarter  .75  3 
10  5 
This adds 5 additional person-years to the denominator for Year 1, so
ID = 10 new cases/195 person-years at risk = 5.1% (about 5.1 new cases per 100 person-years)
During the second year, 12 additional people were diagnosed and a total of 40 were lost to either diagnosis or follow-up, so you had 150 people left to contribute one person-year each and 40 people to contribute a partial year. Again, you can use a table to calculate their contribution to the total:
Number diagnosed or lost to follow-up  When diagnosed or lost  Multiplier based on time of diagnosis or loss  Total Person-Years 
20  1st quarter  .25  5 
8  2nd quarter  .50  4 
12  3rd quarter  .75  9 
40  18 
Number left contributing 1 person-year = 150
The total person-years at risk for this calculation is 150 + 18 = 168.
So ID = 12 new cases/168 person-years at risk = 7.1% (about 7.1 new cases per 100 person-years)
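The person-year bookkeeping in the two tables above can be sketched in Python as follows. The helper name and the list-of-pairs layout are assumptions made for illustration. The counts and multipliers are copied from the tables.

```python
def partial_person_years(rows):
    """Sum person-years from (number of people, fraction of year) pairs."""
    return sum(count * fraction for count, fraction in rows)


# (number diagnosed or lost, multiplier) rows taken from the two tables.
year1_rows = [(4, 0.25), (2, 0.50), (4, 0.75)]
year2_rows = [(20, 0.25), (8, 0.50), (12, 0.75)]

# Full-year contributors plus partial-year contributors.
person_years_year1 = 190 + partial_person_years(year1_rows)  # 195
person_years_year2 = 150 + partial_person_years(year2_rows)  # 168

id_year1 = 10 / person_years_year1
id_year2 = 12 / person_years_year2

print(person_years_year1, person_years_year2)  # 195.0 168.0
print(f"{id_year1:.1%}, {id_year2:.1%}")       # 5.1%, 7.1%
```

Keeping the partial-year rows as data makes it easy to see how a different quarter of diagnosis or loss would change the denominator.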
To sum it up:
Year 1:
Prevalence = 5%
Cumulative Incidence = 5%
Incidence Density = 5.1%
Year 2:
Prevalence = 13.75%
Cumulative Incidence = 6.3%
Incidence Density = 7.1%