401077 Introduction to Biostatistics
Assignment 1 supplementary – Due Sunday December 6
Please answer each question in the spaces provided below and return the assignment to Paul Fahey ([email protected]) on or before Sunday December 6 for marking. The marks allocated to each question are shown in the assignment. A total of 20 marks are available and this assignment is worth 30% of your overall grade.
When submitting your assignment you are implicitly ticking these statements:
• I retain a backup file of this assignment in case the original file is lost or damaged.
• I hereby certify that no part of this assignment or product has been copied from any other student’s work or from any other source except where due acknowledgement is made in the assignment.
• I hereby certify that no part of this assignment or product has been submitted by me in another (previous or current) assessment.
• I hereby certify that no part of the assignment has been written or produced by any other person.
• I hereby certify that no part of this assignment has been made available to any other student.
• I am aware that this work will be reproduced and submitted to plagiarism detection software for the purpose of detecting possible plagiarism. This software may retain a copy of this assignment on its database for future plagiarism detection.
• I understand that failure to uphold this declaration may result in academic proceedings in line with the UWS Student Academic Misconduct Policy.
Your name:
Your student number:
Question 1: (2 marks)
Which of the following are dichotomous variables?
a) Study identification number
b) Number of people in the study
c) Age in years
d) Whether right-handed or left-handed
e) Number of brothers and sisters
Question 2: (3 marks)
a) Using the ‘fham.p1.RData’ data set introduced in tutorial 3 and R Commander, create a table showing the frequency distribution of highest education attained (educ_f). (1 mark)
b) Create a graph of the frequency distribution of highest education attained (educ_f). (1 mark)
c) What proportion of people in the ‘fham.p1.RData’ data set have at least some college education (this includes a college degree)? (1 mark)
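As a general illustration of the idea behind a frequency table and a "category or above" proportion, here is a minimal sketch in Python. The category labels and values are hypothetical and are not taken from the ‘fham.p1.RData’ data set; the assignment itself should still be completed in R Commander.

```python
from collections import Counter

# Hypothetical ordered categories, for illustration only
# (not values from the assignment data set).
levels = ["low", "medium", "high", "medium", "low",
          "high", "high", "medium", "low", "low"]

# Frequency distribution: count of observations in each category.
freq = Counter(levels)
n = len(levels)

# Relative frequency (proportion) of each category.
proportions = {level: count / n for level, count in freq.items()}

# Proportion in "medium" or above = sum of the counts in the
# qualifying categories divided by the total number of observations.
p_at_least_medium = (freq["medium"] + freq["high"]) / n

print(dict(freq))
print(proportions)
print(p_at_least_medium)
```

The same pattern applies to any ordered categorical variable: add the counts of every category that meets the condition, then divide by the total.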
Question 3: (4 marks)
This chart shows the glucose measurements recorded in the ‘fham.p1.RData’ data set introduced in tutorial 3.
a) Identify the type of chart. (1 mark)
b) What, if anything, can be said about the centre, spread and shape of this distribution? (3 marks)
Question 4: (3 marks)
Use the ‘fham.p1.RData’ data set introduced in tutorial 3 and R Commander to calculate descriptive statistics for body mass index by sex. In one or two paragraphs, describe the centre, spread and shape of the distribution of body mass index in females in this sample. This description should incorporate the relevant results from the R Commander output (but not the output itself).
Question 5: (4 marks)
a) Using the ‘fham.p1.RData’ data set introduced in tutorial 3 and R Commander, create a table showing the relationship between angina and blood pressure medications (bpmeds). (1 mark)
b) If you were to select one person at random from this data set, what is the probability they would be using blood pressure medications? Show any working. (1 mark)
c) If you selected one person using blood pressure medications at random from this data set, what is the probability they would have angina? Show any working. (1 mark)
d) Are angina and blood pressure medication independent? Explain why or why not. (1 mark)
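The kind of working asked for in this question (a marginal probability, a conditional probability and an informal independence check from a two-way table) can be sketched as follows. The counts and the labels "exposed" and "outcome" are hypothetical; they are not the angina and bpmeds counts from the data set.

```python
# Hypothetical 2x2 table of counts, for illustration only.
# Keys are (row category, column category).
counts = {
    ("exposed", "outcome"): 30,
    ("exposed", "no outcome"): 70,
    ("unexposed", "outcome"): 60,
    ("unexposed", "no outcome"): 340,
}

total = sum(counts.values())

# Row and column totals.
n_exposed = counts[("exposed", "outcome")] + counts[("exposed", "no outcome")]
n_outcome = counts[("exposed", "outcome")] + counts[("unexposed", "outcome")]

# Marginal probabilities: a row or column total over the grand total.
p_exposed = n_exposed / total
p_outcome = n_outcome / total

# Conditional probability: restrict the denominator to the exposed row.
p_outcome_given_exposed = counts[("exposed", "outcome")] / n_exposed

# Two events are independent when P(outcome | exposed) equals P(outcome).
# In sample data the two are compared descriptively.
print(p_exposed, p_outcome, p_outcome_given_exposed)
```

In this made-up table the conditional probability (0.30) differs from the marginal probability (0.18), so the two variables would not be described as independent in the sample.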
Question 6: (4 marks)
The height of Australian adult males is Normally distributed with a mean of around 176 cm and a standard deviation of 7 cm.
a) Selecting one Australian adult male at random, what is the probability that they are between 175 cm and 180 cm tall? Show any working. (2 marks)
b) Estimate the height of the tallest 5% of adult Australian males. Show any working. (1 mark)
c) Suppose six samples of 50 adult Australian men were measured: one each from Sydney, Melbourne, Brisbane, Adelaide, Perth and Hobart. Estimate the standard error of these 6 means. (1 mark)
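The three calculations in this question (a probability between two values, an upper percentile, and a standard error of the mean) follow a common pattern, sketched here in Python using the standard library. The mean, standard deviation, limits and sample size are hypothetical and deliberately differ from the figures given in the question.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values for illustration only (not the figures in Question 6).
mu, sigma = 100.0, 15.0
dist = NormalDist(mu, sigma)

# P(lower < X < upper) is the difference of two cumulative probabilities.
lower, upper = 95.0, 110.0
p_between = dist.cdf(upper) - dist.cdf(lower)

# The value that cuts off the top 5% is the 95th percentile.
cutoff_top5 = dist.inv_cdf(0.95)

# Standard error of a sample mean: sigma divided by the square root of n,
# where n is the size of each sample (not the number of samples).
n = 25
se_mean = sigma / sqrt(n)

print(round(p_between, 4), round(cutoff_top5, 2), se_mean)
```

By hand, the same working standardises each value with z = (x − mean) / standard deviation and reads the probabilities from a standard Normal table.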