Description
Welcome to Survival Analysis in R for Public Health!
The three earlier courses in this series covered statistical thinking, correlation, linear regression and logistic regression. This one will show you how to run survival, or “time to event”, analysis, explaining what’s meant by familiar-sounding but deceptive terms like hazard and censoring, which have specific meanings in this context. Using the popular and completely free software R, you’ll learn how to take a data set from scratch, import it into R, run essential descriptive analyses to get to know the data’s features and quirks, and progress from Kaplan-Meier plots through to multiple Cox regression. You’ll use data simulated from real, messy patient-level data for patients admitted to hospital with heart failure and learn how to explore which factors predict their subsequent mortality. You’ll learn how to test model assumptions and fit to the data and some simple tricks to get round common problems that real public health data have. There will be mini-quizzes on the videos and the R exercises with feedback along the way to check your understanding.
Prerequisites
Some formulae are given to aid understanding, but this is not one of those courses where you need a mathematics degree to follow it. You will need basic numeracy (for example, we will not use calculus) and familiarity with graphical and tabular ways of presenting results. The three previous courses in the series explained concepts such as hypothesis testing, p values, confidence intervals, correlation and regression and showed how to install R and run basic commands. In this course, we will recap all these core ideas in brief, but if you are unfamiliar with them, then you may prefer to take the first course in particular, Statistical Thinking in Public Health, and perhaps also the second, on linear regression, before embarking on this one.
Syllabus
- The Kaplan-Meier Plot
- What is survival analysis? You’ll see what it is, when to use it and how to run and interpret the most common descriptive survival analysis method, the Kaplan-Meier plot and its associated log-rank test for comparing the survival of two or more patient groups, e.g. those on different treatments. You’ll learn about the key concept of censoring.
- The Cox Model
- This week you’ll get to know the most commonly used survival analysis method for incorporating not just one but multiple predictors of survival: Cox proportional hazards regression modelling. You’ll learn about the key concepts of hazards and the risk set. From now until the end of this course, there’ll be plenty of chance to run Cox models on data simulated from real patient-level records for people admitted to hospital with heart failure. You’ll see why missing data and categorical variables can cause problems in regression models such as Cox.
- The Multiple Cox Model
- You’ll extend the simple Cox model to the multiple Cox model. As preparation, you’ll run the essential descriptive statistics on your main variables. Then you’ll see what can happen with real-life public health data and learn some simple tricks to fix the problem.
- The Proportionality Assumption
- In this final part of the course, you’ll learn how to assess the fit of the model and test the validity of the main assumptions involved in Cox regression such as proportional hazards. This will cover three types of residuals. Lastly, you’ll get to practise fitting a multiple Cox regression model and will have to decide which predictors to include and which to drop, a ubiquitous challenge for people fitting any type of regression model.
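To make the Kaplan-Meier idea and the role of censoring in the syllabus more concrete, here is a minimal sketch of the estimator's arithmetic. It is not taken from the course, which works in R; it is written in plain Python with no libraries purely as an illustration. The function name `kaplan_meier` and the six follow-up times are hypothetical, invented for this example. It assumes the usual convention that a patient censored at time t still counts as at risk for any event occurring at t.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))  # process patients in time order
    at_risk = len(data)                # everyone is at risk at the start
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients drop out of the risk set
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in days; event = 0 means censored
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"day {t}: S(t) = {s:.3f}")
```

Censored patients never lower the curve themselves. They only shrink the risk set, which makes each later event count for a larger drop.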
Survival Analysis in R for Public Health
- Type: Online Courses
- Provider: Coursera