🚧 This section is being actively worked on. 🚧
The prerequisite material for these exercises was covered in the session and is therefore not repeated here. The exercises build on this material.
The learning objectives for this session are:
You have been asked to prepare a quick summary of a dataset of 303 chest pain patients to support a potential clinical trial collaboration. “Nothing complicated—just mean and standard deviation for a few variables.” You open RStudio…
Exercise 1A: Import the Cleveland Heart Disease dataset into R.
View the dataset sleep_data on the website, then import it into your RStudio session. How many variables does it have?
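If you are unsure where to start, the import step can be sketched as follows. This is only an illustration under stated assumptions: it assumes the file is comma-separated, and it writes a tiny made-up file (the name `toy_data` and all values are hypothetical) so that the example runs on its own. In practice you would pass the path of the file you downloaded to `read.csv()`.

```r
# Toy stand-in for the downloaded file: values are made up for illustration only.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Age,Sex,Cholesterol",
             "63,1,233",
             "67,1,286",
             "41,0,204"), tmp)

# read.csv() imports a comma-separated file as a data frame.
toy_data <- read.csv(tmp)

ncol(toy_data)   # number of variables (columns)
nrow(toy_data)   # number of observations (rows)
str(toy_data)    # name and type of each variable
```

The same three calls work unchanged on the real dataset once it has been imported.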
Exercise 2A: View the dataset.
Why does this distinction matter when calculating a mean?
Exercise 3A:
What happens when you take the mean of a character string? e.g., mean("hi")? Why do you think this happens?
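A short sketch you can run to check your answer. The vector `ages_text` is a made-up example, not taken from the dataset. `suppressWarnings()` is used here only to keep the output tidy; without it R also prints a warning that is worth reading.

```r
# mean() cannot do arithmetic on text, so it gives NA (plus a warning).
result <- suppressWarnings(mean("hi"))
result            # NA
class("hi")       # "character"

# Numbers stored as text behave the same way until they are converted.
ages_text <- c("63", "67", "41")     # made-up values stored as character
suppressWarnings(mean(ages_text))    # NA
mean(as.numeric(ages_text))          # 57
```

This is why it is worth checking the type of each variable right after import: a numeric column that was read in as text will silently give NA.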
Exercise 4A:
Extract one variable (e.g., Age or Cholesterol) and:
Exercise 1B:
Calculate the mean and standard deviation for:
Exercise 2B:
Check if there are missing values in the dataset.
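One possible way to approach this, shown on a small made-up data frame (the name `toy_data` and its values are hypothetical). It uses only base R functions: `anyNA()`, `is.na()`, `is.nan()` and the `na.rm` argument of `mean()` and `sd()`.

```r
# Made-up data with one missing value in each column.
toy_data <- data.frame(
  Age         = c(63, 67, NA, 41),
  Cholesterol = c(233, NA, 250, 204)
)

anyNA(toy_data)             # TRUE: at least one value is missing
colSums(is.na(toy_data))    # number of missing values per variable

# A single NA makes the mean NA unless it is removed explicitly.
mean(toy_data$Age)                  # NA
mean(toy_data$Age, na.rm = TRUE)    # 57
sd(toy_data$Age, na.rm = TRUE)      # 14

# is.na() flags both NA and NaN; is.nan() flags only NaN.
is.na(c(NA, NaN))     # TRUE TRUE
is.nan(c(NA, NaN))    # FALSE TRUE
```

Note that `na.rm = TRUE` drops the missing observations silently, so it is good practice to report how many values were missing alongside the summary.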
Exercise 3B:
List all the variable types that mean() can take the mean of. There should be five. Use the help inside RStudio (for example, ?mean).
Exercise 1C:
Consider what you calculated the mean and standard deviation of in Exercise 1B. How can you interpret these values? What can we conclude from them?
Exercise 2C:
A colleague suggested calculating the mean of Sex.
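To explore the suggestion, here is a minimal sketch on made-up values. It assumes that Sex is coded as 0 and 1 (with 1 = male, 0 = female); check the coding in your own copy of the data before relying on this. The vector `sex` and the factor `sex_factor` are hypothetical names.

```r
# Made-up values; assumed coding: 1 = male, 0 = female.
sex <- c(1, 1, 0, 1, 0)

# For a 0/1 variable the mean is the proportion of 1s.
mean(sex)                  # 0.6
prop.table(table(sex))     # the same information as a table of proportions

# If the variable is stored as a factor, mean() gives NA (plus a warning).
sex_factor <- factor(sex, levels = c(0, 1), labels = c("female", "male"))
suppressWarnings(mean(sex_factor))    # NA
```

So the number R returns is only meaningful as a proportion, and only under a 0/1 coding. With a different coding (for example 1 and 2) the mean has no direct interpretation.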
Exercise 3C:
Why is it important to report both mean and standard deviation, and not just the mean? Think in a clinical context.
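As a starting point for the discussion, the sketch below compares two made-up groups of cholesterol values (`group_a` and `group_b` are hypothetical and not taken from the dataset). They share the same mean but differ greatly in spread.

```r
# Two made-up groups with identical means.
group_a <- c(198, 200, 202)
group_b <- c(150, 200, 250)

mean(group_a)    # 200
mean(group_b)    # 200

# The standard deviation reveals how different the groups really are.
sd(group_a)      # 2
sd(group_b)      # 50
```

Reported by the mean alone, the two groups would look identical, even though one contains patients far from the average in both directions.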