-
Notifications
You must be signed in to change notification settings - Fork 9
/
Copy pathChapter3.Rmd
228 lines (145 loc) · 5.41 KB
/
Chapter3.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
# Ch. 3 Discrete Distributions {-}
Sections covered: 3.1 - 3.6
Skip: 3.6 "The Poisson Distribution as a Limit" (pp. 132-33) and "The Poisson Process" (pp. 134-135)
There is no direct way in R to calculate the mean and variance of a distribution given its pmf. However we can use R to make the calculations simpler.
## 3.3 Expected value {-}
```{r}
x <- 1:5
x
px <- c(.1, .15, .2, .25, .3)
px
x*px
sum(x*px) # E(X)
```
## 3.3 Variance {-}
```{r}
x - 3.5
(x - 3.5)^2
((x - 3.5)^2)*px
sum(((x - 3.5)^2)*px) # V(X)
```
## 3.3 Variance (alternative method) {-}
```{r}
x
px
x^2
(x^2)*px
sum((x^2)*px) # E(X^2)
14-3.5^2 # E(X^2) - [E(X)]^2
```
## 3.4 Binominal Theorem {-}
p. 121 Using binomial tables for the cumulative distribution function (cdf) is optional; you may use R (see below) or [www.stattrek.com](http://www.stattrek.com) instead.
For tests you will not need to calculate these values. You can leave your answers as summations, for example:
$\Sigma_{x=0}^6 \left(\begin{array}{c}12\\ x\end{array}\right)(.3^x)(.7^{12-x})$
### R {-}
Probability mass function (pmf)
$P(X = x)$
```{r}
choose(8, 3) # "8 choose 3"
56*.6^3*.4^5 # P(X = 3) given n = 8, p = .6
```
Direct method
```{r}
dbinom(3, 8, .6) # P(X = 3) given n = 8, p = .6
```
Cumulative distribution function (cdf)
$P(X \leq x)$
Find $P(2 \leq X \leq 4)$ given $p = .7, n = 15$
```{r}
pbinom(4, 15, .7) - pbinom(1, 15, .7)
```
(Using the pmf instead:)
```{r}
dbinom(2, 15, .7) + dbinom(3, 15, .7) + dbinom(4, 15, .7)
```
Graphing a binomial pmf:
ex. $p = .7, n = 10$
```{r}
px <- dbinom(0:10, 10, .7)
# adding las = 1 turns the y-axis tick mark labels horizontal, which are easier to read
barplot(px, names.arg = 0:10, las = 1, col = "lightblue")
```
## 3.5 Hypergeometric {-}
Note that the notation that R uses is different from the Devore textbook:
|parameter|Devore|R|
|---------|------|-|
|total successes|M|m|
|total failures|N-M|n|
|sample size|n|k|
|successes in sample|x|x|
Example (p. 127)
Devore: *h(x; n, M, N)*
*P(X = 2) = h(2; 10, 5, 25)* -->
```{r}
dhyper(x = 2, m = 5, n = 20, k = 10)
```
## 3.6 Poisson {-}
Example (p. 132)
p(3;2) =
```{r}
dpois(3,2)
```
F(3;2) =
```{r}
ppois(3, 2)
```
## Practice Exercises {-}
1. **(Binomial)** Suppose the probability of a car accident involving a single vehicle is .7. If 15 accidents are randomly selected, what is the probability that between 2 and 4, inclusive, involve a single vehicle? (from slides)
```{r}
dbinom(2, 15, .7) + dbinom(3, 15, .7) + dbinom(4, 15, .7)
```
or
```{r}
pbinom(4, 15, .7) - pbinom(1, 15, .7)
```
-----
2. **(Binomial)** A particular telephone number is used to receive both voice calls and fax messages. Suppose that 25% of the incoming calls involve fax messages, and consider a sample of 25 incoming calls. What is the probability that
a. At most 6 of the calls involve a fax message?
b. Exactly 6 of the calls involve a fax message?
c. At least 6 of the calls involve a fax message?
d. What is the expected number of calls among the 25 that involve a fax message?
e. What is the standard deviation of the number among the 25 calls that involve a fax message?
f. What is the probability that the number of calls among the 25 that involve a fax transmission exceeds the expected number by more than 2 standard deviations? (Textbook 50-51)
```{r}
pbinom(6, 25, .25)
dbinom(6, 25, .25)
1 - pbinom(5, 25, .25)
E <- 25*.25
Var <- 25*.25*.75
pbinom(floor(E+2*sqrt(Var)), 25, .25)
```
-----
3. **(Hypergeometric)** A geologist has collected 8 specimens of basaltic rock and 12 specimens of granite. The geologist instructs a laboratory assistant to randomly select 15 of the specimens for analysis.
a. What is the probability that all specimens of one of the two types of rock are selected for analysis?
b. What is the probability that the number of granite specimens selected for analysis is within 1 standard deviation of its mean value? (Textbook 71)
```{r}
dhyper(8, 8, 12, 15) + dhyper(3, 8, 12, 15)
mean_g <- 15*12/20
sd_g <- sqrt(20*(12/20)*(8/20))
pbinom(mean_g + sd_g, 15, 12/20)-pbinom(mean_g - sd_g, 15, 12/20)
```
-----
4. **(Negative Binomial)** The probability that a randomly selected box of a certain type of cereal has a particular prize is .2. Suppose you purchase box after box until you have obtained two of these prizes.
a. What is the probability that you purchase $x$ boxes that do not have the desired prize?
b. What is the probability that you purchase four boxes?
c. What is the probability that you purchase at most four boxes?
d. How many boxes without the desired prize do you expect to purchase? How many boxes do you expect to purchase? (Textbook 75, homework)
Let x be the number of boxes without prizes to be purchased, $p = {x+2-1\choose1}(.2^2)(.8 ^x)$
```{r}
dnbinom(2, 2, .2)
dnbinom(2, 2, .2) + dnbinom(1, 2, .2) + dnbinom(0, 2, .2)
2*.8/.2
# 8 boxes without desired prize, 10 boxes in total
```
-----
5. **(Poisson)** Suppose that the number of drivers who travel between a particular origin and destination during a designated time period has a Poisson distribution with parameter $\mu = 20$. What is the probability that the number of drivers will
a. Be at most 10?
b. Be 10?
c. Be within 2 standard deviations of the mean value?
(Textbook 81)
```{r}
ppois(10, 20) # a
dpois(10, 20) # b1
((exp(1)^-20)*(20^10))/factorial(10) # b2
ppois(20 + 2*sqrt(20),20) - ppois(20 - 2*sqrt(20), 20) # c
```