-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathfuti-find.qmd
86 lines (50 loc) · 1.96 KB
/
futi-find.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
# Futile Findings (TODO) {#futi-find}
Dirty Dozen p hacking https://psyarxiv.com/xy2dk/
Importance of Stupidity in Scientific Research
https://pubmed.ncbi.nlm.nih.gov/18492790/
Mere Description: https://www.cambridge.org/core/journals/british-journal-of-political-science/article/abs/mere-description/833643C6242D3A45D48BAAC3EF0C33D0
10 rules COVID Pharmacoepi https://pubmed.ncbi.nlm.nih.gov/34393782/
Dirty Dozen: Metric Experimenation Pitfalls in OCE https://exp-platform.com/Documents/2017-08%20KDDMetricInterpretationPitfalls.pdf
AB Testing Intuition Myth Busters https://www.researchgate.net/publication/361226478_AB_Testing_Intuition_Busters_Common_Misunderstandings_in_Online_Controlled_Experiments
Estimands https://arxiv.org/ftp/arxiv/papers/2106/2106.10577.pdf
<!--
## Conditional Probability
## No law to use ALL the data
## Ascribing characteristics at wrong granularity
ecological fallacy
(does this belong in causation chapter?)
## Finding policy-induced relationships
selection bias
## Ignoring heterogeneity
## "If trends continue"
## Analyzing time-to-event data
immortal time bias
## Answering the right question
Don't let available tools dictate the questions of interest
*The Cult of Statistical Significance* [@ziliak_mccloskey]
"Mindless Statistics" [@GIGERENZER2004587]
## Misguided Rigor
```{r}
set.seed(123)
t <- t.test(rnorm(100), rnorm(100))
print(t)
t$p.value
```
```{r}
pvals <- vapply(1:1000, FUN = function(x) t.test(rnorm(100), rnorm(100))$p.value, FUN.VALUE = numeric(1))
alpha <- 0.05
sum(pvals > (1-alpha/2) | pvals < alpha/2) / length(pvals)
```
```{r}
get_prop_sign <- function(n = 1000, alpha = 0.05) {
pvals <- vapply(1:n, FUN = function(x) t.test(rnorm(100), rnorm(100))$p.value, FUN.VALUE = numeric(1))
prop <- sum(pvals > (1-alpha/2) | pvals < alpha/2) / length(pvals)
return(prop)
}
```
Data dredging, p-hacking
## Sample splitting
The **nullabor** [@R-nullabor] R package
## Age Period Cohort
## Strategies
-->