-
Notifications
You must be signed in to change notification settings - Fork 31.3k
/
Copy pathindex.Rmd
144 lines (82 loc) · 4.1 KB
/
index.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
---
title : What is data?
subtitle :
author : Jeffrey Leek
job : Johns Hopkins Bloomberg School of Public Health
logo : bloomberg_shield.png
framework : io2012 # {io2012, html5slides, shower, dzslides, ...}
highlighter : highlight.js # {highlight.js, prettify, highlight}
hitheme : tomorrow #
url:
lib: ../../libraries
assets: ../../assets
widgets : [mathjax] # {mathjax, quiz, bootstrap}
mode : selfcontained # {standalone, draft}
---
```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F}
# make this an external chunk that can be included in any file
options(width = 70)
opts_chunk$set(message = F, error = F, warning = F, echo = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache = T, cache.path = '.cache/', fig.path = 'fig/')
options(xtable.type = 'html')
knit_hooks$set(inline = function(x) {
if(is.numeric(x)) {
round(x, getOption('digits'))
} else {
paste(as.character(x), collapse = ', ')
}
})
knit_hooks$set(plot = knitr:::hook_plot_html)
```
## Definition of data
<q>Data are values of qualitative or quantitative variables, belonging to a set of items.</q>
[http://en.wikipedia.org/wiki/Data](http://en.wikipedia.org/wiki/Data)
---
## Definition of data
<q>Data are values of qualitative or quantitative variables, belonging to a <redtext>set of items</redtext>.</q>
[http://en.wikipedia.org/wiki/Data](http://en.wikipedia.org/wiki/Data)
__Set of items__: Sometimes called the population; the set of objects you are interested in
---
## Definition of data
<q>Data are values of qualitative or quantitative <redtext>variables</redtext>, belonging to a set of items.</q>
[http://en.wikipedia.org/wiki/Data](http://en.wikipedia.org/wiki/Data)
__Variables__: A measurement or characteristic of an item.
---
## Definition of data
<q>Data are values of <redtext>qualitative</redtext> or <redtext>quantitative</redtext> variables, belonging to a set of items.</q>
[http://en.wikipedia.org/wiki/Data](http://en.wikipedia.org/wiki/Data)
__Qualitative__: Country of origin, sex, treatment
__Quantitative__: Height, weight, blood pressure
---
## What do data look like?
<img class=center src="../../assets/img/01_DataScientistToolbox/fastq.png" height=400 />
[http://brianknaus.com/software/srtoolbox/s_4_1_sequence80.txt](http://brianknaus.com/software/srtoolbox/s_4_1_sequence80.txt)
---
## What do data look like?
<img class=center src="../../assets/img/01_DataScientistToolbox/twitter.png" height=400 />
[https://dev.twitter.com/docs/api/1/get/blocks/blocking](https://dev.twitter.com/docs/api/1/get/blocks/blocking)
---
## What do data look like?
<img class=center src="../../assets/img/01_DataScientistToolbox/medicalrecord.png" height=400 />
[http://blue-button.github.com/challenge/](http://blue-button.github.com/challenge/)
---
## What do data look like?
<img class=center src="../../assets/img/01_DataScientistToolbox/deepcat.jpg" height=400 />
[http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html?pagewanted=all&_r=0](http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html?pagewanted=all&_r=0)
---
## What do data look like?
<img class=center src="../../assets/img/01_DataScientistToolbox/darwintunes.png" height=400 />
[http://www.pnas.org/content/109/30/12081.full](http://www.pnas.org/content/109/30/12081.full)
[https://soundcloud.com/uncoolbob/sets/darwintunes](https://soundcloud.com/uncoolbob/sets/darwintunes)
---
## What do data look like?
<img class=center src="../../assets/img/01_DataScientistToolbox/datagov.png" height=400 />
[http://www.data.gov/](http://www.data.gov/)
-------
## What do data look like? Rarely
<img class=center src="../../assets/img/01_DataScientistToolbox/excel.png" height=400 />
-------
## The data is the second most important thing
* The most important thing in data science is the question
* The second most important is the data
* Often the data will limit or enable the questions
* But having data can't save you if you don't have a question