-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME.Rmd
78 lines (56 loc) · 1.93 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
---
output: rmarkdown::github_document
---
# rrecog
Pattern Recognition for Hosts, Services and Content
## Description
'Rapid7' developed a framework dubbed 'Recog' <https://github.com/rapid7/recog> to facilitate fingerprinting hosts, services and content. The original program was written in 'Ruby'. Tools are provided to download and match fingerprints using R.
## What's Inside The Tin
The following functions are implemented:
- `download_fingerprints`: Download fingerprints from the Recog repo
- `load_fingerprints`: Load a directory of fingerprints
- `read_fingerprints_file`: Ingest Recog XML fingerprints from a file and precompile regular expressions
- `recog_match`: Find fingerprint matches for a given source
- `recog_pick`: Find first fingerprint match for a given source
- `use_builtin_fingerprints`: Use built-in fingerprints
## Installation
```{r eval=FALSE}
devtools::install_github("hrbrmstr/rrecog")
```
```{r message=FALSE, warning=FALSE, error=FALSE, include=FALSE}
options(width=120)
```
## Usage
```{r message=FALSE, warning=FALSE, error=FALSE}
library(rrecog)
# current verison
packageVersion("rrecog")
```
### Use Real Data
```{r}
library(httr)
library(tidyverse)
# using the internet as a data source is fraught with peril
safe_GET <- safely(httr::GET)
sprintf(
fmt = "http://%s",
c(
"r-project.org", "pypi.org", "www.mvnrepository.com", "spark.apache.org",
"www.oracle.com", "www.microsoft.com", "www.apple.com", "feedly.com"
)
) -> use_these
pb <- progress_estimated(length(use_these))
map(use_these, ~{
pb$tick()$print()
res <- safe_GET(.x, httr::timeout(2))
if (is.null(res$result)) return(NULL)
res$result$headers$server
}) %>%
compact() %>%
flatten_chr() -> server_headers
server_headers
recog_db <- use_builtin_fingerprints()
map_df(server_headers, ~recog_match(recog_db, .x, "http")) %>%
glimpse() -> found
select(found, orig, service.vendor, service.version, apache.info, description)
```