Rename query types to not use parentheses or commas; fix TimescaleDB SQL bug

Parentheses and commas do not play nicely with terminals, so we
replace them with hyphens. Also, TimescaleDB was generating broken
SQL for multiple hosts (joining hostnames with OR instead of a comma inside an IN()).
RobAtticus committed Jun 4, 2018
1 parent 26bcc59 commit 902ad73
Showing 5 changed files with 45 additions and 45 deletions.
46 changes: 23 additions & 23 deletions README.md
@@ -129,7 +129,7 @@ Variables needed:
1. the same use case, seed, # of devices, and start time as used in data generation
1. an end time that is one second after the end time from data generation. E.g., for `2016-01-04T00:00:00Z` use `2016-01-04T00:00:01Z`
1. the number of queries to generate. E.g., `1000`
- 1. and the type of query you'd like to generate. E.g., `single-groupby(1,1,1)`
+ 1. and the type of query you'd like to generate. E.g., `single-groupby-1-1-1`

For the last step there are numerous queries to choose from, which are
listed in [Appendix I](#appendix-i-query-types). Additionally, the file
@@ -143,16 +143,16 @@ For generating just one set of queries for a given type:
$ tsbs_generate_queries -seed=123 -scale-var=4000 \
-timestamp-start="2016-01-01T00:00:00Z" \
-timestamp-end="2016-01-04T00:00:01Z" \
- -queries=1000 -query-type="single-groupby(1,1,1)" -format="timescaledb" \
- | gzip > /tmp/timescaledb-queries-single-groupby(1,1,1).gz
+ -queries=1000 -query-type="single-groupby-1-1-1" -format="timescaledb" \
+ | gzip > /tmp/timescaledb-queries-single-groupby-1-1-1.gz
```

For generating sets of queries for multiple types:
```bash
$ formats="timescaledb" scaleVar=4000 seed=123 \
tsStart="2016-01-01T00:00:00Z" \
tsEnd="2016-01-04T00:00:01Z" \
- queries=1000 queryTypes="single-groupby(1,1,1) single-groupby(1,1,12) double-groupby(1)" \
+ queries=1000 queryTypes="single-groupby-1-1-1 single-groupby-1-1-12 double-groupby-1" \
dataDir="/tmp" script/generate_queries.sh
```

@@ -257,10 +257,10 @@ to run multiple query types in a row. The queries it generates should be
put in a file with one query per line and the path given to the script.
For example, if you had a file named `queries.txt` that looked like this:
```text
- high-cpu(1)
- cpu-max-all(8)
+ high-cpu-1
+ cpu-max-all-8
groupby-orderby-limit
- double-groupby(1)
+ double-groupby-1
```

You could generate a run script named `query_test.sh`:
@@ -275,13 +275,13 @@ And the resulting script file would look like:
```bash
#!/bin/bash
# Queries
- cat /tmp/queries/timescaledb-high-cpu(1)-queries.gz | gunzip | query_benchmarker_timescaledb --workers=8 --limit=1000 --hosts="localhost" --postgres="user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-high-cpu(1)-queries.out
+ cat /tmp/queries/timescaledb-high-cpu-1-queries.gz | gunzip | query_benchmarker_timescaledb --workers=8 --limit=1000 --hosts="localhost" --postgres="user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-high-cpu-1-queries.out

- cat /tmp/queries/timescaledb-cpu-max-all(8)-queries.gz | gunzip | query_benchmarker_timescaledb --workers=8 --limit=1000 --hosts="localhost" --postgres="user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-cpu-max-all(8)-queries.out
+ cat /tmp/queries/timescaledb-cpu-max-all-8-queries.gz | gunzip | query_benchmarker_timescaledb --workers=8 --limit=1000 --hosts="localhost" --postgres="user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-cpu-max-all-8-queries.out

cat /tmp/queries/timescaledb-groupby-orderby-limit-queries.gz | gunzip | query_benchmarker_timescaledb --workers=8 --limit=1000 --hosts="localhost" --postgres="user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-groupby-orderby-limit-queries.out

- cat /tmp/queries/timescaledb-double-groupby(1)-queries.gz | gunzip | query_benchmarker_timescaledb --workers=8 --limit=1000 --hosts="localhost" --postgres="user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-double-groupby(1)-queries.out
+ cat /tmp/queries/timescaledb-double-groupby-1-queries.gz | gunzip | query_benchmarker_timescaledb --workers=8 --limit=1000 --hosts="localhost" --postgres="user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-double-groupby-1-queries.out
```
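
The mapping from a query type to one line of the run script can be sketched as follows. This is an illustrative Go version, not the Python in `scripts/generate_run_script.py`; `runLine` is a hypothetical helper, and the `/tmp/queries` directory, host, and connection string are hard-coded from the example output above.

```go
package main

import "fmt"

// runLine is a hypothetical helper that builds one benchmark command for a
// query type, following the file-name pattern in the example output:
//   input:  /tmp/queries/<db>-<queryType>-queries.gz
//   output: query_<db>_<db>-<queryType>-queries.out
func runLine(db, queryType string, workers, limit int) string {
	base := fmt.Sprintf("%s-%s-queries", db, queryType)
	return fmt.Sprintf(
		"cat /tmp/queries/%s.gz | gunzip | query_benchmarker_%s --workers=%d --limit=%d "+
			"--hosts=\"localhost\" --postgres=\"user=postgres sslmode=disable\" "+
			"| tee query_%s_%s.out",
		base, db, workers, limit, db, base)
}

func main() {
	// The same four query types as the queries.txt example.
	queryTypes := []string{"high-cpu-1", "cpu-max-all-8", "groupby-orderby-limit", "double-groupby-1"}
	for _, qt := range queryTypes {
		fmt.Println(runLine("timescaledb", qt, 8, 1000))
	}
}
```

Because the query type is embedded unquoted in both file paths, a name containing parentheses would need quoting before the generated line could run in a shell; the hyphenated names do not.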

### Query validation (optional)
@@ -296,18 +296,18 @@ the results.
### Devops / cpu-only
|Query type|Description|
|:---|:---|
- |single-groupby(1,1,1)| Simple aggregrate (MAX) on one metric for 1 host, every 5 mins for 1 hour
- |single-groupby(1,1,12)| Simple aggregrate (MAX) on one metric for 1 host, every 5 mins for 12 hours
- |single-groupby(1,8,1)| Simple aggregrate (MAX) on one metric for 8 hosts, every 5 mins for 1 hour
- |single-groupby(5,1,1)| Simple aggregrate (MAX) on 5 metrics for 1 host, every 5 mins for 1 hour
- |single-groupby(5,1,12)| Simple aggregrate (MAX) on 5 metrics for 1 host, every 5 mins for 12 hours
- |single-groupby(5,8,1)| Simple aggregrate (MAX) on 5 metrics for 8 hosts, every 5 mins for 1 hour
- |cpu-max-all(1)| Aggregate across all CPU metrics per hour over 1 hour for a single host
- |cpu-max-all(8)| Aggregate across all CPU metrics per hour over 1 hour for eight hosts
- |double-groupby(1)| Aggregate on across both time and host, giving the average of 1 CPU metric per host per hour for 24 hours
- |double-groupby(5)| Aggregate on across both time and host, giving the average of 5 CPU metrics per host per hour for 24 hours
- |double-groupby(all)| Aggregate on across both time and host, giving the average of all (10) CPU metrics per host per hour for 24 hours
- |high-cpu(all)| All the readings where one metric is above a threshold across all hosts
- |high-cpu(1)| All the readings where one metric is above a threshold for a particular host
+ |single-groupby-1-1-1| Simple aggregrate (MAX) on one metric for 1 host, every 5 mins for 1 hour
+ |single-groupby-1-1-12| Simple aggregrate (MAX) on one metric for 1 host, every 5 mins for 12 hours
+ |single-groupby-1-8-1| Simple aggregrate (MAX) on one metric for 8 hosts, every 5 mins for 1 hour
+ |single-groupby-5-1-1| Simple aggregrate (MAX) on 5 metrics for 1 host, every 5 mins for 1 hour
+ |single-groupby-5-1-12| Simple aggregrate (MAX) on 5 metrics for 1 host, every 5 mins for 12 hours
+ |single-groupby-5-8-1| Simple aggregrate (MAX) on 5 metrics for 8 hosts, every 5 mins for 1 hour
+ |cpu-max-all-1| Aggregate across all CPU metrics per hour over 1 hour for a single host
+ |cpu-max-all-8| Aggregate across all CPU metrics per hour over 1 hour for eight hosts
+ |double-groupby-1| Aggregate on across both time and host, giving the average of 1 CPU metric per host per hour for 24 hours
+ |double-groupby-5| Aggregate on across both time and host, giving the average of 5 CPU metrics per host per hour for 24 hours
+ |double-groupby-all| Aggregate on across both time and host, giving the average of all (10) CPU metrics per host per hour for 24 hours
+ |high-cpu-all| All the readings where one metric is above a threshold across all hosts
+ |high-cpu-1| All the readings where one metric is above a threshold for a particular host
|lastpoint| The last reading for each host
|groupby-orderby-limit| The last 5 aggregate readings (across time) before a randomly chosen endpoint
2 changes: 1 addition & 1 deletion cmd/tsbs_generate_queries/databases/timescaledb/devops.go
@@ -37,7 +37,7 @@ func (d *Devops) getHostWhereWithHostnames(hostnames []string) string {
for _, s := range hostnames {
hostnameClauses = append(hostnameClauses, fmt.Sprintf("'%s'", s))
}
- return fmt.Sprintf("tags_id IN (SELECT id FROM tags WHERE hostname IN (%s))", strings.Join(hostnameClauses, " OR "))
+ return fmt.Sprintf("tags_id IN (SELECT id FROM tags WHERE hostname IN (%s))", strings.Join(hostnameClauses, ","))
} else {
for _, s := range hostnames {
hostnameClauses = append(hostnameClauses, fmt.Sprintf("hostname = '%s'", s))
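
The effect of this one-line change can be reproduced in isolation. The following is a minimal sketch, not the repository's code: `hostnameInClause` and the host names `host_1` and `host_2` are hypothetical, and the separator is passed in as a parameter only so that the old and new behavior can be printed side by side.

```go
package main

import (
	"fmt"
	"strings"
)

// hostnameInClause is a hypothetical stand-in for the IN (...) branch of
// getHostWhereWithHostnames. It quotes each hostname and joins the quoted
// values with sep before placing them inside the inner IN (...).
func hostnameInClause(hostnames []string, sep string) string {
	quoted := make([]string, 0, len(hostnames))
	for _, h := range hostnames {
		quoted = append(quoted, fmt.Sprintf("'%s'", h))
	}
	return fmt.Sprintf(
		"tags_id IN (SELECT id FROM tags WHERE hostname IN (%s))",
		strings.Join(quoted, sep))
}

func main() {
	hosts := []string{"host_1", "host_2"} // hypothetical host names

	// Separator used before the fix: the values are chained with OR.
	fmt.Println(hostnameInClause(hosts, " OR "))

	// Separator used after the fix: a comma-separated value list.
	fmt.Println(hostnameInClause(hosts, ","))
}
```

With `" OR "` the inner `IN (...)` receives one OR expression over string literals rather than a list of values, which is consistent with the broken SQL the commit message describes. With `","` each hostname is a separate list element.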
30 changes: 15 additions & 15 deletions cmd/tsbs_generate_queries/main.go
@@ -23,21 +23,21 @@ import (

var useCaseMatrix = map[string]map[string]utils.QueryFillerMaker{
"devops": {
- devops.LabelSingleGroupby + "(1,1,1)": devops.NewSingleGroupby(1, 1, 1),
- devops.LabelSingleGroupby + "(1,1,12)": devops.NewSingleGroupby(1, 1, 12),
- devops.LabelSingleGroupby + "(1,8,1)": devops.NewSingleGroupby(1, 8, 1),
- devops.LabelSingleGroupby + "(5,1,1)": devops.NewSingleGroupby(5, 1, 1),
- devops.LabelSingleGroupby + "(5,1,12)": devops.NewSingleGroupby(5, 1, 12),
- devops.LabelSingleGroupby + "(5,8,1)": devops.NewSingleGroupby(5, 8, 1),
- devops.LabelMaxAll + "(1)": devops.NewMaxAllCPU(1),
- devops.LabelMaxAll + "(8)": devops.NewMaxAllCPU(1),
- devops.LabelDoubleGroupby + "(1)": devops.NewGroupBy(1),
- devops.LabelDoubleGroupby + "(5)": devops.NewGroupBy(5),
- devops.LabelDoubleGroupby + "(all)": devops.NewGroupBy(devops.GetCPUMetricsLen()),
- devops.LabelGroupbyOrderbyLimit: devops.NewGroupByOrderByLimit,
- devops.LabelHighCPU + "(all)": devops.NewHighCPU(0),
- devops.LabelHighCPU + "(1)": devops.NewHighCPU(1),
- devops.LabelLastpoint: devops.NewLastPointPerHost,
+ devops.LabelSingleGroupby + "-1-1-1": devops.NewSingleGroupby(1, 1, 1),
+ devops.LabelSingleGroupby + "-1-1-12": devops.NewSingleGroupby(1, 1, 12),
+ devops.LabelSingleGroupby + "-1-8-1": devops.NewSingleGroupby(1, 8, 1),
+ devops.LabelSingleGroupby + "-5-1-1": devops.NewSingleGroupby(5, 1, 1),
+ devops.LabelSingleGroupby + "-5-1-12": devops.NewSingleGroupby(5, 1, 12),
+ devops.LabelSingleGroupby + "-5-8-1": devops.NewSingleGroupby(5, 8, 1),
+ devops.LabelMaxAll + "-1": devops.NewMaxAllCPU(1),
+ devops.LabelMaxAll + "-8": devops.NewMaxAllCPU(1),
+ devops.LabelDoubleGroupby + "-1": devops.NewGroupBy(1),
+ devops.LabelDoubleGroupby + "-5": devops.NewGroupBy(5),
+ devops.LabelDoubleGroupby + "-all": devops.NewGroupBy(devops.GetCPUMetricsLen()),
+ devops.LabelGroupbyOrderbyLimit: devops.NewGroupByOrderByLimit,
+ devops.LabelHighCPU + "-all": devops.NewHighCPU(0),
+ devops.LabelHighCPU + "-1": devops.NewHighCPU(1),
+ devops.LabelLastpoint: devops.NewLastPointPerHost,
},
}
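
To illustrate why the renamed keys are easier to handle on a command line, here is a small sketch. `queryTypeName` and `shellSafe` are hypothetical and do not exist in the repository, and the rule "letters, digits, and hyphens only" is an assumption about what counts as terminal-friendly, not a rule stated by the commit.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// shellSafe matches names made up only of ASCII letters, digits, and hyphens.
// This pattern is an assumption chosen for illustration.
var shellSafe = regexp.MustCompile(`^[A-Za-z0-9-]+$`)

// queryTypeName is a hypothetical helper that joins a label and its
// parameters with hyphens, mirroring the new key format in useCaseMatrix.
func queryTypeName(label string, params ...string) string {
	return strings.Join(append([]string{label}, params...), "-")
}

func main() {
	names := []string{
		queryTypeName("single-groupby", "1", "1", "12"), // new style
		queryTypeName("double-groupby", "all"),          // new style
		"single-groupby(1,1,12)",                        // old style, for comparison
	}
	for _, name := range names {
		fmt.Printf("%-24s shell-safe: %v\n", name, shellSafe.MatchString(name))
	}
}
```

Shells such as bash give unquoted parentheses a special meaning, so names in the old form have to be quoted wherever they appear in a command or a file path; the hyphenated form can be used as-is.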

2 changes: 1 addition & 1 deletion scripts/generate_queries.sh
@@ -7,7 +7,7 @@ useJson=${useJson:-false}
useTags=${useTags:-true}

formats=${formats:-"timescaledb"}
- queryTypes=${queryTypes:-"single-groupby(1,1,1) single-groupby(1,1,12) single-groupby(1,8,1) single-groupby(5,1,1) single-groupby(5,1,12) single-groupby(5,8,1) double-groupby(1) double-groupby(5) double-groupby(all) cpu-max-all(1) cpu-max-all(8) high-cpu(all) high-cpu(1) lastpoint groupby-orderby-limit"}
+ queryTypes=${queryTypes:-"single-groupby-1-1-1 single-groupby-1-1-12 single-groupby-1-8-1 single-groupby-5-1-1 single-groupby-5-1-12 single-groupby-5-8-1 double-groupby-1 double-groupby-5 double-groupby-all cpu-max-all-1 cpu-max-all-8 high-cpu-all high-cpu-1 lastpoint groupby-orderby-limit"}

scaleVar=${scaleVar:-"4000"}
queries=${queries:-"1000"}
10 changes: 5 additions & 5 deletions scripts/generate_run_script.py
@@ -31,9 +31,9 @@
EXAMPLE:
queries.txt:
- #single-groupby(1,1,1)
- single-groupby(5,1,1)
- single-groupby(5,1,12)
+ #single-groupby-1-1-1
+ single-groupby-5-1-1
+ single-groupby-5-1-12
Command:
python generate_query_run -d timescaledb -w 8
@@ -44,9 +44,9 @@
NUM_WORKERS=8 BULK_DATA_DIR=/tmp DATABASE_HOST=localhost BATCH_SIZE=10000 ./load_timescaledb.sh | tee load_timescaledb_8_10000.out
# Queries
- cat /tmp/queries/timescaledb-single-groupby(5,1,1)-queries.gz | gunzip | tsbs_run_queries_timescaledb -workers 8 -limit 1000 -postgres "host=localhost user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-single-groupby(5,1,1)-queries.out
+ cat /tmp/queries/timescaledb-single-groupby-5-1-1-queries.gz | gunzip | tsbs_run_queries_timescaledb -workers 8 -limit 1000 -postgres "host=localhost user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-single-groupby-5-1-1-queries.out
- cat /tmp/queries/timescaledb-single-groupby(5,1,12)-queries.gz | gunzip | tsbs_run_queries_timescaledb -workers 8 -limit 1000 -postgres "host=localhost user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-single-groupby(5,1,12)-queries.out
+ cat /tmp/queries/timescaledb-single-groupby-5-1-12-queries.gz | gunzip | tsbs_run_queries_timescaledb -workers 8 -limit 1000 -postgres "host=localhost user=postgres sslmode=disable" | tee query_timescaledb_timescaledb-single-groupby-5-1-12-queries.out
'''
import argparse
import os