forked from timescale/tsbs
Commit
Add docs/ dir for supplemental doc material
Each database should have a supplemental doc explaining important details about how the data is generated, perhaps how it is stored, and additional flags for its client binaries. Also, this removes any unnecessary flags from binaries.
1 parent 86646ac, commit 3869237
Showing 8 changed files with 444 additions and 20 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
# TSBS Supplemental Guide: Cassandra | ||
|
||
Cassandra is a general column store database. This supplemental guide explains | ||
how the data generated for TSBS is stored, additional flags available when | ||
using the data importer (`tsbs_load_cassandra`), and additional flags | ||
available for the query runner (`tsbs_run_queries_cassandra`). **This | ||
should be read *after* the main README.** | ||
|
||
## Data format | ||
|
||
Data generated by `tsbs_generate_data` for Cassandra is a "pseudo-CSV" format. | ||
Each reading is a single line of comma-separated elements, in the | ||
following order: | ||
* first, the table the reading belongs to (based on data type, e.g., `series_double` for doubles); | ||
* then, the data source (e.g., `cpu` for `cpu-only`); | ||
* then, several elements of the form `<label>=<value>` for tags; | ||
* then, the field label; | ||
* then, the date of the reading in YYYY-MM-DD form; | ||
* then, the timestamp in nanoseconds; | ||
* and finally, the reading itself. | ||
|
||
An example from `cpu-only`: | ||
```text | ||
series_double,cpu,hostname=host_0,region=eu-west-1,datacenter=eu-west-1b,rack=67,os=Ubuntu16.10,arch=x86,team=NYC,service=7,service_version=0,service_environment=production,usage_guest_nice,2016-01-01,1451606400000000000,38.2431182911542820 | ||
``` | ||
|
||
When stored, the elements starting with the data source (e.g. `cpu`) through | ||
the date of the reading are concatenated to serve as the primary key. | ||
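
The layout described above can be sketched in Go as follows. This is a minimal illustration, not the TSBS loader's actual code: the `cassandraReading` struct, the function names, and the comma used to join the key are all assumptions made for this example, and it assumes tag values never contain commas.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cassandraReading is a hypothetical struct for illustration; it is not a TSBS type.
type cassandraReading struct {
	Table       string
	Source      string
	Tags        []string // each element has the form label=value
	Field       string
	Day         string // YYYY-MM-DD
	TimestampNs int64
	Value       string // kept as text because the type depends on the table
}

// parseCassandraLine splits one pseudo-CSV line into its parts by position:
// table, source, zero or more tags, field label, date, timestamp, value.
func parseCassandraLine(line string) (cassandraReading, error) {
	parts := strings.Split(line, ",")
	n := len(parts)
	if n < 6 {
		return cassandraReading{}, fmt.Errorf("expected at least 6 elements, got %d", n)
	}
	ts, err := strconv.ParseInt(parts[n-2], 10, 64)
	if err != nil {
		return cassandraReading{}, fmt.Errorf("bad timestamp: %w", err)
	}
	return cassandraReading{
		Table:       parts[0],
		Source:      parts[1],
		Tags:        parts[2 : n-4],
		Field:       parts[n-4],
		Day:         parts[n-3],
		TimestampNs: ts,
		Value:       parts[n-1],
	}, nil
}

// seriesKey joins the elements from the data source through the date.
// The comma separator is an assumption made for this sketch.
func seriesKey(r cassandraReading) string {
	elems := append([]string{r.Source}, r.Tags...)
	elems = append(elems, r.Field, r.Day)
	return strings.Join(elems, ",")
}

func main() {
	line := "series_double,cpu,hostname=host_0,region=eu-west-1,usage_guest_nice,2016-01-01,1451606400000000000,38.2431182911542820"
	r, err := parseCassandraLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("table:", r.Table)
	fmt.Println("key:", seriesKey(r))
}
```

Parsing from both ends of the line means the number of tags does not need to be known in advance.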
|
||
--- | ||
|
||
## `tsbs_load_cassandra` Additional Flags | ||
|
||
### Database related | ||
|
||
#### `-consistency` (type: `string`, default: `ALL`) | ||
|
||
Consistency level for writes to the database. Options are `ALL`, `ANY`, `ONE`, | ||
`TWO`, `THREE`, or `QUORUM`. Only applies to a multi-node cluster. | ||
|
||
#### `-hosts` (type: `string`, default: `localhost:9042`) | ||
|
||
Comma-separated list of hostname and port combinations for nodes in the cluster. | ||
|
||
#### `-replication-factor` (type: `int`, default: `1`) | ||
|
||
Level of replication for each write, i.e., the number of nodes to store the | ||
data on. Only applies to a multi-node cluster. | ||
|
||
#### `-write-timeout` (type: `duration`, default: `10s`) | ||
|
||
Length of the timeout for writes. It is expressed as a Go `time.Duration` | ||
string, meaning a number followed by a unit abbreviation (s = seconds, | ||
m = minutes, h = hours), e.g., the default `10s` is ten seconds. | ||
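
To make the duration syntax concrete, here is a small sketch using Go's standard `flag` package, which parses duration flags with `time.ParseDuration`. The flag set name and the helper `parseWriteTimeout` are hypothetical and exist only for this example; this is not the loader's actual flag-handling code.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// parseWriteTimeout is a hypothetical helper that declares a duration flag
// with a 10s default and parses it from the given arguments.
func parseWriteTimeout(args []string) (time.Duration, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	timeout := fs.Duration("write-timeout", 10*time.Second, "timeout for writes")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *timeout, nil
}

func main() {
	for _, args := range [][]string{
		nil, // no flag given: falls back to the default
		{"-write-timeout=30s"},
		{"-write-timeout=1m30s"},
		{"-write-timeout=2h"},
	} {
		d, err := parseWriteTimeout(args)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%v -> %v\n", args, d)
	}
}
```

Units can be combined in one value, so `1m30s` is ninety seconds.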
|
||
|
||
--- | ||
|
||
## `tsbs_run_queries_cassandra` Additional Flags | ||
|
||
### Database related | ||
|
||
#### `-aggregation-plan` (type: `string`, default: `client`) | ||
|
||
Method for doing aggregations in queries. Due to limitations in Cassandra's | ||
SQL-like language CQL, aggregations can be painful and slow if done on the | ||
server itself. Therefore the default is `client` (with the other valid option | ||
being `server`), where the client Go program handles the aggregation. | ||
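
As a rough illustration of what client-side aggregation means, the sketch below takes raw points that a query has already returned and computes the maximum per fixed-width time bucket in Go. The `point` type and `maxPerBucket` function are hypothetical; the actual queries and aggregation logic used by `tsbs_run_queries_cassandra` are not shown in this guide.

```go
package main

import (
	"fmt"
	"time"
)

// point is a hypothetical raw reading returned by a query.
type point struct {
	TimestampNs int64
	Value       float64
}

// maxPerBucket groups points into fixed-width time buckets, keyed by the
// bucket's start time in nanoseconds, and keeps the largest value in each.
// It assumes width is positive and timestamps are non-negative.
func maxPerBucket(points []point, width time.Duration) map[int64]float64 {
	out := make(map[int64]float64)
	w := width.Nanoseconds()
	for _, p := range points {
		bucket := p.TimestampNs - p.TimestampNs%w
		if cur, ok := out[bucket]; !ok || p.Value > cur {
			out[bucket] = p.Value
		}
	}
	return out
}

func main() {
	minute := time.Minute.Nanoseconds()
	points := []point{
		{TimestampNs: 0, Value: 10},
		{TimestampNs: minute / 2, Value: 42},
		{TimestampNs: minute, Value: 7},
	}
	fmt.Println(maxPerBucket(points, time.Minute))
}
```

The trade-off is that the client must fetch every raw row in the time range, in exchange for avoiding aggregation work on the server.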
|
||
#### `-client-side-index-timeout` (type: `duration`, default: `10s`) | ||
|
||
Length of the timeout when setting up the client-side index, a data structure | ||
used to speed up queries by storing the tagsets/primary keys in memory on the | ||
client. It is expressed as a Go `time.Duration` string, meaning a number | ||
followed by a unit abbreviation (s = seconds, m = minutes, h = hours), e.g., | ||
the default `10s` is ten seconds. | ||
|
||
#### `-host` (type: `string`, default: `localhost:9042`) | ||
|
||
Hostname and port combination of at least one node in the cluster. The library | ||
used will discover the other nodes for queries. | ||
|
||
#### `-read-timeout` (type: `duration`, default: `10s`) | ||
|
||
Length of the timeout for reads. It is expressed as a Go `time.Duration` | ||
string, meaning a number followed by a unit abbreviation (s = seconds, | ||
m = minutes, h = hours), e.g., the default `10s` is ten seconds. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
# TSBS Supplemental Guide: InfluxDB | ||
|
||
InfluxDB is a purpose-built time-series database written in Go by | ||
InfluxData. This supplemental guide explains how | ||
the data generated for TSBS is stored, additional flags available when | ||
using the data importer (`tsbs_load_influx`), and additional flags | ||
available for the query runner (`tsbs_run_queries_influx`). **This | ||
should be read *after* the main README.** | ||
|
||
## Data format | ||
|
||
Data generated by `tsbs_generate_data` for InfluxDB is serialized in a | ||
"pseudo-CSV" format. Each reading is a single line with three | ||
space-separated sections: first the name of the table followed by | ||
comma-separated tags, then comma-separated fields, and finally the timestamp | ||
for the reading. Tags and fields are in the format `<label>=<value>`. | ||
|
||
An example for the `cpu-only` use case: | ||
```text | ||
cpu,hostname=host_0,region=eu-central-1,datacenter=eu-central-1b,rack=21,os=Ubuntu15.10,arch=x86,team=SF,service=6,service_version=0,service_environment=test usage_user=58.1317132304976170,usage_system=2.6224297271376256,usage_idle=24.9969495069947882,usage_nice=61.5854484633778867,usage_iowait=22.9481393231639395,usage_irq=63.6499207106198313,usage_softirq=6.4098777048301052,usage_steal=44.8799140503027445,usage_guest=80.5028770761136201,usage_guest_nice=38.2431182911542820 1451606400000000000 | ||
``` | ||
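
The example line above can be taken apart with a few string splits. The sketch below is an illustration only: `influxPoint`, `splitPairs`, and `parseInfluxLine` are hypothetical names, and the code assumes that names and values contain no spaces, commas, or escaped characters.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// influxPoint is a hypothetical struct for illustration; it is not a TSBS type.
type influxPoint struct {
	Measurement string
	Tags        map[string]string
	Fields      map[string]string // values kept as text
	TimestampNs int64
}

// splitPairs turns "a=1,b=2" into a map.
func splitPairs(s string) (map[string]string, error) {
	out := make(map[string]string)
	for _, kv := range strings.Split(s, ",") {
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("malformed pair %q", kv)
		}
		out[parts[0]] = parts[1]
	}
	return out, nil
}

// parseInfluxLine expects "<table>[,<tags>] <fields> <timestamp>".
func parseInfluxLine(line string) (influxPoint, error) {
	sections := strings.Split(line, " ")
	if len(sections) != 3 {
		return influxPoint{}, fmt.Errorf("expected 3 sections, got %d", len(sections))
	}
	head := strings.SplitN(sections[0], ",", 2)
	p := influxPoint{Measurement: head[0], Tags: map[string]string{}}
	var err error
	if len(head) == 2 {
		if p.Tags, err = splitPairs(head[1]); err != nil {
			return influxPoint{}, err
		}
	}
	if p.Fields, err = splitPairs(sections[1]); err != nil {
		return influxPoint{}, err
	}
	if p.TimestampNs, err = strconv.ParseInt(sections[2], 10, 64); err != nil {
		return influxPoint{}, err
	}
	return p, nil
}

func main() {
	p, err := parseInfluxLine("cpu,hostname=host_0,region=eu-central-1 usage_user=58.13,usage_system=2.62 1451606400000000000")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(p.Measurement, len(p.Tags), len(p.Fields), p.TimestampNs)
}
```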
|
||
--- | ||
|
||
## `tsbs_load_influx` Additional Flags | ||
|
||
### Database related | ||
|
||
#### `-consistency` (type: `string`, default: `all`) | ||
|
||
Consistency level for writes to the database. Options are `all`, `any`, `one`, | ||
or `quorum`. Only applies for the clustered version. | ||
|
||
#### `-do-abort-on-exist` (type: `boolean`, default: `true`) | ||
|
||
Whether to abort the benchmark if the named database already exists. This | ||
prevents accidentally overwriting a database of the same name or a previous run | ||
of the benchmark. | ||
|
||
#### `-replication-factor` (type: `int`, default: `1`) | ||
|
||
Level of replication for each write, i.e., number of nodes to store the | ||
data on. Only applies for the clustered version. | ||
|
||
#### `-urls` (type: `string`, default: `http://localhost:8086`) | ||
|
||
Comma-separated list of URLs to connect to for inserting data. Workers will be | ||
distributed in a round robin fashion across the URLs. | ||
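
Round-robin distribution can be pictured as follows. This is a sketch under the assumption that worker `i` simply takes URL `i mod n`; the function `assignURLs` is hypothetical and the loader's real assignment code may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// assignURLs returns, for each worker index, the URL that worker would use
// when URLs are handed out in round-robin order.
func assignURLs(urls string, workers int) []string {
	list := strings.Split(urls, ",")
	out := make([]string, workers)
	for i := range out {
		out[i] = list[i%len(list)]
	}
	return out
}

func main() {
	// The host names below are placeholders for this example.
	assigned := assignURLs("http://node-a:8086,http://node-b:8086", 5)
	for worker, url := range assigned {
		fmt.Printf("worker %d -> %s\n", worker, url)
	}
}
```

With two URLs and five workers, three workers end up on the first URL and two on the second, so the load is most even when the worker count is a multiple of the URL count.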
|
||
### Miscellaneous | ||
|
||
#### `-backoff` (type: `duration`, default: `1s`) | ||
|
||
The amount of time to wait before each retry attempt when the server says it | ||
is too busy. A longer backoff can reduce write performance by waiting too long | ||
to retry, leaving the system idle. It is expressed as a Go `time.Duration` | ||
string, meaning a number followed by a unit abbreviation (s = seconds, | ||
m = minutes, h = hours), e.g., the default `1s` is one second. | ||
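
The retry behavior can be sketched as a simple loop. Everything named here (`errBusy`, `writeWithBackoff`) is hypothetical; how the loader actually detects a busy server is not covered in this guide, and this sketch retries indefinitely, which a real client may not do.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errBusy is a hypothetical sentinel standing in for a "server too busy" response.
var errBusy = errors.New("server busy")

// writeWithBackoff calls write until it returns something other than errBusy,
// sleeping for backoff between attempts. It returns the number of retries.
func writeWithBackoff(write func() error, backoff time.Duration) (int, error) {
	retries := 0
	for {
		err := write()
		if !errors.Is(err, errBusy) {
			return retries, err
		}
		retries++
		time.Sleep(backoff)
	}
}

func main() {
	attempts := 0
	// A fake write that reports "busy" twice before succeeding.
	fakeWrite := func() error {
		attempts++
		if attempts < 3 {
			return errBusy
		}
		return nil
	}
	retries, err := writeWithBackoff(fakeWrite, 10*time.Millisecond)
	fmt.Println("retries:", retries, "err:", err)
}
```

Each retry leaves the worker idle for the full backoff, which is why a long backoff can lower measured write throughput.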
|
||
#### `-gzip` (type: `boolean`, default: `true`) | ||
|
||
Whether to encode writes to the server with gzip. Gzip encoding generally | ||
gives the best performance, but if the server does not support gzip or has it | ||
disabled, this flag should be set to false. | ||
|
||
--- | ||
|
||
## `tsbs_run_queries_influx` Additional Flags | ||
|
||
### Database related | ||
|
||
#### `-chunk-response-size` (type: `int`, default: `0`) | ||
|
||
Number of series to return per response per query. If the query would generate | ||
a response that is very large, it could cause the server to crash with | ||
out-of-memory problems. This flag will chunk the response into multiple smaller | ||
responses to prevent the server from crashing. The default of 0 will return | ||
everything in a single response. | ||
|
||
#### `-urls` (type: `string`, default: `http://localhost:8086`) | ||
|
||
Comma-separated list of URLs to connect to for querying. Workers will be | ||
distributed in a round robin fashion across the URLs. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
# TSBS Supplemental Guide: MongoDB | ||
|
||
MongoDB is a general NoSQL database that stores data as JSON-like documents. | ||
This supplemental guide explains how the data generated for TSBS is stored, additional flags available when | ||
using the data importer (`tsbs_load_mongo`), and additional flags | ||
available for the query runner (`tsbs_run_queries_mongo`). **This | ||
should be read *after* the main README.** | ||
|
||
## Data format | ||
|
||
Data generated by `tsbs_generate_data` for MongoDB is serialized as a | ||
FlatBuffer to represent each reading. The serialized form is not (easily) | ||
human readable, but the FlatBuffer schema is specified as follows: | ||
```text | ||
// mongo.fbs | ||
namespace serialize; | ||
table MongoTag { | ||
  key:string; | ||
  value:string; | ||
} | ||
table MongoReading { | ||
  key:string; | ||
  value:double; | ||
} | ||
table MongoPoint { | ||
  measurementName:string; | ||
  timestamp:long; | ||
  tags:[MongoTag]; | ||
  fields:[MongoReading]; | ||
} | ||
root_type MongoPoint; | ||
``` | ||
|
||
--- | ||
|
||
## `tsbs_load_mongo` Additional Flags | ||
|
||
### Database related | ||
|
||
#### `-url` (type: `string`, default: `localhost:27017`) | ||
|
||
URL for connecting to the MongoDB server daemon. | ||
|
||
#### `-write-timeout` (type: `duration`, default: `10s`) | ||
|
||
Length of the timeout for writes. It is expressed as a Go `time.Duration` | ||
string, meaning a number followed by a unit abbreviation (s = seconds, | ||
m = minutes, h = hours), e.g., the default `10s` is ten seconds. | ||
|
||
|
||
### Miscellaneous | ||
|
||
#### `-document-per-event` (type: `boolean`, default: `false`) | ||
|
||
Store each data reading as a separate document instead of the default aggregated | ||
format. The default aggregated format stores an hour's worth of readings for | ||
a particular device in one document and uses updates for a more efficient | ||
storage model. However, for testing or comparison, this flag is provided to use | ||
a model where each data reading is stored as a single document. | ||
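
The difference between the two storage models comes down to how readings are grouped. The sketch below groups readings by device and hour, which is the grouping the aggregated format implies. The `reading` type, the key format, and both functions are assumptions made for illustration; the actual document layout used by `tsbs_load_mongo` is not described here.

```go
package main

import (
	"fmt"
	"time"
)

// reading is a hypothetical single data reading.
type reading struct {
	Device      string
	TimestampNs int64
	Value       float64
}

// hourKey identifies the hypothetical aggregated document a reading belongs
// to: one per device per UTC hour. The key format is an assumption.
func hourKey(r reading) string {
	hour := time.Unix(0, r.TimestampNs).UTC().Truncate(time.Hour)
	return r.Device + "_" + hour.Format("2006-01-02T15")
}

// groupByHour collects readings into per-device, per-hour groups. With
// document-per-event storage, each reading would instead be its own document.
func groupByHour(readings []reading) map[string][]reading {
	docs := make(map[string][]reading)
	for _, r := range readings {
		k := hourKey(r)
		docs[k] = append(docs[k], r)
	}
	return docs
}

func main() {
	start := int64(1451606400000000000) // the timestamp used in the examples in these guides
	readings := []reading{
		{Device: "host_0", TimestampNs: start, Value: 1},
		{Device: "host_0", TimestampNs: start + (30 * time.Minute).Nanoseconds(), Value: 2},
		{Device: "host_0", TimestampNs: start + time.Hour.Nanoseconds(), Value: 3},
	}
	for key, group := range groupByHour(readings) {
		fmt.Println(key, len(group))
	}
}
```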
|
||
--- | ||
|
||
## `tsbs_run_queries_mongo` Additional Flags | ||
|
||
### Database related | ||
|
||
#### `-url` (type: `string`, default: `localhost:27017`) | ||
|
||
URL for connecting to the MongoDB server daemon. | ||
|
||
#### `-read-timeout` (type: `duration`, default: `10s`) | ||
|
||
Length of the timeout for reads. It is expressed as a Go `time.Duration` | ||
string, meaning a number followed by a unit abbreviation (s = seconds, | ||
m = minutes, h = hours), e.g., the default `10s` is ten seconds. |