
Add RethinkDB integration #5715

Merged: 165 commits merged into master on Mar 27, 2020
Conversation

@florimondmanca (Contributor) commented Feb 12, 2020

What does this PR do?

Add a new integration for RethinkDB.

Work in progress.

Items still TODO:

  • Investigate using QueryManager for default metrics => Created a DocumentQuery helper for querying metrics from JSON DBs
  • Collect "config totals" metrics.
  • Version metadata.
  • Authentication.
  • TLS support.
  • Collect "current issues" metrics.
  • Custom tags.
  • E2E.
  • Write up README.md.
  • Config spec.
  • metadata.csv.
  • manifest.json.
  • service_checks.json.

Later:

  • Logs.
  • OOTB dashboard.

Motivation

Allow users to monitor RethinkDB clusters with Datadog.

Additional Notes

To run the check locally:

# Must use --dev to install dependencies (they're not yet in master)
ddev env start --dev rethinkdb py38-2.3
ddev env check rethinkdb py38-2.3

Metadata generation

# Generate metadata.csv
cat ../architecture/rfcs/agent-integrations/rethinkdb.md | python rfc_md_to_metadata_csv.py > rethinkdb/metadata.csv

# Edit `rethinkdb.py` by wrapping `config.collect_metrics(conn)` in `dump_metrics()`.
# Then run a check to dump metrics.
ddev test -pa="tests/test_rethinkdb.py::test_check" rethinkdb:py38-2.3

# Compare submitted metrics with metadata.csv
python validate_metrics.py rethinkdb/metadata.csv rethinkdb/metrics.csv
Source code for the helper scripts:
  • rfc_md_to_metadata_csv.py: generate contents of metadata.csv from tables in the ### Metrics section of an RFC.
import csv
import io
import sys

import bs4
import httpx

from datadog_checks.dev.tooling.config import load_config
from datadog_checks.dev.tooling.github import get_auth_info


def markdown_to_html(markdown: str) -> str:
    config = load_config()

    url = "https://api.github.com/markdown"
    payload = {"text": markdown}
    auth = get_auth_info(config)

    response = httpx.post(url, json=payload, auth=auth)
    assert response.status_code == 200, response.json()

    return response.text


def extract_metrics_tables(html: str) -> str:
    soup = bs4.BeautifulSoup(html, 'html.parser')

    anchors = soup.select("h3 a[href='#metrics']")
    assert len(anchors) == 1
    h3 = anchors[0].parent

    tables = []

    node = h3

    while True:
        node = node.next_sibling

        if node is None:
            break

        if isinstance(node, bs4.Tag):
            if node.name == 'h3':
                break
            if node.name == 'table':
                tables.append(str(node))

    return '\n'.join(tables)


def html_table_to_csv(html: str) -> str:
    soup = bs4.BeautifulSoup(html, 'html.parser')
    tables = soup.find_all('table')

    fieldnames = None
    output = io.StringIO()
    writer = csv.writer(output)

    for table in tables:
        head = [th.text.lower().replace(' ', '_') for th in table.select('thead tr th')]

        if fieldnames is None:
            fieldnames = head
            writer.writerow(fieldnames)
        else:
            assert head == fieldnames, f"Table headers mismatch: {head} (expected {fieldnames})"

        rows = [[td.text for td in tr.find_all("td")] for tr in table.select("tbody tr")]
        writer.writerows(rows)

    assert fieldnames is not None

    return output.getvalue()


def rfc_csv_to_metadata_csv(text: str, integration: str) -> str:
    reader = csv.DictReader(text.splitlines())
    assert set(reader.fieldnames).issubset({'name', 'type', 'unit', 'per_unit', 'description'}), reader.fieldnames

    output = io.StringIO()
    fields = (
        'metric_name,metric_type,interval,unit_name,per_unit_name,description,orientation,integration,short_name'
    ).split(',')
    writer = csv.DictWriter(output, delimiter=',', fieldnames=fields)
    writer.writeheader()

    for row in reader:
        row = {
            'metric_name': row['name'],
            'metric_type': row['type'],
            'interval': '',
            'unit_name': row['unit'],
            'per_unit_name': row.get('per_unit'),
            'description': row['description'],
            'orientation': '0',
            'integration': integration,
            'short_name': row['name'].replace(f'{integration}.', '').capitalize().replace('.', ' ').replace('_', ' ')
        }
        writer.writerow(row)

    return output.getvalue()


def main() -> None:
    text = sys.stdin.read()
    text = markdown_to_html(text)
    text = extract_metrics_tables(text)
    text = html_table_to_csv(text)
    text = rfc_csv_to_metadata_csv(text, integration='rethinkdb')
    print(text)


if __name__ == "__main__":
    main()
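As a sanity check, the per-row mapping performed by `rfc_csv_to_metadata_csv` can be exercised in isolation. The snippet below re-implements just that mapping as a standalone function; the sample metric name is made up for illustration, not taken from the RFC:

```python
# Standalone sketch of the per-row mapping in rfc_csv_to_metadata_csv.
# The sample metric below is illustrative only.
def rfc_row_to_metadata_row(row, integration):
    return {
        'metric_name': row['name'],
        'metric_type': row['type'],
        'interval': '',
        'unit_name': row['unit'],
        'per_unit_name': row.get('per_unit', ''),
        'description': row['description'],
        'orientation': '0',
        'integration': integration,
        # Strip the integration prefix, then turn dots/underscores into spaces.
        'short_name': (
            row['name']
            .replace(f'{integration}.', '')
            .capitalize()
            .replace('.', ' ')
            .replace('_', ' ')
        ),
    }

row = {
    'name': 'rethinkdb.stats.cluster.queries_per_sec',
    'type': 'gauge',
    'unit': 'query',
    'per_unit': 'second',
    'description': 'Queries per second across the cluster.',
}
meta = rfc_row_to_metadata_row(row, 'rethinkdb')
print(meta['short_name'])  # -> Stats cluster queries per sec
```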
  • dump_metrics(): wrap the stream of RethinkDB metrics to write submitted metrics to a CSV file
def dump_metrics(filename, metrics):
    # type: (str, Iterator[Metric]) -> Iterator[Metric]
    import csv

    seen = set()

    with open(filename, 'w') as f:
        writer = csv.DictWriter(f, fieldnames=['name', 'type'])
        writer.writeheader()

        for metric in metrics:
            name = metric['name']
            typ = metric['type']  # type: str

            if typ == 'monotonic_count':
                typ = 'count'

            row = {'name': name, 'type': typ}
            key = (name, typ)

            if typ != 'service_check' and key not in seen:
                writer.writerow(row)
                seen.add(key)

            yield metric
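To illustrate how this wrapper is meant to be used: the generator passes every metric through unchanged, so the check behaves normally, while a CSV of unique (name, type) pairs is written as a side effect. A minimal self-contained run, writing to an in-memory buffer instead of a file (metric names are made up):

```python
import csv
import io

# Same logic as dump_metrics above, but writing to an in-memory buffer so
# the sketch is self-contained. Sample metric names are illustrative.
def dump_metrics_to(buf, metrics):
    seen = set()
    writer = csv.DictWriter(buf, fieldnames=['name', 'type'])
    writer.writeheader()
    for metric in metrics:
        name, typ = metric['name'], metric['type']
        if typ == 'monotonic_count':
            typ = 'count'  # metadata.csv calls monotonic counts 'count'
        if typ != 'service_check' and (name, typ) not in seen:
            writer.writerow({'name': name, 'type': typ})
            seen.add((name, typ))
        yield metric  # pass every metric through unchanged

stream = [
    {'name': 'rethinkdb.connected_servers', 'type': 'gauge'},
    {'name': 'rethinkdb.queries.total', 'type': 'monotonic_count'},
    {'name': 'rethinkdb.connected_servers', 'type': 'gauge'},  # duplicate: not re-written
]
buf = io.StringIO()
consumed = list(dump_metrics_to(buf, iter(stream)))  # the check still sees all 3 metrics
rows = list(csv.DictReader(io.StringIO(buf.getvalue())))
```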
  • validate_metrics.py: compare metadata.csv with a dump of metrics generated when running a check
import csv
import sys


if __name__ == "__main__":
    metadata_dot_csv, metrics_dot_csv = sys.argv[1:3]

    with open(metadata_dot_csv) as f:
        reader = csv.DictReader(f)
        metadata_metrics = {(row['metric_name'], row['metric_type']) for row in reader}

    with open(metrics_dot_csv) as f:
        reader = csv.DictReader(f)
        metrics = {(row['name'], row['type']) for row in reader}

    assert metrics.issubset(metadata_metrics), metrics - metadata_metrics
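The invariant this script enforces is simply a set-subset check: every (name, type) pair submitted by the check must be declared in metadata.csv. A minimal in-memory sketch of that check (sample metric names are illustrative):

```python
# Minimal sketch of the invariant validate_metrics.py enforces: every
# (name, type) pair submitted by the check must be declared in metadata.csv.
def undeclared_metrics(metadata_pairs, submitted_pairs):
    """Return submitted pairs missing from metadata.csv."""
    return set(submitted_pairs) - set(metadata_pairs)

metadata = {
    ('rethinkdb.config.servers', 'gauge'),
    ('rethinkdb.stats.cluster.queries_per_sec', 'gauge'),
}
submitted = {('rethinkdb.config.servers', 'gauge')}

assert not undeclared_metrics(metadata, submitted)  # all declared: validation passes
missing = undeclared_metrics(metadata, submitted | {('rethinkdb.oops', 'count')})
```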

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • File changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ and integration/ labels attached

@codecov bot commented Feb 12, 2020

Codecov Report

Merging #5715 into master will not change coverage.
The diff coverage is n/a.

* Modifiers -> Transformers
* Add docs on `DocumentQuery` parameters and usage.
* Add and test an example script for `DocumentQuery`.
* Drop hard requirement for a logger on `query.run()`.
* Drop trace logs (too noisy to be debug logs).
@florimondmanca (Contributor, Author) commented:
@AlexandreYang Thanks, addressed your feedback, see details in 3b65f86 :-)

@AlexandreYang (Member) previously approved these changes Mar 26, 2020, commenting:
LGTM 👍, thx for all the changes :)

@ofek (Contributor) commented Mar 26, 2020
ofek commented Mar 26, 2020

lgtm but tests failing

AlexandreYang previously approved these changes Mar 26, 2020
l0k0ms
l0k0ms previously approved these changes Mar 27, 2020
@florimondmanca florimondmanca merged commit c38938d into master Mar 27, 2020
@florimondmanca florimondmanca deleted the florimondmanca/rethinkdb branch March 27, 2020 15:29
@florimondmanca florimondmanca mentioned this pull request Dec 4, 2020