Add coredns prometheus health and ksm cronjob sc (#3903)
* Add coredns prometheus health and ksm cronjob sc

* Add service checks in READMEs

* Markdown for critical - coredns
David Bouchare authored and FlorianVeaux committed Jun 18, 2019
1 parent 8827b8d commit c17f207
Showing 4 changed files with 27 additions and 2 deletions.
4 changes: 3 additions & 1 deletion coredns/README.md
@@ -46,7 +46,9 @@ The CoreDNS check does not include any events.

### Service Checks

-The CoreDNS check does not include any service checks.
+`coredns.prometheus.health`:
+
+Returns `CRITICAL` if the Agent cannot reach the metrics endpoints.

## Troubleshooting

12 changes: 11 additions & 1 deletion coredns/assets/service_checks.json
@@ -1 +1,11 @@
-[]
+[
+{
+"agent_version": "6.11.0",
+"integration":"coredns",
+"check": "coredns.prometheus.health",
+"statuses": ["ok", "critical"],
+"groups": ["endpoint"],
+"name": "CoreDNS prometheus health",
+"description": "Returns `CRITICAL` if the check cannot access the metrics endpoint. Returns `OK` otherwise."
+}
+]
4 changes: 4 additions & 0 deletions kubernetes_state/README.md
@@ -53,6 +53,10 @@ Returns `OK` otherwise.
Returns `CRITICAL` if a cluster node is in a network unavailable state.
Returns `OK` otherwise.

+**kubernetes_state.cronjob.next_schedule_time**
+Returns `CRITICAL` if a cron job does not have a next scheduled time for execution.
+Returns `OK` otherwise.
+
## Troubleshooting
Need help? Contact [Datadog support][6].

9 changes: 9 additions & 0 deletions kubernetes_state/assets/service_checks.json
@@ -43,5 +43,14 @@
"groups": ["host", "node"],
"name": "Node Network Unavailable",
"description": "Returns `CRITICAL` if a cluster node is in a network unavailable state. Returns `UNKNOWN` if status is unknown. Returns `OK` otherwise."
+},
+{
+"agent_version": "5.6.0",
+"integration":"kubernetes",
+"check": "kubernetes_state.cronjob.next_schedule_time",
+"statuses": ["ok", "unknown", "critical"],
+"groups": ["host", "node"],
+"name": "CronJob next scheduled time",
+"description": "Returns `CRITICAL` if a cron job does not have a next scheduled time for execution. Returns `UNKNOWN` if the scheduled time is unknown. Returns `OK` otherwise."
}
]
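
The entries added to both `service_checks.json` files share the same seven keys. A minimal Python sketch of a sanity check for such entries follows. It is not part of this commit: the helper name `find_problems`, the "required" key set, and the list of known statuses are assumptions inferred only from the entries shown in this diff, not from an official schema.

```python
import json

# Keys observed in the entries of this diff; treating them as required is an assumption.
EXPECTED_KEYS = {
    "agent_version", "integration", "check",
    "statuses", "groups", "name", "description",
}
# Only the status names that appear in this diff.
KNOWN_STATUSES = {"ok", "critical", "unknown"}


def find_problems(entries):
    """Return a list of human-readable problems found in service check entries."""
    problems = []
    for index, entry in enumerate(entries):
        label = entry.get("check", f"entry #{index}")
        missing = EXPECTED_KEYS - entry.keys()
        if missing:
            problems.append(f"{label}: missing keys {sorted(missing)}")
        unexpected = set(entry.get("statuses", [])) - KNOWN_STATUSES
        if unexpected:
            problems.append(f"{label}: unexpected statuses {sorted(unexpected)}")
    return problems


# The CoreDNS entry from this commit, used as sample input.
SAMPLE = json.loads("""
[
  {
    "agent_version": "6.11.0",
    "integration": "coredns",
    "check": "coredns.prometheus.health",
    "statuses": ["ok", "critical"],
    "groups": ["endpoint"],
    "name": "CoreDNS prometheus health",
    "description": "Returns `CRITICAL` if the check cannot access the metrics endpoint. Returns `OK` otherwise."
  }
]
""")

if __name__ == "__main__":
    for problem in find_problems(SAMPLE):
        print(problem)
```

Running the script on the sample prints nothing, since the entry has every expected key and only known statuses.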
