Hellper - Your best friend in times of crisis

Hellper bot aims to orchestrate the handling and resolution of incidents, reducing the time spent on manual tasks and ensuring that the necessary steps are completed in the right order. It also makes it easier to measure impact and response rate through metrics.

A chance to help explore and develop a bot written in Go, integrated with multiple external platforms and tools.

Help us expand incident processes and understand the needs of other companies that may benefit from Hellper bot.

You’re just one PR away from joining the Hellper development team! Contribute




Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

  1. Docker Compose
  2. Slack Account
  3. G Suite Account

Installing

  1. Clone this repo
     git clone git@github.com:ResultadosDigitais/hellper.git
  2. Configure Slack
  3. Configure Google
  4. Make a copy of the configuration example
     cp development.env.example development.env

Variables explanation

| Variable | Explanation | Default value |
| --- | --- | --- |
| HELLPER_BIND_ADDRESS | Hellper local bind address | :8080 |
| HELLPER_DATABASE | Database provider (supported values: postgres) | postgres |
| HELLPER_DSN | Your Data Source Name | --- |
| HELLPER_ENVIRONMENT | Current environment (supported values: production, staging) | --- |
| HELLPER_GOOGLE_CREDENTIALS | Google Credentials | --- |
| HELLPER_GOOGLE_DRIVE_TOKEN | Google Drive Token | |
| HELLPER_GOOGLE_DRIVE_FILE_ID | Google Drive FileId of your post-mortem template | --- |
| HELLPER_GOOGLE_CALENDAR_TOKEN | Google Calendar Token | |
| HELLPER_GOOGLE_CALENDAR_ID | Google Calendar Id used to schedule your post-mortem | |
| HELLPER_POSTMORTEM_GAP_DAYS | Gap in days between incident resolution and the post-mortem event; defaults to 5 days when the variable is unset | 5 |
| HELLPER_MATRIX_HOST | Matrix URL host | --- |
| HELLPER_PRODUCT_CHANNEL_ID | The Product channel id used to notify new incidents | --- |
| HELLPER_NOTIFY_ON_RESOLVE | Notify the Product channel when the incident is resolved | true |
| HELLPER_NOTIFY_ON_CLOSE | Notify the Product channel when the incident is closed | true |
| HELLPER_NOTIFY_ON_CANCEL | Notify the Product channel when the incident is canceled | true |
| HELLPER_SUPPORT_TEAM | Support team identifier to notify | --- |
| HELLPER_PRODUCT_LIST | List of all products, separated by semicolons | Product A;Product B;Product C;Product D |
| HELLPER_REMINDER_OPEN_STATUS_SECONDS | Time in seconds before the status reminder is triggered for open incidents; defaults to 2 hours when the variable is unset | 7200 |
| HELLPER_REMINDER_RESOLVED_STATUS_SECONDS | Time in seconds before the status reminder is triggered for resolved incidents; defaults to 24 hours when the variable is unset | 86400 |
| HELLPER_REMINDER_OPEN_NOTIFY_MSG | Notify message when status is open | Incident Status: Open - Update the status of this incident, just pin a message with status on the channel. |
| HELLPER_REMINDER_RESOLVED_NOTIFY_MSG | Notify message when status is resolved | Incident Status: Resolved - Update the status of this incident, just pin a message with status on the channel. |
| HELLPER_OAUTH_TOKEN | Slack token to execute bot user actions | --- |
| HELLPER_SLACK_SIGNING_SECRET | Slack token to verify external requests | --- |
| FILE_STORAGE | Hellper file storage for the post-mortem document | google_drive |
| TIMEZONE | Timezone for the post-mortem meeting | America/Sao_Paulo |
| HELLPER_SLA_HOURS_TO_CLOSE | Number of hours between the incident resolution and the Hellper reminder to close the incident | 168 |

Running the Tests

  1. make test

Running the application

  1. make run

Deployment

Deploy

Setup database

  • Run this command and copy the address:

heroku config:get DATABASE_URL

  • Run this command, replacing YOUR_DATABASE_URL with the address you copied:

heroku config:set HELLPER_DSN=YOUR_DATABASE_URL

  • Import the schema, replacing YOUR_HEROKU_APP_NAME with your application name:

heroku pg:psql --app YOUR_HEROKU_APP_NAME < internal/model/sql/postgres/schema/hellper.sql

Optional Setup

Ngrok (To receive events from Slack)

Golang

OR

  1. Install gvm
  2. Follow gvm post install instructions
  3. Install go 1.14 as default

Database

psql $HELLPER_DSN -f "./internal/model/sql/postgres/schema/hellper.sql"

How to use

Commands

After configuring Slack, you can use the commands you created. The commands are as follows:

| Command | Short Description |
| --- | --- |
| /hellper_incident | Starts Incident |
| /hellper_status | Show all pinned messages |
| /hellper_close | Closes Incident |
| /hellper_resolve | Resolves Incident |
| /hellper_cancel | Cancels Incident |
| /hellper_pause_notify | Pauses incident notification |
| /hellper_update_dates | Updates the dates for an incident |

The first command, /hellper_incident, can be used in any channel or conversation on Slack. It opens a pop-up where the user sets up and starts an incident, which creates the channel, the meeting room link, and the post-mortem doc.

The remaining commands must be used only on the Incident's channel since they act on the specific incident that is open.

Metrics

These metrics come from the metrics view table and are calculated with the following formulas:

| Metric | Description | Formula |
| --- | --- | --- |
| start_ts | Date and time when the incident is started | Date and time in UTC from db |
| identification_ts | Date and time when the incident is identified | Date and time in UTC from db |
| end_ts | Date and time when the incident is resolved | Date and time in UTC from db |
| acknowledgetime | Time To Acknowledge | identification_ts - start_ts |
| solutiontime | Time To Solution | end_ts - identification_ts |
| downtime | Time in an incident | end_ts - start_ts |
| MTTA | Mean Time To Acknowledge | total acknowledgetime / total incidents |
| MTTS | Mean Time To Solution | total solutiontime / total incidents |
| MTTR | Mean Time To Recovery | total downtime / total incidents |

Alerts

Alerts are useful for notifying the status of incidents. Notifications can be used in different situations, such as requesting an update of the incident status or finalizing the post-mortem. To send them, run the notify CLI from your cron job service.

To use the CLI you need to build the binary file:

go build -o notify cmd/notify/main.go

Example of use

SHELL=/bin/bash
BASH_ENV=/app/.env

# At 16:30 on every week-day, from Monday through Friday, it sends a report to a selected channel with all incidents not closed
30 16 * * 1-5 root /app/notify --type=report --to=YOUR_SLACK_CHANNEL_ID --status=all

# Every 30 minutes it sends a status update request alert for all open incidents
*/30 * * * * root /app/notify --type=channels --status=open

# At 13:30 on every week-day, from Monday through Friday, sends a post-mortem request alert for all resolved incidents
30 13 * * 1-5 root /app/notify --type=channels --status=resolved
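
For readers curious how flags like the ones in the crontab entries above could be read, here is a rough Go sketch using the standard `flag` package. It is not the code in cmd/notify/main.go: `notifyOptions` and `parseNotifyArgs` are hypothetical names, and the rule that `--to` is required for `--type=report` is an assumption drawn only from the examples above.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
)

// notifyOptions mirrors the three flags used in the crontab examples.
// Hypothetical type for illustration only.
type notifyOptions struct {
	kind   string // --type, e.g. "report" or "channels"
	to     string // --to, a Slack channel id
	status string // --status, e.g. "all", "open" or "resolved"
}

// parseNotifyArgs parses the arguments into notifyOptions.
func parseNotifyArgs(args []string) (notifyOptions, error) {
	var o notifyOptions
	fs := flag.NewFlagSet("notify", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr in this sketch
	fs.StringVar(&o.kind, "type", "", "kind of notification")
	fs.StringVar(&o.to, "to", "", "target Slack channel id")
	fs.StringVar(&o.status, "status", "", "incident status filter")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	// Assumption: a report needs a destination channel, as in the first example.
	if o.kind == "report" && o.to == "" {
		return o, errors.New("--to is required when --type=report")
	}
	return o, nil
}

func main() {
	opts, err := parseNotifyArgs([]string{"--type=report", "--to=YOUR_SLACK_CHANNEL_ID", "--status=all"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", opts)
}
```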

Contributing

Thanks for being interested in contributing! We’re so glad you want to help! Please take a little bit of your time and look at our contributing guidelines. All types of contributions are welcome, such as bug fixes, issues or feature requests.

Code of Conduct

Everyone interacting in the Hellper project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.

Need help?

If you need help with Hellper, feel free to open an issue with a description of the problem you're facing.

License

Hellper is available as open source under the terms of the MIT License.