Support Escalations #2

dominikschulz · 2015-06-10T13:33:51Z

Please consider supporting groups of people in which notifications may be escalated.

This would allow several people to be on call for a given topic. If one fails to respond to the notifications, the next one would be notified, and so on.

It should be pretty easy to implement by adding a team API and wrapping the notification loop in another loop iterating over the team members.

chrissnell · 2015-06-10T19:34:18Z

This is definitely on the to-do list and near the top. Looking at the model, you have a rotation that's defined by:

type of shift:
- full day shift (rotation no sooner than 24 hours after shift start)
  - frequency of rotation (e.g. 1 week)
  - day and time of rotation (e.g. every Monday at 0800)
- partial day shift (for rotations < 24 hours apart)
  - frequency of rotation (e.g. 12 hours)
- something else (user A covers weekdays, user B covers weekends)

Separate from the rotation itself, you have an escalation policy that defines what to do if the primary doesn't answer:

- Escalate to a backup (if so, who?)
- Page everyone
- Execute webhook
- Send email

Also need to figure out when you escalate. This should probably be selectable:

- After N steps of the notification plan fail to get a response
- After all steps of a notification plan have been executed at least once with no response
- After N minutes

dominikschulz · 2015-06-11T15:54:10Z

When talking about on-call rotations and escalations, people tend to get really creative. I'd suggest keeping things simple and first introducing a team API. A team would have a name and an (ordered) list of members (people).

This team API would provide a notify endpoint like the people API does. Triggering this endpoint would signal each member of the team in order until someone acknowledges the notification (hopefully the first one).

This way teams would still be optional and rather flexible. When implementing https://github.com/dominikschulz/Monitoring-Spooler we chose to not handle automatic shift changes. Instead we let the new operator taking over the shift initiate the change. This way the whole "type of shift" logic would be unnecessary.

chrissnell · 2015-06-18T06:48:27Z

So now we have a Team API. Now I need to figure out how I want to implement the notification engine for teams. It will probably run in its own goroutine similar to StartNotificationEngine() in notification.go. The current StartNotificationEngine() function may get renamed to something like StartPeopleNotificationEngine() and then we have a StartTeamNotificationEngine() which takes notification requests for teams and notifies/escalates as appropriate.

dominikschulz · 2015-06-18T13:27:48Z

I am currently looking into different ways of implementing the team notifications. I will open up another PR when and if I have something that looks good to me.

chrissnell · 2015-06-18T18:46:28Z

I'm going to work on the implementation tonight but feel free to send PRs if you want. I had a thought last night: the JSON that is submitted for team creation is huge and ugly because we are including the rotation and escalation details in there. I think that these should probably be broken out into their own separate APIs (like I did for notification steps). So, we have the Team struct (minus the Rotation and EscalationSteps struct members) and then a Rotation struct and an EscalationSteps struct. They could each have their own API if it makes sense. It's a lot of extra code and docs but it may simplify things from the client perspective if we avoid having the client create some massive, multi-level JSON struct to set up teams and rotations.

chrissnell · 2015-06-19T03:55:48Z

I think that StartNotificationEngine() could be enhanced to watch for team notifications in addition to the person notifications that it currently gets through planChan. We would add an additional case to the select statement to receive team notification events. When this case fires, it would act similarly to the <-planChan case: it would set up a stopper channel and launch a goroutine to handle the team's notification/escalation plan.
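
Roughly what that select loop might look like. This is a sketch, not the real StartNotificationEngine(): only planChan is a name from the existing code, while teamChan, the request types, the done channel and the logf callback are assumptions added so the example runs on its own.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical request types for the two kinds of notification.
type personRequest struct{ Person string }
type teamRequest struct{ Team string }

// startEngine sketches the select loop with an extra case for teams.
// Each request gets its own stopper channel and handler goroutine.
func startEngine(planChan <-chan personRequest, teamChan <-chan teamRequest,
	done <-chan struct{}, logf func(string)) {
	var wg sync.WaitGroup
	stoppers := map[string]chan struct{}{}
	launch := func(key string) {
		if _, running := stoppers[key]; running {
			return // already being handled
		}
		stop := make(chan struct{})
		stoppers[key] = stop
		wg.Add(1)
		go func() {
			defer wg.Done()
			logf("started " + key)
			<-stop // a real handler would run its plan until stopped
		}()
	}
	for {
		select {
		case r := <-planChan:
			launch("person:" + r.Person)
		case r := <-teamChan: // the additional case for team notifications
			launch("team:" + r.Team)
		case <-done:
			for _, stop := range stoppers {
				close(stop)
			}
			wg.Wait()
			return
		}
	}
}

// runDemo sends one person and one team request and returns how many
// handler goroutines were started.
func runDemo() int {
	planChan := make(chan personRequest)
	teamChan := make(chan teamRequest)
	done := make(chan struct{})
	var mu sync.Mutex
	var events []string
	logf := func(s string) {
		mu.Lock()
		defer mu.Unlock()
		events = append(events, s)
	}
	finished := make(chan struct{})
	go func() {
		startEngine(planChan, teamChan, done, logf)
		close(finished)
	}()
	planChan <- personRequest{Person: "alice"}
	teamChan <- teamRequest{Team: "ops"}
	close(done)
	<-finished
	return len(events)
}

func main() {
	fmt.Println(runDemo())
}
```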

chrissnell · 2015-06-19T05:05:59Z

So I'm in the middle of breaking out escalations and rotations into their own APIs. ~~I will push the new branch work-in-progress to GH shortly.~~ New branch here.

My new branch introduces the concept of an EscalationPlan, which is very similar to a Person's NotificationPlan, except that it applies to teams. One thing we need to think about is the interaction between the execution of the escalation plan and the individual notification plans. Right now, Chicken Little notification plans just execute until acknowledged. Soon, we will add team escalation plans, which constitute a series of individual notification plans. We will need a way for those individual notifications to signal up to the team escalation plan to stop because somebody has acknowledged the alert.

I propose that we assign team alerts a UUID just like we assign them to individual alerts. When a team escalation executes an individual alert as part of the escalation plan, the team alert UUID will be passed as a parameter to the individual alert. If the individual acknowledges their alert, the individual alert executor sends the team alert UUID back to the notification engine so that the team alert can be stopped.
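
To make the UUID hand-off concrete, here is a small sketch. The ack and engine types and their methods are invented for illustration, and it is single-goroutine on purpose; in the real engine the map would have to be touched only from the engine's own loop or be guarded by a lock.

```go
package main

import "fmt"

// ack is what an individual alert executor reports back (hypothetical type).
type ack struct {
	AlertUUID string // the individual alert that was acknowledged
	TeamUUID  string // empty for stand-alone person alerts
}

// engine tracks the stopper channel of each running team escalation.
type engine struct {
	teamStoppers map[string]chan struct{}
}

// startTeamAlert registers a team escalation and returns its stopper.
// The escalation goroutine would pass teamUUID to every individual alert it starts.
func (e *engine) startTeamAlert(teamUUID string) <-chan struct{} {
	stop := make(chan struct{})
	e.teamStoppers[teamUUID] = stop
	return stop
}

// handleAck stops the parent team escalation, if the alert has one.
// It reports whether a team escalation was stopped.
func (e *engine) handleAck(a ack) bool {
	stop, ok := e.teamStoppers[a.TeamUUID]
	if a.TeamUUID == "" || !ok {
		return false
	}
	close(stop)
	delete(e.teamStoppers, a.TeamUUID)
	return true
}

func main() {
	e := &engine{teamStoppers: map[string]chan struct{}{}}
	stop := e.startTeamAlert("team-uuid-1")
	stopped := e.handleAck(ack{AlertUUID: "alert-uuid-7", TeamUUID: "team-uuid-1"})
	<-stop // returns immediately because the channel was closed
	fmt.Println(stopped, e.handleAck(ack{AlertUUID: "alert-uuid-8"}))
}
```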

dominikschulz · 2015-07-16T08:31:35Z

With PR #13 this branch should be more or less feature complete (wrt. team escalations).

@chrissnell While there are still some missing tests and documentation, I'd like to hear your opinion on the current state of this branch. Is anything important missing, and what state should we aim for before merging this into master?

dominikschulz · 2015-08-14T05:06:44Z

@chrissnell Now that everything I feel is necessary is in the teamescalation branch, I'd like to start discussing what is still missing before we merge this into master.

chrissnell · 2015-08-29T22:21:59Z

Sorry it's taken me so long to respond. I've been traveling and then work got busy. If you're ready to merge this, please do; you have commit rights. We do need some documentation, however, on how teams and team escalation work. I also want to start work on a "Quick Start" guide, but that's not necessary to merge your branch.

dominikschulz · 2015-08-30T19:19:28Z

Never mind, I just wanted to wait for feedback before merging this into master.

Unless I find any blocking issues in the next few days, I'll merge the teamescalation branch into master.

chrissnell · 2015-10-18T21:53:24Z

Hi Dominik, do you still want to merge this?

dominikschulz · 2016-09-14T15:19:17Z

Hi, I was pretty busy lately. Right now I can't say if I can put any more effort into this project or not.

If it's OK for you, I'd suggest leaving this PR open until I make up my mind. Otherwise feel free to close it.

dominikschulz mentioned this issue Jun 11, 2015

Add team API #5

Merged

dominikschulz mentioned this issue Jun 21, 2015

Refactor Notification Engine for Escalations #9

Closed

dominikschulz added a commit to dominikschulz/chickenlittle that referenced this issue Jul 1, 2015

Remove teams

a7a03be

As discussed in chrissnell#2 this commit removes the team logic from master until it's completely finished and gets merged again.

dominikschulz mentioned this issue Jul 1, 2015

Remove teams #11

Merged

dominikschulz mentioned this issue Jul 16, 2015

Refactor controller #13

Merged
