Support Escalations #2
Comments
This is definitely on the to-do list and near the top. Looking at the model, you have a rotation that's defined by:
Separate from the rotation itself, you have an escalation policy that defines what to do if the primary doesn't answer:
Also need to figure out when you escalate. This should probably be selectable:
When talking about on-call rotations and escalations, people tend to get really creative. I'd suggest keeping things simple and first introducing a team API. A team would have a name and an (ordered) list of members (people). This team API would provide a notify endpoint like the people API does. Triggering this endpoint would signal each member of the team in order until someone acknowledges the notification (hopefully the first one). This way teams would still be optional and rather flexible. When implementing https://github.com/dominikschulz/Monitoring-Spooler we chose to not handle automatic shift changes. Instead we let the new operator taking over the shift initiate the change. This way the whole "type of shift" logic would be unnecessary.
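To make the proposal concrete, here is a minimal sketch of the "notify members in order until someone acknowledges" idea. The `Team` type, the `notifyFunc` callback, and the `Notify` method are illustrative names assumed for this example, not the project's actual API; the callback stands in for whatever the existing per-person notify call does.

```go
package main

import "fmt"

// Team is a hypothetical shape for the proposed team resource:
// a name plus an ordered list of member names.
type Team struct {
	Name    string
	Members []string
}

// notifyFunc is an assumed stand-in for the existing per-person notify call.
// It reports whether that person acknowledged the notification.
type notifyFunc func(person, message string) bool

// Notify signals each member in order and stops at the first acknowledgement.
// It returns the member who acknowledged, or "" if nobody did.
func (t Team) Notify(message string, notify notifyFunc) string {
	for _, member := range t.Members {
		if notify(member, message) {
			return member
		}
	}
	return ""
}

func main() {
	team := Team{Name: "ops", Members: []string{"alice", "bob", "carol"}}
	// Pretend that only bob answers.
	acked := team.Notify("disk full", func(person, message string) bool {
		fmt.Printf("notifying %s: %s\n", person, message)
		return person == "bob"
	})
	fmt.Println("acknowledged by:", acked)
}
```

Because the member list is just an ordered slice, a manual shift change amounts to reordering it, which fits the "no automatic shift changes" approach described above.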
So now we have a Team API. Now I need to figure out how I want to implement the notification engine for teams. It will probably run in its own goroutine similar to StartNotificationEngine() in notification.go. The current StartNotificationEngine() function may get renamed to something like StartPeopleNotificationEngine() and then we have a StartTeamNotificationEngine() which takes notification requests for teams and notifies/escalates as appropriate.
I am currently looking into different ways of implementing the team notifications. I will open up another PR when and if I have something that looks good to me.
I'm going to work on the implementation tonight but feel free to send PRs if you want. I had a thought last night: the JSON that is submitted for team creation is huge and ugly because we are including the rotation and escalation details in there. I think that these should probably be broken out into their own separate APIs (like I did for notification steps). So, we have the Team struct (minus the Rotation and EscalationSteps struct members) and then a Rotation struct and an EscalationSteps struct. They could each have their own API if it makes sense. It's a lot of extra code and docs, but it may simplify things from the client perspective if we avoid making the client create some massive, multi-level JSON struct to set up teams and rotations.
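As a rough illustration of that split, the sketch below models the three resources as flat structs that refer to the team by name, so each can be submitted as its own small JSON document. Every field name and JSON key here is a hypothetical placeholder chosen for the example; the thread does not specify what a rotation or an escalation step actually contains.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Team carries only identity and membership (hypothetical fields).
type Team struct {
	Name    string   `json:"name"`
	Members []string `json:"members"`
}

// Rotation refers to its team by name instead of being nested inside it.
// FrequencyDays is an invented placeholder field.
type Rotation struct {
	Team          string `json:"team"`
	FrequencyDays int    `json:"frequency_days"`
}

// EscalationStep is likewise a separate, flat resource.
// Target and DelayMinutes are invented placeholder fields.
type EscalationStep struct {
	Team         string `json:"team"`
	Target       string `json:"target"`
	DelayMinutes int    `json:"delay_minutes"`
}

// encode renders a value as compact JSON for display.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Three small request bodies instead of one deeply nested document.
	fmt.Println(encode(Team{Name: "ops", Members: []string{"alice", "bob"}}))
	fmt.Println(encode(Rotation{Team: "ops", FrequencyDays: 7}))
	fmt.Println(encode(EscalationStep{Team: "ops", Target: "bob", DelayMinutes: 10}))
}
```

The trade-off is the one noted above: more endpoints and documentation on the server side in exchange for simpler client payloads.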
I think that StartNotificationEngine() could be enhanced to watch for team notifications in addition to the person notifications that it currently gets through planChan. We would add an additional case statement to receive team notification events. When this case was called, it would act similarly to the <-planChan case: it would set up a stopper channel and launch a goroutine to handle the team's notification/escalation plan.
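A minimal sketch of that idea follows. Only `planChan` and the stopper-channel pattern come from the description above; `runEngine`, `teamChan`, `ackChan`, `stopped`, `quit`, and the two plan types are assumed names for this example and are not taken from notification.go.

```go
package main

import "fmt"

// personPlan and teamPlan are hypothetical request types for this sketch.
type personPlan struct{ ID, Person string }
type teamPlan struct{ ID, Team string }

// runEngine selects over person and team requests. Each request gets its own
// stopper channel and worker goroutine; an ID arriving on ackChan stops the
// matching worker, which then reports its label on stopped.
func runEngine(planChan <-chan personPlan, teamChan <-chan teamPlan,
	ackChan <-chan string, stopped chan<- string, quit <-chan struct{}) {
	stoppers := make(map[string]chan struct{})
	launch := func(id, label string) {
		stopper := make(chan struct{})
		stoppers[id] = stopper
		go func() {
			// A real worker would step through its plan here until stopped.
			<-stopper
			stopped <- label
		}()
	}
	for {
		select {
		case p := <-planChan:
			launch(p.ID, "person:"+p.Person)
		case t := <-teamChan: // the additional case for team notifications
			launch(t.ID, "team:"+t.Team)
		case id := <-ackChan:
			if s, ok := stoppers[id]; ok {
				close(s)
				delete(stoppers, id)
			}
		case <-quit:
			return
		}
	}
}

func main() {
	planChan := make(chan personPlan)
	teamChan := make(chan teamPlan)
	ackChan := make(chan string)
	stopped := make(chan string, 2)
	quit := make(chan struct{})
	go runEngine(planChan, teamChan, ackChan, stopped, quit)

	planChan <- personPlan{ID: "p1", Person: "alice"}
	teamChan <- teamPlan{ID: "t1", Team: "ops"}
	ackChan <- "t1"
	fmt.Println("stopped:", <-stopped)
	ackChan <- "p1"
	fmt.Println("stopped:", <-stopped)
	close(quit)
}
```

Keeping both kinds of request in one select loop means the map of stoppers has a single owner, so no locking is needed around it.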
So I'm in the middle of breaking out escalations and rotations into their own APIs. My new branch introduces the concept of an EscalationPlan, which is very similar to a Person's NotificationPlan, except that it applies to teams. One thing we need to think about is the interaction between the execution of the escalation plan and the individual notification plans. Right now, Chicken Little notification plans just execute until acknowledged. Soon, we will add team escalation plans, which constitute a series of individual notification plans. We will need a way for those individual notifications to signal up to the team escalation plan to stop because somebody has acknowledged the alert. I propose that we assign team alerts a UUID just like we assign them to individual alerts. When a team escalation executes an individual alert as part of the escalation plan, the team alert UUID will be passed as a parameter to the individual alert. If the individual acknowledges their alert, the individual alert executor sends the team alert UUID back to the notification engine so that the team alert can be stopped.
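The signalling described here could look roughly like the sketch below. It is deliberately synchronous and simplified: UUIDs are plain strings, the `acked` callback is a stand-in for a person's real notification plan, and `notifyPerson`, `escalate`, and `teamAcks` are names assumed for this example rather than anything in the branch.

```go
package main

import "fmt"

// notifyPerson runs one person's (stubbed) notification plan. teamUUID is
// empty for a standalone alert. If the person acknowledges and the alert
// belongs to a team escalation, the team UUID is sent back on teamAcks.
func notifyPerson(person, teamUUID string, acked func(string) bool,
	teamAcks chan<- string) bool {
	if !acked(person) {
		return false
	}
	if teamUUID != "" {
		teamAcks <- teamUUID
	}
	return true
}

// escalate walks the members in order, passing the team alert UUID down to
// each individual alert, and stops as soon as that UUID comes back.
// It returns who acknowledged and how many people were notified.
func escalate(teamUUID string, members []string,
	acked func(string) bool) (string, int) {
	teamAcks := make(chan string, 1) // buffered: at most one pending ack
	for i, member := range members {
		notifyPerson(member, teamUUID, acked, teamAcks)
		select {
		case id := <-teamAcks:
			if id == teamUUID {
				return member, i + 1
			}
		default:
			// No acknowledgement; move on to the next member.
		}
	}
	return "", len(members)
}

func main() {
	members := []string{"alice", "bob", "carol"}
	who, tried := escalate("team-alert-1", members, func(p string) bool {
		return p == "bob" // pretend that only bob acknowledges
	})
	fmt.Printf("acknowledged by %s after %d notification(s)\n", who, tried)
}
```

In the real engine the individual alerts run in their own goroutines, so the acknowledgement would arrive asynchronously through the notification engine rather than being polled right after each call.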
As discussed in chrissnell#2 this commit removes the team logic from master until it's completely finished and gets merged again.
With PR #13 this branch should be more or less feature complete (wrt. team escalations). @chrissnell While there are still some missing tests and documentation, I'd like to hear your opinion on the current state of this branch. Is anything important missing, and what state should we aim for before merging this into master?
@chrissnell With everything I feel is necessary now in the teamescalation branch, I'd like to start discussing what is missing before we can merge it into master.
Sorry it's taken me so long to respond. I've been traveling and then work got busy. If you're ready to merge this, please do--you have commit rights. We do need some documentation, however, on how teams and team escalation work. I also want to start work on a "Quick Start" guide, but that's not necessary to merge your branch.
Never mind, I just wanted to wait for feedback before merging this into master. Unless I find any blocking issues in the next few days, I'll merge the teamescalation branch to master.
Hi Dominik, do you still want to merge this?
Hi, I was pretty busy lately. Right now I can't say whether I can put any more effort into this project or not. If it's OK for you, I'd suggest leaving this PR open until I make up my mind. Otherwise feel free to close it.
Please consider supporting groups of people in which notifications may be escalated.
This would allow several people to be on call for a given topic. If one fails to respond to the notifications the next one would be notified and so on.
It should be pretty easy to implement by adding a team API and wrapping the notification loop in another loop iterating over the team members.