Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add retry on Delete silence #5794

Merged
merged 1 commit into from
Mar 5, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@
* [ENHANCEMENT] Query Frontend/Scheduler: Time check in query priority now considers overall data select time window (including range selectors, modifiers and lookback delta). #5758
* [ENHANCEMENT] Querier: Added `querier.store-gateway-query-stats-enabled` to enable or disable store gateway query stats log. #5749
* [ENHANCEMENT] Upgrade to go 1.21.6. #5765
* [ENHANCEMENT] AlertManager: Retrying AlertManager Delete Silence on error #5794
* [BUGFIX] Distributor: Do not use label with empty values for sharding #5717
* [BUGFIX] Query Frontend: queries with negative offset should check whether it is cacheable or not. #5719
* [BUGFIX] Redis Cache: pass `cache_size` config correctly. #5734
Expand Down
4 changes: 2 additions & 2 deletions pkg/alertmanager/distributor.go
Original file line number Diff line number Diff line change
Expand Up @@ -247,14 +247,14 @@ func (d *Distributor) doUnary(userID string, w http.ResponseWriter, r *http.Requ
// we forward the request to only only of the alertmanagers.

var instances []ring.InstanceDesc
if req.GetMethod() == "GET" {
if req.GetMethod() == "GET" || req.GetMethod() == "DELETE" {
instances = replicationSet.Instances
// Randomize the list of instances to not always query the same one.
rand.Shuffle(len(instances), func(i, j int) {
instances[i], instances[j] = instances[j], instances[i]
})
} else {
//Picking 1 instance at Random for Non-Get Unary Read requests, as shuffling through large number of instances might increase complexity
//Picking 1 instance at Random for Non-Get and Non-Delete Unary Read requests, as shuffling through large number of instances might increase complexity
randN := rand.Intn(len(replicationSet.Instances))
instances = replicationSet.Instances[randN : randN+1]
}
Expand Down
18 changes: 18 additions & 0 deletions pkg/alertmanager/distributor_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -200,6 +200,24 @@ func TestDistributor_DistributeRequest(t *testing.T) {
expStatusCode: http.StatusOK,
expectedTotalCalls: 1,
route: "/silence/id",
}, {
name: "Delete /silence/id should try all alert managers on error",
numAM: 3,
numHappyAM: 0,
replicationFactor: 3,
isDelete: true,
expStatusCode: http.StatusInternalServerError,
expectedTotalCalls: 3,
route: "/silence/id",
}, {
name: "Delete /silence/id is sent to 3 AM when 2 are not happy",
numAM: 3,
numHappyAM: 1,
replicationFactor: 3,
isDelete: true,
expStatusCode: http.StatusOK,
expectedTotalCalls: 3,
route: "/silence/id",
}, {
name: "Read /status is sent to only 1 AM",
numAM: 5,
Expand Down
Loading