-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
balancer: rewrite the consistent hashring balancer to avoid recomputations #1310
Conversation
47996e3
to
1ca1002
Compare
1ca1002
to
53c6a19
Compare
53c6a19
to
23c2ed8
Compare
b79aac4
to
36d5e41
Compare
36d5e41
to
501590b
Compare
ReplicationFactor: 100, | ||
Spread: 1, | ||
} | ||
var DefaultBalancerServiceConfigJSON = defaultBalancerServiceConfig.MustToServiceConfigJSON() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc comments on exported stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we find a lint rule for this? I feel like we started it because of lint rules but no linter complains anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the config for golangci-lint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pkg/balancer/hashring.go
Outdated
// Initialize picker to a picker that always returns | ||
// ErrNoSubConnAvailable, because when state of a SubConn changes, we | ||
// may call UpdateState with this picker. | ||
bal.picker = base.NewErrPicker(balancer.ErrNoSubConnAvailable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this done this way vs just doing so in the struct above?
pkg/balancer/hashring.go
Outdated
logger.Infof("parsed balancer config %s", js) | ||
|
||
if lbCfg.ReplicationFactor == 0 { | ||
lbCfg.ReplicationFactor = 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have these defaults be in constants?
if len(info.ReadySCs) == 0 { | ||
return base.NewErrPicker(balancer.ErrNoSubConnAvailable) | ||
func (b *ConsistentHashringBalancer) UpdateClientConnState(s balancer.ClientConnState) error { | ||
if logger.V(2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we do this kind of log level check anywhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is using the grpc logging framework, not logrus
logger.Infof("consistentHashringPicker: Build called with info: %v", info) | ||
if len(info.ReadySCs) == 0 { | ||
return base.NewErrPicker(balancer.ErrNoSubConnAvailable) | ||
func (b *ConsistentHashringBalancer) UpdateClientConnState(s balancer.ClientConnState) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a few more comments describing the workflow here?
pkg/balancer/hashring.go
Outdated
b.cc.UpdateState(balancer.State{ConnectivityState: b.state, Picker: b.picker}) | ||
} | ||
|
||
// Close is a nop because base balancer doesn't have internal state to clean up, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no-op
// Note: this is testing picker behavior and not the hashring | ||
// behavior itself, see `pkg/consistent` for tests of the hashring. | ||
func TestConsistentHashringPickerPick(t *testing.T) { | ||
rnd := rand.New(rand.NewSource(1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment here that we are pinning the seed for below
pkg/balancer/hashring_test.go
Outdated
t.Errorf("Pick() error = %v, wantErr %v", err, tt.wantErr) | ||
return | ||
} | ||
if !reflect.DeepEqual(got, tt.want) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t.True(reflect.DeepEqual(got, tt.want), "Pick() got ...")
?
wantErr bool | ||
}{ | ||
{ | ||
name: "sets rf and spread", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only one test case in a table-driven test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added some more
pkg/balancer/hashring_test.go
Outdated
Spread: tt.spread, | ||
} | ||
got, err := c.ToServiceConfigJSON() | ||
if (err != nil) != tt.wantErr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally prefer to do
// ensure error
return
}
// ensure non-error case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change it; I sometimes use goland's test scaffolding which writes assertions like this, and I don't always change them to more idiomatic testify
ef2f607
to
5bb867e
Compare
// ParseConfig satisfies balancer.ConfigParser and is used to parse new | ||
// Service Config json. The results are stored on the builder so that | ||
// subsequently built Balancers use the config. | ||
func (b *ConsistentHashringBuilder) ParseConfig(js json.RawMessage) (serviceconfig.LoadBalancingConfig, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems weird that ParseConfig
stores a value, but since that's what the interface requires, guess its okay
pkg/balancer/hashring.go
Outdated
} | ||
// Successful resolution; clear resolver error and ensure we return nil. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
; -> :
pkg/balancer/hashring.go
Outdated
// any that have been removed since the last update are removed from the | ||
// hashring. | ||
addrsSet := resolver.NewAddressMap() | ||
for _, a := range s.ResolverState.Addresses { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe use addr
?
if _, ok := addrsSet.Get(a); !ok { | ||
b.cc.RemoveSubConn(sc) | ||
b.subConns.Delete(a) | ||
// Keep the state of this sc in b.scStates until sc's state becomes Shutdown. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
odd spacing here
pkg/balancer/hashring.go
Outdated
return balancer.ErrBadResolverState | ||
} | ||
|
||
// if the overall connection sate is not in transient failure, we return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sate -> state
pkg/balancer/hashring.go
Outdated
// Pick satisfies balancer.Picker and returns a subconnection to use for a | ||
// request based on the request info. The value stored in CtxKey is hashed | ||
// into the hashring, and the resulting subconnection is used. Note that | ||
// there is no fallback behavior if the subconnection; this prevents the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the subconnection... ?
5bb867e
to
302ba2b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
302ba2b
to
89100db
Compare
The previous implementation was recomputing the hashring any time a subconnection moved from ready->idle or back, which happened frequently. The new implementation includes idle and connecting subconns in the ring, and triggers a connection if one is selected. It also adds/removes from a long-lived ring instead of recomputing a ring from scratch each time. ReplicationFactor and Spread can now be passed in as part of the service config instead of registered globally with the balancer
89100db
to
94145c0
Compare
The previous implementation was recomputing the hashring any time a subconnection moved from ready->idle or back, which happened frequently.
The new implementation includes idle and connecting subconns in the ring, and triggers a connection if one is selected. It also adds/removes from a long-lived ring instead of recomputing a ring from scratch each time.
ReplicationFactor and Spread can now be passed in as part of the service config instead of registered globally with the balancer