Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PLAT-104290] auto detect/remove overlapping blocks to be compacted to avoid compactor panics #23

Merged
merged 4 commits into from
Mar 22, 2024

Conversation

jnyi
Copy link
Collaborator

@jnyi jnyi commented Mar 21, 2024

Adding a flag enableOverlappingRemoval to control this.

This will replace the DefaultCompactionLifecycleCallback as a new callback.

This handles overlapping blocks before actually doing the compaction which could potentially causing halting or panic like thanos-io#6775

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification

Deployed to dev, no panics, and overlapped blocks got deleted:

Screenshot 2024-03-21 at 5 45 46 PM Screenshot 2024-03-21 at 5 45 59 PM

@jnyi jnyi changed the title [PLAT-104290] remove overlapping blocks into its own file [PLAT-104290] put remove overlapping blocks into its own source file Mar 21, 2024
@jnyi jnyi force-pushed the PLAT-104290 branch 2 times, most recently from 9298d89 to 283cdac Compare March 22, 2024 01:37
@jnyi jnyi force-pushed the PLAT-104290 branch 2 times, most recently from e5c7ca0 to 22159f6 Compare March 22, 2024 05:39
@@ -591,8 +591,6 @@ func (f *BaseFetcher) fetch(ctx context.Context, metrics *FetcherMetrics, filter
if tenant, ok := m.Thanos.Labels[metadata.TenantLabel]; ok {
numBlocksByTenant[tenant]++
} else {
level.Error(f.logger).Log("msg", "found blocks without label "+metadata.TenantLabel,
Copy link
Collaborator Author

@jnyi jnyi Mar 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is verbose and not useful in testing

@jnyi jnyi requested a review from hczhu-db March 22, 2024 16:40
@jnyi jnyi force-pushed the PLAT-104290 branch 3 times, most recently from 1141c66 to 7445895 Compare March 22, 2024 18:03
@@ -284,10 +283,6 @@ func NewDefaultGrouper(
Name: "thanos_compact_group_vertical_compactions_total",
Help: "Total number of group compaction attempts that resulted in a new block based on overlapping blocks.",
}, []string{"resolution"}),
overlappingBlocks: promauto.With(reg).NewCounterVec(prometheus.CounterOpts{
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this metric to its own source file, this will avoid conflicts when merge from upstream.

@jnyi jnyi changed the title [PLAT-104290] put remove overlapping blocks into its own source file [PLAT-104290] auto detect/remove overlapping blocks to be compacted to avoid compactor panics Mar 22, 2024
@@ -786,6 +789,9 @@ func (cc *compactConfig) registerFlag(cmd extkingpin.FlagClause) {
"NOTE: This flag is ignored and (enabled) when --deduplication.replica-label flag is set.").
Hidden().Default("false").BoolVar(&cc.enableVerticalCompaction)

cmd.Flag("compact.remove-overlapping", "In house flag to remove overlapping blocks. Turn this on to fix PLAT-104290.").
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No point in putting a Jira ticket number in a public repo here. Just briefly describe the issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha, forgot this is public, let me fix it.

@jnyi jnyi force-pushed the PLAT-104290 branch 3 times, most recently from c606168 to e6dbcf2 Compare March 22, 2024 20:05
@jnyi jnyi merged commit 3e50e25 into databricks:db_main Mar 22, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants