[PLAT-101389] Merge thanos upstream from release-0.34 #13

Merged
merged 170 commits into from
Feb 16, 2024

Conversation

jnyi
Collaborator

@jnyi jnyi commented Feb 15, 2024

Compared with the latest release-0.34

The main branch contains quite a few fixes:

* 0f9fc9ec - (HEAD -> merge-upstream, origin/merge-upstream) fix docs and linter (3 minutes ago) <Yi Jin>
*   5ddcdaa6 - Merge branch 'main' into merge-upstream (16 hours ago) <Yi Jin>
|\
| * 4a82ba7f - (upstream/main, upstream/HEAD, main) Adding new method on BucketedBytes to expose used memory (#7137) (35 hours ago) <Pedro Tanaka>
| * e78d8673 - (origin/main, databricks/main, databricks/HEAD, db_main) Fixing log line for remote engine in debug mode (#7133) (3 days ago) <Pedro Tanaka>
| *   f5ca5a84 - Merge pull request #7132 from bavarianbidi/update_helm_installation_instruction (3 days ago) <Filip Petkovski>
| |\
| | * 7640f0f7 - docs: run make docs for helm installation instruction (3 days ago) <Mario Constanti>
| | *   8ffb953c - Merge branch 'main' into update_helm_installation_instruction (3 days ago) <Mario Constanti>
| | |\
| | |/
| |/|
| * | f28680ce - docs: fix link (#7129) (4 days ago) <Giedrius Statkevičius>
| | * 0bf17ae2 - docs: update helm installation instruction (4 days ago) <Mario Constanti>
| |/
| * 3da5c1c2 - Receive: dont rely on slice labels (#7100) (6 days ago) <Michael Hoffmann>
| * 21ed9bbc - default to alertmanager v2 api (#7123) (6 days ago) <Jake Keeys>
| * 29831f84 - receive/handler: do not double lock (#7124) (6 days ago) <Giedrius Statkevičius>
| * 37092db5 - fix minio store gateway err (#7114) (8 days ago) <Kartikay>
| * 94f971bc - receive/handler: fix locking twice (#7112) (2 weeks ago) <Giedrius Statkevičius>
| * 50ce7a28 - Update prometheus/prometheus (#7096) (2 weeks ago) <Filip Petkovski>
| *   13e15580 - Merge pull request #7099 from MichaHoffmann/mhoffm-dont-use-slice-labels-continued (2 weeks ago) <Michael Hoffmann>
| |\
| | * 2f861d85 - Store: dont rely on slice labels continued (3 weeks ago) <Michael Hoffmann>
| * |   925e31a5 - Merge pull request #7101 from MichaHoffmann/merge-release-0.34-to-main (2 weeks ago) <Michael Hoffmann>
| |\ \
| | |/
| |/|
| | *   9eb6591c - Merge remote-tracking branch 'origin/main' into merge-release-0.34-to-main (3 weeks ago) <Michael Hoffmann>
| | |\
| | |/
| |/|
| * | 6a0a4910 - all: get rid of query pushdown to simplify query path (#7014) (3 weeks ago) <Michael Hoffmann>
| * | 1cf333e2 - Stores: convert tests to not rely on slice labels (#7098) (3 weeks ago) <Michael Hoffmann>
| * | daa34a52 - receive: use async remote writing (#7045) (3 weeks ago) <Giedrius Statkevičius>
| * | fce0fe24 - receive: race condition in handler Close() when stopped early (#7087) (3 weeks ago) <Mikhail Nozdrachev>
| * | b4aee0ef - Store: fix label values edge case (#7082) (3 weeks ago) <Michael Hoffmann>
| * | e215fa59 - Fix lazy postings with zero length (#7083) (3 weeks ago) <Ben Ye>
| * | 6b18338c - Store: acceptance test for proxy store (#7084) (4 weeks ago) <Michael Hoffmann>
| * | 058f9207 - Upgrade grpc to 1.57.2 (#7078) (4 weeks ago) <hanyuting8>
| * | 4a73fc3c - Receive: refactor handler for improved readability and organization (#6898) (4 weeks ago) <Douglas Camata>
| * |   a0ce64d2 - Merge pull request #7065 from vinted/multitsdb_overlapping (4 weeks ago) <Michael Hoffmann>
| |\ \
| | * | 80a5ce6b - receive: disable overlapping compaction (4 weeks ago) <Giedrius Statkevičius>
| * | | 3de122f3 - CI: Ensure static react-app is checked in (#7063) (4 weeks ago) <Jacob Baungård Hansen>
| |/ /
| * | 324846f6 - Make RetryError and HaltError able to be fetched for root cause (#7043) (4 weeks ago) <Alex Le>
| * | bee20b9d - go.mod: update Prometheus version (#7047) (4 weeks ago) <Giedrius Statkevičius>
| * | a7e8a644 - UI: Don't always force tracing (#7062) (5 weeks ago) <Jacob Baungård Hansen>
| | * 18d740f2 - (tag: v0.34.0, upstream/release-0.34, release-0.34) CHANGELOG: cut release 0.34 (#7095) (3 weeks ago) <Michael Hoffmann>
  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification

yeya24 and others added 30 commits October 9, 2023 10:58
* Return Query Analysis in API

A param  is added to QueryAPI, if true then query analysis is
returned by the  method of the query having structure
 is returned in response.

Signed-off-by: nishchay-veer <[email protected]>

* Added analyze checkbox in Thanos UI

An analyze checkbox is added to the Thanos query API that requests operator telemetry, which includes CPU time

Signed-off-by: nishchay-veer <[email protected]>

* Add query explain API

Signed-off-by: Saswata Mukherjee <[email protected]>

* Hooked queryTelemetry data into UI

Signed-off-by: nishchay-veer <[email protected]>

* /query_explain and /query_range_explain for explain-tree

Signed-off-by: nishchay-veer <[email protected]>

* update promql-engine

Signed-off-by: nishchay-veer <[email protected]>

* Execution time shows 0s

Signed-off-by: nishchay-veer <[email protected]>

* Show execution time of operators

Signed-off-by: nishchay-veer <[email protected]>

* Removing QueryExplainParam from query api

Signed-off-by: nishchay-veer <[email protected]>

* bad request format in Explain

Signed-off-by: nishchay-veer <[email protected]>

* Showing Explain and Analyze Output

Signed-off-by: nishchay-veer <[email protected]>

* Added tooltip and different endpoints for table and graph queries

Signed-off-by: nishchay-veer <[email protected]>

* Linters pass

Signed-off-by: nishchay-veer <[email protected]>

* disable Explain when engine is 'prometheus'

Signed-off-by: nishchay-veer <[email protected]>

* passing query params to explain endpoints

Signed-off-by: nishchay-veer <[email protected]>

* fixed react test case failing

Signed-off-by: nishchay-veer <[email protected]>

* fix ui tests

Signed-off-by: nishchay-veer <[email protected]>

* fix some e2e test fails

Signed-off-by: nishchay-veer <[email protected]>

* added customised tooltip in place of Tooltip component

Signed-off-by: nishchay-veer <[email protected]>

* removed Tooltip from Panel

Signed-off-by: nishchay-veer <[email protected]>

* Linters pass

Signed-off-by: nishchay-veer <[email protected]>

* 4 arguments in QueryInstant

Signed-off-by: nishchay-veer <[email protected]>

* resolving conflicts -2

Signed-off-by: nishchay-veer <[email protected]>

* resolving conflicts in Panel.tsx

Signed-off-by: nishchay-veer <[email protected]>

* adding checkbox

Signed-off-by: nishchay-veer <[email protected]>

* fixing linters fail

Signed-off-by: nishchay-veer <[email protected]>

---------

Signed-off-by: nishchay-veer <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Nishchay Veer <[email protected]>
Co-authored-by: Saswata Mukherjee <[email protected]>
…6789)

No need to show the symbol if analyze is disabled. It looks weird. Let's
not do that.

Signed-off-by: Giedrius Statkevičius <[email protected]>
Two of the same names are used in e2e environment names. Fix this name
clash.

Signed-off-by: Giedrius Statkevičius <[email protected]>
* set dialer timeout to 5s in NewRoundTripperFromConfig

Signed-off-by: Walther Lee <[email protected]>

* add dialer_timeout field to HTTP TransportConfig

Signed-off-by: Walther Lee <[email protected]>

---------

Signed-off-by: Walther Lee <[email protected]>
Co-authored-by: Walther Lee <[email protected]>
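
As a rough illustration of the dialer timeout change above, here is a minimal sketch that uses only the Go standard library. The `transportConfig` type, the helper names, and the fallback to 5s when the field is unset are assumptions made for this example; they are not the actual Thanos `TransportConfig` or `NewRoundTripperFromConfig` code.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// transportConfig is a hypothetical stand-in for an HTTP transport config;
// only the dialer timeout is modelled here.
type transportConfig struct {
	DialerTimeout time.Duration
}

// defaultDialerTimeout mirrors the 5s value named in the commit message.
const defaultDialerTimeout = 5 * time.Second

// effectiveDialerTimeout falls back to the default when the field is unset.
// The fallback rule is an assumption of this sketch.
func effectiveDialerTimeout(cfg transportConfig) time.Duration {
	if cfg.DialerTimeout <= 0 {
		return defaultDialerTimeout
	}
	return cfg.DialerTimeout
}

// newTransport builds an http.Transport whose connections are dialed
// with the configured timeout.
func newTransport(cfg transportConfig) *http.Transport {
	dialer := &net.Dialer{Timeout: effectiveDialerTimeout(cfg)}
	return &http.Transport{DialContext: dialer.DialContext}
}

func main() {
	tr := newTransport(transportConfig{})
	fmt.Println(effectiveDialerTimeout(transportConfig{}), tr.DialContext != nil)
}
```

The timeout lives on the `net.Dialer`, so it bounds connection establishment only and not the whole request.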
Running tests with -race shows that there is a race between
bapi.blocks() and bapi.SetLoaded/SetGlobal() because the latter is
called continuously and asynchronously in a different thread. blocks()
is called through the HTTP API. Since block info is immutable, it is
enough to add a lock here to fix this problem.

Signed-off-by: Giedrius Statkevičius <[email protected]>
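
The locking described above can be sketched roughly as follows. The `blocksAPI` and `blocksInfo` types here are simplified, hypothetical stand-ins rather than the real Thanos types. The sketch relies on the same observation as the commit message: the block info is never mutated after it is published, so guarding the pointer swap and the pointer read is enough.

```go
package main

import (
	"fmt"
	"sync"
)

// blocksInfo is a hypothetical, simplified stand-in for the block info
// that is served over HTTP. It is treated as immutable once published.
type blocksInfo struct {
	Label  string
	Blocks []string
}

// blocksAPI guards the published block info with a mutex.
type blocksAPI struct {
	mtx    sync.Mutex
	loaded *blocksInfo
}

// SetLoaded builds a fresh blocksInfo and swaps it in under the lock.
func (b *blocksAPI) SetLoaded(blocks []string) {
	info := &blocksInfo{Label: "loaded", Blocks: blocks}
	b.mtx.Lock()
	b.loaded = info
	b.mtx.Unlock()
}

// Loaded returns the currently published info; only the pointer read is locked.
func (b *blocksAPI) Loaded() *blocksInfo {
	b.mtx.Lock()
	defer b.mtx.Unlock()
	return b.loaded
}

func main() {
	api := &blocksAPI{}
	api.SetLoaded([]string{"initial"})

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // background updater
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			api.SetLoaded([]string{fmt.Sprint(i)})
		}
	}()
	go func() { // HTTP-handler-like reader
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = api.Loaded()
		}
	}()
	wg.Wait()
	fmt.Println(len(api.Loaded().Blocks))
}
```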
…hanos-io#6787)

* initialize new query stats struct at each goroutine

Signed-off-by: Ben Ye <[email protected]>

* remove comment

Signed-off-by: Ben Ye <[email protected]>

* address feedback

Signed-off-by: Ben Ye <[email protected]>

* fix lint

Signed-off-by: Ben Ye <[email protected]>

---------

Signed-off-by: Ben Ye <[email protected]>
Fix a race where GetPrometheusEngine or GetThanosEngine is called twice
at the same time from multiple HTTP requests. This fixes the race:

```
10:29:50 querier-query: ==================
10:29:50 querier-query: WARNING: DATA RACE
10:29:50 querier-query: Write at 0x00c0005fa0f8 by goroutine 285:
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryEngineFactory).GetPrometheusEngine()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:105 +0x1f9
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).parseEngineParam()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:325 +0x109
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:626 +0x605
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query-fm()
...
10:29:50 querier-query: Previous read at 0x00c0005fa0f8 by goroutine 287:
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryEngineFactory).GetPrometheusEngine()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:101 +0x13d
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).parseEngineParam()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:325 +0x109
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:626 +0x605
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query-fm()
...
```

Signed-off-by: Giedrius Statkevičius <[email protected]>
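
One common way to make this kind of lazy initialization safe is `sync.Once`. The sketch below is illustrative only: `engineFactory` and `engine` are hypothetical types, and the actual commit may use a different mechanism, such as a mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// engine is a hypothetical placeholder for a query engine.
type engine struct{ name string }

// engineFactory lazily builds its engine on first use.
type engineFactory struct {
	once   sync.Once
	engine *engine
	builds int // counts constructions, for demonstration only
}

// Get is safe to call from many goroutines: sync.Once ensures the
// constructor runs exactly once and that its writes are visible to all callers.
func (f *engineFactory) Get() *engine {
	f.once.Do(func() {
		f.builds++
		f.engine = &engine{name: "prometheus"}
	})
	return f.engine
}

func main() {
	f := &engineFactory{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = f.Get()
		}()
	}
	wg.Wait()
	fmt.Println(f.builds, f.Get().name)
}
```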
Each tracing.StartSpan() writes a value into the given context so
there's a race if we keep reusing the same context. Fix this by starting
a new span in each goroutine. This also makes logical sense. Fixes the
following race:

```
15:21:13 querier-1: WARNING: DATA RACE
15:21:13 querier-1: Read at 0x00c0009c5050 by goroutine 328:
15:21:13 querier-1: context.(*valueCtx).Value()
15:21:13 querier-1: /usr/local/go/src/context/context.go:751 +0x76
15:21:13 querier-1: github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/tracing.newClientSpanFromContext()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/interceptors/tracing/client.go:87 +0x241
15:21:13 querier-1: github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/tracing.(*opentracingClientReportable).ClientReporter()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/interceptors/tracing/client.go:51 +0x195
15:21:13 querier-1: github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/tracing.UnaryClientInterceptor.UnaryClientInterceptor.func1()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/interceptors/client.go:19 +0x1a9
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.ChainUnaryClient.func4.1.1()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/chain.go:74 +0x10a
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.(*ClientMetrics).UnaryClientInterceptor.func3()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/[email protected]/client_metrics.go:112 +0x126
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.ChainUnaryClient.func4.1.1()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/chain.go:74 +0x10a
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.ChainUnaryClient.func4()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/chain.go:83 +0x17b
15:21:13 querier-1: google.golang.org/grpc.(*ClientConn).Invoke()
15:21:13 querier-1: /go/pkg/mod/google.golang.org/[email protected]/call.go:35 +0x25d
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/store/storepb.(*storeClient).LabelValues()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:1034 +0xe5
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/query.(*endpointRef).LabelValues()
15:21:13 querier-1: <autogenerated>:1 +0xa1
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/store.(*ProxyStore).LabelValues.func1()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/proxy.go:586 +0x323
15:21:13 querier-1: golang.org/x/sync/errgroup.(*Group).Go.func1()
15:21:13 querier-1: /go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75 +0x76
15:21:13 querier-1: Previous write at 0x00c0009c5050 by goroutine 325:
15:21:13 querier-1: context.WithValue()
15:21:13 querier-1: /usr/local/go/src/context/context.go:718 +0xce
15:21:13 querier-1: github.com/opentracing/opentracing-go.ContextWithSpan()
15:21:13 querier-1: /go/pkg/mod/github.com/opentracing/[email protected]/gocontext.go:17 +0xec
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/tracing.StartSpan()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/tracing/tracing.go:73 +0x238
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/store.(*ProxyStore).LabelValues()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/proxy.go:567 +0xb25
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/query.(*querier).LabelValues()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/query/querier.go:422 +0x3f5
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).labelValues()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:1092 +0x17d1
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).labelValues-fm()
15:21:13 querier-1: <autogenerated>:1 +0x45
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).Register.GetInstr.func1.1()
```

Signed-off-by: Giedrius Statkevičius <[email protected]>
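
The idea of starting a new span in each goroutine can be sketched with the standard `context` package alone. Here `startSpan` is a hypothetical stand-in for a tracing helper, and `fanOut` and `labelValuesSpans` are illustrative names, not Thanos functions. Each goroutine keeps its derived context in a local variable, so no context variable is shared and rewritten across goroutines.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
)

type spanKey struct{}

// startSpan is a hypothetical stand-in for a tracing helper: it returns a
// child context carrying the span name and leaves the parent untouched.
func startSpan(ctx context.Context, name string) context.Context {
	return context.WithValue(ctx, spanKey{}, name)
}

// fanOut starts one span per goroutine. The derived context is local to
// each goroutine; the parent ctx is only read.
func fanOut(ctx context.Context, stores []string) []string {
	var (
		wg  sync.WaitGroup
		mtx sync.Mutex
		out []string
	)
	for _, s := range stores {
		s := s // per-iteration copy for Go versions before 1.22
		wg.Add(1)
		go func() {
			defer wg.Done()
			spanCtx := startSpan(ctx, "label_values/"+s)
			name, _ := spanCtx.Value(spanKey{}).(string)
			mtx.Lock()
			out = append(out, name)
			mtx.Unlock()
		}()
	}
	wg.Wait()
	sort.Strings(out)
	return out
}

// labelValuesSpans runs fanOut from a background context.
func labelValuesSpans(stores []string) []string {
	return fanOut(context.Background(), stores)
}

func main() {
	fmt.Println(labelValuesSpans([]string{"store-a", "store-b"}))
}
```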
Return a copy of the map because the compactor concurrently runs the garbage
collector, which deletes entries from the original map. Fixes race:

```
10:55:35 compact-working-dedup: ==================
10:55:35 compact-working-dedup: WARNING: DATA RACE
10:55:35 compact-working-dedup: Write at 0x00c001822150 by goroutine 220:
10:55:35 compact-working-dedup: runtime.mapdelete()
10:55:35 compact-working-dedup: /usr/local/go/src/runtime/map.go:696 +0x0
10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/compact.(*Syncer).GarbageCollect()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/compact/compact.go:201 +0x324
10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/compact.(*BucketCompactor).Compact()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/compact/compact.go:1422 +0x60f
10:55:35 compact-working-dedup: main.runCompact.func7()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:426 +0xfa
10:55:35 compact-working-dedup: main.runCompact.func8.1()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:481 +0x69
10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/runutil.Repeat()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/runutil/runutil.go:74 +0xc3
10:55:35 compact-working-dedup: main.runCompact.func8()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:480 +0x224
10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func1()
10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/[email protected]/group.go:38 +0x39
10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func2()
10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/[email protected]/group.go:39 +0x4f
10:55:35 compact-working-dedup: Previous read at 0x00c001822150 by goroutine 223:
10:55:35 compact-working-dedup: runtime.mapiternext()
10:55:35 compact-working-dedup: /usr/local/go/src/runtime/map.go:867 +0x0
10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/compact.(*DefaultGrouper).Groups()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/compact/compact.go:289 +0xfd
10:55:35 compact-working-dedup: main.runCompact.func16.1()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:626 +0x4ae
10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/runutil.Repeat()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/runutil/runutil.go:74 +0xc3
10:55:35 compact-working-dedup: main.runCompact.func16()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:591 +0x3f9
10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func1()
10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/[email protected]/group.go:38 +0x39
10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func2()
10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/[email protected]/group.go:39 +0x4f
10:55:35 compact-working-dedup: Goroutine 220 (running) created at:
10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run()
10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/[email protected]/group.go:37 +0xad
10:55:35 compact-working-dedup: main.main()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/main.go:159 +0x2964
```

Signed-off-by: Giedrius Statkevičius <[email protected]>
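
Returning a copy of a map that another goroutine mutates might look like the sketch below. The `syncer` type and its `int` values are simplified, hypothetical stand-ins; the real `Syncer` stores block metadata.

```go
package main

import (
	"fmt"
	"sync"
)

// syncer is a hypothetical, simplified holder of block metadata keyed by block ID.
type syncer struct {
	mtx    sync.Mutex
	blocks map[string]int
}

// Metas returns a copy of the map, so callers can range over the result
// while GarbageCollect deletes entries from the original.
func (s *syncer) Metas() map[string]int {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	out := make(map[string]int, len(s.blocks))
	for id, m := range s.blocks {
		out[id] = m
	}
	return out
}

// GarbageCollect removes one entry from the original map under the lock.
func (s *syncer) GarbageCollect(id string) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	delete(s.blocks, id)
}

func main() {
	s := &syncer{blocks: map[string]int{"a": 1, "b": 2}}
	snapshot := s.Metas()
	s.GarbageCollect("a")
	fmt.Println(len(snapshot), len(s.Metas()))
}
```

The copy is shallow. That is sufficient when the values are immutable or are never modified in place.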
)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.14.0 to 0.17.0.
- [Commits](golang/net@v0.14.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: GitHub <[email protected]>
Co-authored-by: fpetkovski <[email protected]>
* fix matchersToPostingGroups vals variable shadow bug

Signed-off-by: Ben Ye <[email protected]>

* update changelog

Signed-off-by: Ben Ye <[email protected]>

---------

Signed-off-by: Ben Ye <[email protected]>
…ls (thanos-io#6816)

External Labels should also be tested for matches against the matchers.

Signed-off-by: Michael Hoffmann <[email protected]>
* Build with Go 1.21 (thanos-io#6615)

* Build with Go 1.21



* Update tools



---------



* update go alpine image to 3.18 (thanos-io#6750)



* build(deps): bump golang.org/x/net from 0.14.0 to 0.17.0 (thanos-io#6805)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.14.0 to 0.17.0.
- [Commits](golang/net@v0.14.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
...




* Updates busybox SHA (thanos-io#6808)




* Fix matchersToPostingGroups vals variable shadow bug (thanos-io#6817)

* fix matchersToPostingGroups vals variable shadow bug



* update changelog



---------



* fix head series limiter trigger (thanos-io#6802)



* Store: fix prometheus store label values for matches on external labels (thanos-io#6816)

External Labels should also be tested for matches against the matchers.



* Cut patch release v0.32.5



* Revert "Fix matchersToPostingGroups vals variable shadow bug (thanos-io#6817)"

This reverts commit 4ed9bb0.



---------

Signed-off-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Coleen Iona Quadros <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: GitHub <[email protected]>
Signed-off-by: Ben Ye <[email protected]>
Signed-off-by: Thibault Mange <[email protected]>
Signed-off-by: Michael Hoffmann <[email protected]>
Co-authored-by: Coleen Iona Quadros <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: fpetkovski <[email protected]>
Co-authored-by: Ben Ye <[email protected]>
Co-authored-by: Thibault Mange <[email protected]>
Co-authored-by: Michael Hoffmann <[email protected]>
* receive/handler: fix label names/values race

There is a label name/value race in the current loop because
`labelpb.ReAllocZLabelsStrings(&t.Labels, r.opts.Intern)` might be
called which overwrites the original labels. At the same time, we might
also be forwarding the same request through gRPC to other Receive nodes.

Fixes the following race:

<details>
<summary>Trace of the race</summary>

10:53:51 receive-1: WARNING: DATA RACE
10:53:51 receive-1: Read at 0x00c001097b90 by goroutine 361:
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/labelpb.(*ZLabel).Size()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/labelpb/label.go:273 +0x35
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/storepb/prompb.(*TimeSeries).MarshalToSizedBuffer()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/prompb/types.pb.go:1499 +0x7c4
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/storepb.(*WriteRequest).MarshalToSizedBuffer()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:1318 +0x409
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/storepb.(*WriteRequest).Marshal()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:1286 +0x64
10:53:51 receive-1: google.golang.org/protobuf/internal/impl.legacyMarshal()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/internal/impl/legacy_message.go:402 +0xb1
10:53:51 receive-1: google.golang.org/protobuf/proto.MarshalOptions.marshal()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/proto/encode.go:166 +0x3a2
10:53:51 receive-1: google.golang.org/protobuf/proto.MarshalOptions.MarshalAppend()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/proto/encode.go:125 +0x96
10:53:51 receive-1: github.com/golang/protobuf/proto.marshalAppend()
10:53:51 receive-1: /go/pkg/mod/github.com/golang/[email protected]/proto/wire.go:40 +0xce
10:53:51 receive-1: github.com/golang/protobuf/proto.Marshal()
10:53:51 receive-1: /go/pkg/mod/github.com/golang/[email protected]/proto/wire.go:23 +0x65
10:53:51 receive-1: google.golang.org/grpc/encoding/proto.codec.Marshal()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/encoding/proto/proto.go:45 +0x66
10:53:51 receive-1: google.golang.org/grpc/encoding/proto.(*codec).Marshal()
10:53:51 receive-1: <autogenerated>:1 +0x53
10:53:51 receive-1: google.golang.org/grpc.encode()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/rpc_util.go:594 +0x64
10:53:51 receive-1: google.golang.org/grpc.prepareMsg()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/stream.go:1610 +0x1a8
10:53:51 receive-1: google.golang.org/grpc.(*clientStream).SendMsg()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/stream.go:791 +0x284
10:53:51 receive-1: google.golang.org/grpc.invoke()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/call.go:70 +0xf2

...
10:53:51 receive-1: Previous write at 0x00c001097b90 by goroutine 357:
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/labelpb.ReAllocZLabelsStrings()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/labelpb/label.go:69 +0x25e
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Writer).Write()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/writer.go:144 +0x13e4
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward.func2.1()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:672 +0x153
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/tracing.DoInSpan()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/tracing/tracing.go:95 +0x125
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward.func2()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:671 +0x1fd
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward.func6()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:682 +0x61
10:53:51 receive-1: Goroutine 361 (running) created at:
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:688 +0x9c7
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).forward()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:612 +0x53a
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).handleRequest()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:417 +0xca8
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).receiveHTTP()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:539 +0x1d89
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).receiveHTTP-fm()
10:53:51 receive-1: <autogenerated>:1 +0x51
10:53:51 receive-1: net/http.HandlerFunc.ServeHTTP()
10:53:51 receive-1: /usr/local/go/src/net/http/server.go:2136 +0x47
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.NewHandler.RequestID.func2()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/server/http/middleware/request_id.go:40 +0x191
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).testReady-fm.(*Handler).testReady.func1()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:263 +0x249
10:53:51 receive-1: net/http.HandlerFunc.ServeHTTP()
10:53:51 receive-1: /usr/local/go/src/net/http/server.go:2136 +0x47
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/extprom/http.httpInstrumentationHandler.func1()

</details>

Signed-off-by: Giedrius Statkevičius <[email protected]>

* receive/handler: remove break

Signed-off-by: Giedrius Statkevičius <[email protected]>

---------

Signed-off-by: Giedrius Statkevičius <[email protected]>
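
One general way to avoid this class of race is to give the local write path its own copy of the labels before rewriting them, so that the request being marshalled for forwarding is never touched. The sketch below illustrates that idea with hypothetical types and helpers (`label`, `cloneLabels`, `rewriteInPlace`). It is not necessarily what the commit does, and upper-casing is only a visible stand-in for an in-place rewrite.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// label is a hypothetical, simplified label pair.
type label struct{ Name, Value string }

// cloneLabels returns a fresh slice so that rewriting it cannot
// affect the slice the caller still shares with other goroutines.
func cloneLabels(in []label) []label {
	out := make([]label, len(in))
	copy(out, in)
	return out
}

// rewriteInPlace stands in for any in-place rewrite of label strings.
func rewriteInPlace(lbls []label) {
	for i := range lbls {
		lbls[i].Value = strings.ToUpper(lbls[i].Value)
	}
}

func main() {
	original := []label{{Name: "job", Value: "api"}}
	local := cloneLabels(original)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		_ = fmt.Sprint(original) // stands in for marshalling the forwarded request
	}()
	rewriteInPlace(local) // touches only the private copy
	wg.Wait()

	fmt.Println(original[0].Value, local[0].Value)
}
```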
…projects (thanos-io#6827)

* Expose fetcher and syncer metrics to be provided by depending projects.

Signed-off-by: Alex Le <[email protected]>

* Updated CHANGELOG

Signed-off-by: Alex Le <[email protected]>

* Remove CHANGELOG change

Signed-off-by: Alex Le <[email protected]>

---------

Signed-off-by: Alex Le <[email protected]>
The limits configuration is re-read periodically while other goroutines
read it at the same time, so it needs a lock. Make that struct member
private and add a getter that returns the limiter under a mutex lock.

Fixes:

```
17:14:45 receive-i3: WARNING: DATA RACE
17:14:45 receive-i3: Read at 0x00c00090aec0 by goroutine 131:
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.(*headSeriesLimit).QueryMetaMonitoring()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/head_series_limiter.go:109 +0x2fb
17:14:45 receive-i3: main.runReceive.func9.1()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/cmd/thanos/receive.go:402 +0x9b
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/runutil.Repeat()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/runutil/runutil.go:74 +0xc3
17:14:45 receive-i3: Previous write at 0x00c00090aec0 by goroutine 138:
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.NewHeadSeriesLimit()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/head_series_limiter.go:41 +0x316
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.(*Limiter).loadConfig()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/limiter.go:168 +0xd0d
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.(*Limiter).StartConfigReloader.func1()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/limiter.go:111 +0x207
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/extkingpin.(*pollingEngine).start.func1()
```

Signed-off-by: Giedrius Statkevičius <[email protected]>
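
The "private member plus getter under a mutex" pattern described above can be sketched as follows. The `limiter` and `headSeriesLimiter` types and the `loadConfig` signature are simplified assumptions for this example, not the actual Thanos code.

```go
package main

import (
	"fmt"
	"sync"
)

// headSeriesLimiter is a hypothetical, simplified limiter.
type headSeriesLimiter struct{ limit int }

// limiter keeps the head series limiter private so that every access
// goes through the locked getter.
type limiter struct {
	mtx               sync.RWMutex
	headSeriesLimiter *headSeriesLimiter
}

// HeadSeriesLimiter returns the current limiter under a read lock.
func (l *limiter) HeadSeriesLimiter() *headSeriesLimiter {
	l.mtx.RLock()
	defer l.mtx.RUnlock()
	return l.headSeriesLimiter
}

// loadConfig replaces the limiter under a write lock, as a config reloader would.
func (l *limiter) loadConfig(limit int) {
	l.mtx.Lock()
	defer l.mtx.Unlock()
	l.headSeriesLimiter = &headSeriesLimiter{limit: limit}
}

func main() {
	l := &limiter{}
	l.loadConfig(100)

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // periodic reload
		defer wg.Done()
		for i := 1; i <= 1000; i++ {
			l.loadConfig(i)
		}
	}()
	go func() { // periodic reader
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = l.HeadSeriesLimiter().limit
		}
	}()
	wg.Wait()
	fmt.Println(l.HeadSeriesLimiter().limit)
}
```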
Fix the following race:

```
12:36:39 querier-1: ==================
12:36:39 querier-1: WARNING: DATA RACE
12:36:39 querier-1: Read at 0x00c000159540 by goroutine 341:
12:36:39 querier-1: reflect.Value.String()
12:36:39 querier-1: /usr/local/go/src/reflect/value.go:2589 +0xd76
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeAny()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:563 +0xd86
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeStruct()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:325 +0x19db
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeAny()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:606 +0xb2a
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeStruct()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:453 +0xdd6
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeAny()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:606 +0xb2a
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeStruct()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:453 +0xdd6
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).Marshal()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:896 +0x5c8
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).Text()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:908 +0x92
12:36:39 querier-1: github.com/gogo/protobuf/proto.CompactTextString()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:930 +0x8e
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/store/storepb.(*SeriesRequest).String()
12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:316 +0x7b
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/store.(*ProxyStore).Series()
12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/proxy.go:277 +0x8f
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/query.(*querier).selectFn()

12:36:39 querier-1: Previous write at 0x00c000159540 by goroutine 339:
12:36:39 querier-1: golang.org/x/exp/slices.insertionSortOrdered[go.shape.string]()
12:36:39 querier-1: /go/pkg/mod/golang.org/x/[email protected]/slices/zsortordered.go:15 +0x357
12:36:39 querier-1: golang.org/x/exp/slices.pdqsortOrdered[go.shape.string]()
12:36:39 querier-1: /go/pkg/mod/golang.org/x/[email protected]/slices/zsortordered.go:75 +0x72f
12:36:39 querier-1: golang.org/x/exp/slices.Sort[go.shape.[]string,go.shape.string]()
12:36:39 querier-1: /go/pkg/mod/golang.org/x/[email protected]/slices/sort.go:19 +0x45a
12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*evaluator).eval()
12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/[email protected]/promql/engine.go:1352 +0x432
12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*evaluator).Eval()
12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/[email protected]/promql/engine.go:1052 +0x105
12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*Engine).execEvalStmt()
12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/[email protected]/promql/engine.go:708 +0xb15
12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*Engine).exec()
12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/[email protected]/promql/engine.go:646 +0x4c8
12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*query).Exec()
12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/[email protected]/promql/engine.go:235 +0x232
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query()
12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:681 +0xdfd
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query-fm()
12:36:39 querier-1: <autogenerated>:1 +0x45
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).Register.GetInstr.func1.1()
12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/api/api.go:212 +0x62
12:36:39 querier-1: net/http.HandlerFunc.ServeHTTP()
12:36:39 querier-1: /usr/local/go/src/net/http/server.go:2136 +0x47
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/logging.(*HTTPServerMiddleware).HTTPMiddleware.func1()
```

The problem is that the PromQL engine sorts the hints slice in place
while the same slice may still be in use by other Select() calls, where
String() reads those hints.
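
A common way to avoid this class of race is to sort a private copy of the slice, so readers of the original never observe a concurrent write. The sketch below only illustrates that general technique; `sortedCopy` is a hypothetical helper, and this text does not say that the actual fix works this way.

```go
package main

import "sort"

// sortedCopy is a hypothetical helper: it returns a sorted copy of in and
// leaves in untouched, so goroutines that still read the original slice
// never race with the sort.
func sortedCopy(in []string) []string {
	out := make([]string, len(in))
	copy(out, in)
	sort.Strings(out)
	return out
}
```

The copy costs one allocation per call, which is usually acceptable for short slices such as grouping labels.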

Signed-off-by: Giedrius Statkevičius <[email protected]>
* Adding Grupo Olx as user

Signed-off-by: Nelson Almeida <[email protected]>

* Adding Grupo OLX logo

Signed-off-by: Nelson Almeida <[email protected]>

---------

Signed-off-by: Nelson Almeida <[email protected]>
* Receive: Add default tenant to HTTP metrics

Previously, if the tenant header was empty or not supplied, the exported
metrics would have an empty string as the tenant. With this commit we
instead use the default tenant, which can be configured with
`--receive.default-tenant-id`.
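
The fallback described above can be sketched as follows. This is a minimal illustration, not the code from the commit: `tenantOrDefault` and its parameters are hypothetical names.

```go
package main

// tenantOrDefault is a hypothetical helper: it returns the tenant taken
// from the request header, or the configured default tenant when the
// header value is empty (which also covers a missing header).
func tenantOrDefault(headerValue, defaultTenant string) string {
	if headerValue == "" {
		return defaultTenant
	}
	return headerValue
}
```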

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Query: Add tenant label to exported metrics

With this commit we now add the tenant label to relevant metrics
exported by the query component.

This includes the HTTP metrics handled by the InstrumentationMiddleware
and the query latency metrics.

Signed-off-by: Jacob Baungard Hansen <[email protected]>

---------

Signed-off-by: Jacob Baungard Hansen <[email protected]>
alecrajeev and others added 21 commits January 6, 2024 19:15
Signed-off-by: Harsh Pratap Singh <[email protected]>
removing todo comment from query docs
* Query: add optional tenancy enforcement

With this commit it's now possible to enable enforcement of tenancy. If
tenancy is enabled, a tenant label will be added to queries based on the
tenant information provided by the tenant header, and the
tenant-label-name.

The implementation for query APIs uses prom-label-proxy as a
library, while the implementation for non-query APIs is written from
scratch.

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Add changelog entry

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Query: Add non-default tenant testcase

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Test: make query a constant to make linter happy

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Address review comments

- Remove empty lines
- If multiple tenant matchers are found in the original query, we only
  replace the first one with the header provided tenant, and remove any
  subsequent ones.
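
The rule for multiple tenant matchers can be sketched as below. This is an illustration under stated assumptions, not the code from the commit: `matcher` is a simplified stand-in for a PromQL label matcher, `enforceTenant` is a hypothetical name, and appending a matcher when none is present is assumed from the earlier description that a tenant label is added to queries.

```go
package main

// matcher is a simplified, hypothetical stand-in for a PromQL label matcher.
type matcher struct {
	Name  string
	Value string
}

// enforceTenant replaces the first matcher on tenantLabel with the tenant
// taken from the header and drops any later matchers on the same label.
// If the query has no matcher on tenantLabel, one is appended (assumption).
func enforceTenant(ms []matcher, tenantLabel, tenant string) []matcher {
	out := make([]matcher, 0, len(ms)+1)
	replaced := false
	for _, m := range ms {
		if m.Name != tenantLabel {
			// Matchers on other labels pass through unchanged.
			out = append(out, m)
			continue
		}
		if !replaced {
			// The first tenant matcher is overwritten with the header value.
			out = append(out, matcher{Name: tenantLabel, Value: tenant})
			replaced = true
		}
		// Subsequent tenant matchers are dropped.
	}
	if !replaced {
		out = append(out, matcher{Name: tenantLabel, Value: tenant})
	}
	return out
}
```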

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Address review comments

- `--enable-tenancy` -> `--enforce-tenancy`
- Create `RewritePromQL` and `RewriteLabelMatchers` to clean up code in
  query api. Also move getLabelMatchers to tenancy pkg.
- Use prom-label-proxy's `EnforceMatchers` to rewrite labels on non-query
  APIs instead of own solution
- Don't specifically handle `illegalLabelMatcherError`

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Re-arrange go.mod to make linter happy.

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Address review comments

Minor changes to CLI docs, code-comments and changelog.

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Query: Add tenancy docs

This commit adds documentation for the tenancy features.

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Update docs/components/query.md

Review comment

Co-authored-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Jacob Baungård Hansen <[email protected]>

---------

Signed-off-by: Jacob Baungard Hansen <[email protected]>
Signed-off-by: Jacob Baungård Hansen <[email protected]>
Co-authored-by: Saswata Mukherjee <[email protected]>
The e2e tests would occasionally fail due to non-unique docker environment
names. With this commit the test environments are given unique names
to avoid these failures.

Signed-off-by: Jacob Baungard Hansen <[email protected]>
changed store api's --sync-block-duration to 15m
Fix docs post thanos-io#6539 merge.

Signed-off-by: Filip Petkovski <[email protected]>
…)" (thanos-io#7053)

This reverts commit 7b8eb86.

Proper way to handle this is to disable vertical compaction. I am trying
to add this functionality here:
prometheus/prometheus#13393

Signed-off-by: Giedrius Statkevičius <[email protected]>
Signed-off-by: Kartikay <[email protected]>
Signed-off-by: Michael Hoffmann <[email protected]>
…-as-in-progress

CHANGELOG: mark 0.34 as in progress
Signed-off-by: Michael Hoffmann <[email protected]>
…se-0.34.0-rc.0

VERSION: cut release 0.34.0-rc.0
Signed-off-by: Michael Hoffmann <[email protected]>
…se-0.34.0-rc.1

VERSION: cut release 0.34.0-rc.1
@jnyi jnyi changed the title [PLAT-101389] Merge latest thanos upstream [PLAT-101389] Merge thanos upstream from release-0.34 Feb 16, 2024
Signed-off-by: Yi Jin <[email protected]>
@jnyi jnyi merged commit 08d1755 into databricks:db_main Feb 16, 2024
11 checks passed