Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Trace] Mimimum changes in dskit in order to migrate to opentelemetry #385

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ying-jeanne
Copy link
Contributor

@ying-jeanne ying-jeanne commented Sep 21, 2023

What this PR does:

This PR showcases minimal code adjustments in dskit to transition from migrating opentracing to open-telemetry instrumentation. While this PR isn't meant for immediate merging, it serves as a basis for reviewing the changes and creating a design document if we want to maintain backward compatibility for team not ready for migration.

Key Inclusions in this PR:

  • Introduction of oteltracing.go: This file initializes the global tracerProvider. It preserves the Jaeger environment tracing settings, mainly concerning the following environment variables:
JAEGER_AGENT_HOST,JAEGER_TAGS,JAEGER_SAMPLER_MANAGER_HOST_PORT, JAEGER_SERVICE_NAME, JAEGER_SAMPLER_TYPE, and JAEGER_SAMPLER_PARAM

It also maintains compatibility for HTTP/grpc requests by encoding/decoding both opentracing (Jaeger trace header) and opentelemetry (W3C header).

  • Incorporation of spanlogger.go: This file combines a spanLogger object, which bundles a span and logger together. Many projects use this object, even if they don't employ dskit's tracing instrumentation. It exposes a public opentracing.Span accessible to libraries that import it.

  • grpcclient/instrumentation.go: This section addresses the different interceptors used by opentracing and opentelemetry.

  • use of otelhttp and otelhttptrace: These modules are used whenever there is a need to inject context into request headers or extract request headers into context. They replace the previous methods since they manage the injection and extraction of corresponding headers.

  • Utilization of opentelemetry's test tracerprovider with an in-memory exporter: All traces in unit tests have been adapted to use the tracetest tracerprovider provided by opentelemetry.

  • Modification in extracting the traceID from context: Instead of directly casting with jaeger.SpanContext from the context, this PR replaces it with otelSpan.SpanContext().TraceID().

This PR is being testing on dev environment.

The PR is generated by both gopatch and manually update, the gopatch file is located in https://gist.github.com/ying-jeanne/5e57b736bc8f3b1ab299928c54809eac

Which issue(s) this PR fixes:

Fixes #

Checklist

  • Tests updated
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@ying-jeanne ying-jeanne changed the title [WIP] Mimimum changes in dskit in order to migrate to opentelemetry [Otel] Mimimum changes in dskit in order to migrate to opentelemetry Sep 21, 2023
@ying-jeanne ying-jeanne changed the title [Otel] Mimimum changes in dskit in order to migrate to opentelemetry [Trace] Mimimum changes in dskit in order to migrate to opentelemetry Sep 21, 2023
@ying-jeanne
Copy link
Contributor Author

The note for reviewer is above, and let me know if it doesn't appear clear.

@ying-jeanne ying-jeanne force-pushed the otel-migration-build-tag branch 2 times, most recently from 0afbbe7 to 3e456e1 Compare September 25, 2023 18:49
@ying-jeanne ying-jeanne force-pushed the otel-migration-build-tag branch from 6b9bf60 to aef15f7 Compare November 2, 2023 16:42
@ying-jeanne ying-jeanne force-pushed the otel-migration-build-tag branch from aef15f7 to b5e0f97 Compare November 2, 2023 16:44
@davidspek
Copy link

Is this something that will be continued to work on?

@pgillich
Copy link

pgillich commented Dec 16, 2024

Is this something that will be continued to work on?

I'd like to get answer above question, too. To be more evident, I'm interesting to send traces about Mimir source codes. The solution should accept the trace info receiving in HTTP headers and propagate the trace info in outgoing HTTP headers (for example: https://www.w3.org/TR/trace-context/ ). Jaeger, Grafana Tempo and OTEL collector should accept the traces.

The Grafana server already changed to replace Jager (Uber) lib to OTEL (OpenTelemetry):
grafana/grafana#67200

I think, the maintainers will accept only the OTEL, which supports the most important original JAEGER_* env variables, for backward compatibility (as much as possible).

The example Mimir tracing examples use JAEGER_AGENT_PORT:6831 (UDP), which is the Compact Thrift protocol and won't be supported by Jaeger 2: https://docs.google.com/document/d/18B1yTMewRft2N0nW9K-ecVRTt5VaNgnrPTW1eL236t4/edit?tab=t.0#heading=h.705d1ttjradb .

Below JAEGER_* env variables are used by Mimir examples, tests and dev environments (example):

"JAEGER_AGENT_HOST=jaeger"
"JAEGER_AGENT_PORT=6831"
"JAEGER_REPORTER_MAX_QUEUE_SIZE=1000"
"JAEGER_SAMPLER_PARAM=1"
"JAEGER_SAMPLER_TYPE=const"
"JAEGER_TAGS=app=backend"

The dev environments use jaegertracing/all-in-one image, which listen on :14268 (HTTP), not on 6831 (UDP). Strange.

Below source codes use the Uber Jaeger lib directly in Mimir, but not too much:

pkg/frontend/querymiddleware/results_cache.go:  "github.com/uber/jaeger-client-go"
pkg/util/tracing/tracing.go:    "github.com/uber/jaeger-client-go"
cmd/mimir-continuous-test/main.go:      jaegercfg "github.com/uber/jaeger-client-go/config"
cmd/query-tee/main.go:  jaegercfg "github.com/uber/jaeger-client-go/config"
cmd/mimir/main.go:      jaegercfg "github.com/uber/jaeger-client-go/config"

Most of source code use github.com/grafana/dskit/tracing , so the solution should be at dskit side. It depends on Uber Jaeger library. I don't know is there any pure OTEL lib, which has similar feature set and behavior to Uber Jaeger library, in order to substitute the Uber Jaeger library functionality.

If the maintainers take a decision to replace the Uber Jaeger library to a pure OTEL-based library, I can believe, related PRs will be accepted. Otherwise, making a PR for it will be waste time.

Related issues:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants