Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comparison aggregation unnest and where clause fix #5090

Closed
wants to merge 25 commits into from
Closed

Conversation

egor-ryashin
Copy link
Contributor

@egor-ryashin egor-ryashin commented Jun 17, 2024

Also, added Druid CI testing

@egor-ryashin egor-ryashin marked this pull request as ready for review June 18, 2024 17:38
@egor-ryashin egor-ryashin marked this pull request as draft June 18, 2024 18:02
@egor-ryashin egor-ryashin marked this pull request as ready for review June 19, 2024 20:11
@egor-ryashin egor-ryashin changed the title Comparison aggregation unnest testing Comparison aggregation unnest and where clause fix Jun 20, 2024
Comment on lines 87 to 89
if os.Getenv("DEBUG") == "1" {
fmt.Println("Reconciler error", validateErr.Error())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me several hours of debugging without those log messages, I don't understand. Reconciler and validation errors aren't shown if tests fail

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You just need to uncomment this line to get reconciler logs from tests:

// logger, err := zap.NewDevelopment()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless it's a very special circumstance, we should only read ENV variables at the entrypoint (e.g. CLI command or test function) and then pass config explicitly to internal packages, and we should use the Zap logger consistently instead of fmt.Println.

Copy link
Contributor Author

@egor-ryashin egor-ryashin Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless it's a very special circumstance, we should only read ENV variables at the entrypoint (e.g. CLI command or test function) I doubt that, in terms of speed of development I would rather add everywhere

if os.Getenv("DEBUG") == "1") {
fmt.Println(sql)
}

and forgot about adding/removing and checking fmt.Println(sql) all the time.
There's always one more place where one forgot to propagate .Debug variable and you need some hours of work to reverse-engineer and propagate it for a new test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is equivalent to using the debug logger. Whether you need to set DEBUG=1 or whether you uncomment the debug logger, it's the same overhead.

Copy link
Contributor Author

@egor-ryashin egor-ryashin Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can keep DEBUG=1 in a config (ie vscode config).

Comment on lines 268 to 271
debug := os.Getenv("DEBUG")
if debug == "1" {
fmt.Println("Parser error", e.Message)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove

@@ -244,7 +260,7 @@ func NewInstanceForDruidProject(t TestingT) (*runtime.Runtime, string) {
{
Type: "druid",
Name: "druid",
Config: map[string]string{"dsn": "http://localhost:8888/druid/v2/sql"},
Config: map[string]string{"dsn": fmt.Sprintf("https://%s:%[email protected]/druid/v2/sql", splits[0], splits[1])},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove – we should use test containers instead of running against a prod cluster in testing. If we really need to run against an external cluster, it should be a dedicated cluster and generically configured

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think tests are better than no tests

Copy link
Contributor

@begelundmuller begelundmuller Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we need to invest more in good testing for Druid, but we need to do it with proper abstractions so the tests are safe and easy to run for everybody. This both has the potential to overload a production cluster we rely on for more important use cases, makes it more difficult for folks internally to configure the tests, and it looks bad to open source contributors who don't get a chance to run or modify these tests.

Copy link
Contributor Author

@egor-ryashin egor-ryashin Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and it looks bad to open source contributors who don't get a chance to run or modify these tests I will look even worse if we get production Druid-based errors
This both has the potential to overload a production cluster we rely on for more important use cases, makes it more difficult for folks internally to configure the tests @nishantmonu51 I wonder if we should start a dedicated cluster for CI tests?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A dedicated cluster would be better. Why do you consider a test container unsuitable for Druid testing? Regarding this PR, I think bigger changes to Druid testing improvements should be done in a dedicated PR.

Copy link
Contributor Author

@egor-ryashin egor-ryashin Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a dedicated container implemented but it was switched off for performance reason. IE

=== RUN   TestDruid
    druid_test.go:83: druid: skipping test in short mode
--- SKIP: TestDruid (0.00s)

"Regarding this PR, I think bigger changes to Druid testing improvements should be done in a dedicated PR."
I expected it will be an easy addition but it turns we want to postpone Druid tests again - because we cannot find a perfect solution in short amount of time - we postpone any other solution entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does changing metrics-in to master-in cluster resolve the issue?

@@ -76,7 +76,7 @@ func lookupMetricsView(ctx context.Context, rt *runtime.Runtime, instanceID, nam

res, err := ctrl.Get(ctx, &runtimev1.ResourceName{Kind: runtime.ResourceKindMetricsView, Name: name}, false)
if err != nil {
return nil, nil, status.Error(codes.InvalidArgument, err.Error())
return nil, nil, status.Error(codes.InvalidArgument, fmt.Sprintf("%s, %s", err.Error(), name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, nil, status.Error(codes.InvalidArgument, fmt.Sprintf("%s, %s", err.Error(), name))
return nil, nil, status.Error(codes.InvalidArgument, fmt.Sprintf("error getting metrics view %q: %s", name, err.Error()))

@@ -189,6 +189,7 @@ func (e *Executor) Query(ctx context.Context, qry *Query, executionTime *time.Ti
if err != nil {
return nil, false, err
}
fmt.Println(sql)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove

Comment on lines 77 to 78
func (b *sqlBuilder) writeSelect(n *SelectNode, level int) error {
b.out.WriteString(strings.Repeat("\t", level))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the pretty printing here? It's redundant extra length and allocations. If it's for debugging, there are many SQL formatters that can prettify a query (I use the adpyke.vscode-sql-formatter extension in VS Code)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meaning additional manual work in exchange for nano-optimisation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm more worried about the need for understanding, maintaining and debugging formatting code for no reason. There are plenty of decent SQL formatter libraries for Go if we have cases where we need to print approximately formatted SQL.

@egor-ryashin
Copy link
Contributor Author

CI tests part is moved here #5184

@egor-ryashin
Copy link
Contributor Author

It's untrivial at this point to merge it without CI tests, so i'm closing. I've checked the critical errors (order by measure, unnest) were fixed in main branch already. Other bugs I'll work on in another ticket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants