Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve data directory detection #454

Merged
merged 5 commits into from
Oct 4, 2023
Merged

Conversation

msakrejda
Copy link
Contributor

Currently, for gathering system stats on self-hosted systems, we look
up the current working directory of the postmaster process. This
usually works, but if customers have an unusual configuration, it can
find the wrong directory.

Instead, use the current method only as a fallback, and look at the
data_directory setting instead. This is readable with
pg_read_all_settings (part of pg_monitor), so it should be accessible
to us in most configurations.

@msakrejda msakrejda requested review from seanlinsley and a team September 13, 2023 01:31
Copy link
Contributor Author

@msakrejda msakrejda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not tested this yet. Opening for feedback on the approach first.

config/config.go Outdated Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
input/postgres/data_directory.go Outdated Show resolved Hide resolved
@msakrejda msakrejda force-pushed the improve-data-directory-detection branch from d0bab37 to ce59b50 Compare October 4, 2023 00:07
Currently, for gathering system stats on self-hosted systems, we look
up the current working directory of the postmaster process. This
usually works, but if customers have an unusual configuration, it can
find the wrong directory.

Instead, use the current method only as a fallback, and look at the
data_directory setting instead. This is readable with
pg_read_all_settings (part of pg_monitor), so it should be accessible
to us in most configurations.
@msakrejda msakrejda force-pushed the improve-data-directory-detection branch from ce59b50 to 06ba25a Compare October 4, 2023 03:53
@lfittl
Copy link
Member

lfittl commented Oct 4, 2023

I have not tested this yet. Opening for feedback on the approach first.

I'm curious, did you run an end to end test with this?

Wondering whether we should do a full test in a clean container/VM before we merge, or whether we feel its been sufficiently tested.

config/config.go Outdated Show resolved Hide resolved
input/postgres/data_directory.go Outdated Show resolved Hide resolved
main.go Outdated Show resolved Hide resolved
@msakrejda
Copy link
Contributor Author

@lfittl I've instrumented the code to verify that the correct setting is picked up, but I have not done an end-to-end test. I'm trying to figure out something that would do the wrong thing on main but the right thing here.

@msakrejda
Copy link
Contributor Author

Hmm, so I'm actually unable to reproduce the problem. If I start postgres on one filesystem partition and point it to a data directory on a different partition, /proc/<pid>/cwd actually points to the data directory. Tried this on both 15 and 10 with the same results. I think the change is safe, but I can't think of how to test it without being able to reproduce the problem. Maybe we should wait on this patch until we can reproduce the original problem.

@lfittl
Copy link
Member

lfittl commented Oct 4, 2023

Hmm, so I'm actually unable to reproduce the problem. If I start postgres on one filesystem partition and point it to a data directory on a different partition, /proc/<pid>/cwd actually points to the data directory. Tried this on both 15 and 10 with the same results. I think the change is safe, but I can't think of how to test it without being able to reproduce the problem. Maybe we should wait on this patch until we can reproduce the original problem.

I wonder if one way to reproduce it could be to have multiple Postgres installations?

i.e. if the Postgres PID search in https://github.com/pganalyze/collector/blob/main/helper/main.go#L22 gets the wrong process, it would lead to the detected data directory being incorrect.

@msakrejda
Copy link
Contributor Author

Oh good point. I'll try that.

@msakrejda
Copy link
Contributor Author

Okay, confirmed that situation can cause bogus reporting on main but is fixed with this branch. Another case that this should improve is if Postgres is running as a user other than "postgres" (probably not common, but maybe there are some obscure configurations or packaging setups out there that use "postgresql" or something else).

Copy link
Member

@lfittl lfittl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Some small remaining comments, but looks ready to merge overall.

And thanks for doing the extra testing!

main.go Show resolved Hide resolved
config/read.go Outdated Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
@msakrejda msakrejda merged commit 72998ea into main Oct 4, 2023
3 checks passed
@msakrejda msakrejda deleted the improve-data-directory-detection branch October 4, 2023 21:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants