-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support regex processing of DB role names in collector #194
Comments
Hi Joe, As Lukas mentioned offline, this makes sense in general. In some ways this is similar to #62 . And it would not require any server-side changes assuming that the "folding" happens consistently everywhere we reference roles/users in the collector. In terms of changes this would require, we (mostly) do not track object ownership, so the main places that would have to change are:
Does the "canonical" role (without the suffixes) actually exist in your system? If so (or if that's something you can create without too much trouble to accommodate this functionality), I think that makes things easier, since pg_stat_statements deals with actual user_ids. If we can fold these other roles to a real role, that avoids having to manage synthetic ids (and avoid collisions with real role ids). In terms of code changes, the pg_stat_statements case will either require the match to be pushed down into the query and a join with pg_roles (to map the pg_stat_statements userid to the canonical role), or to add a separate id mapping step after the main query if the role processing regexp is set. I think either approach can work. If we go for the join, we could use a left join and coalesce to the existing pg_stat_statements.userid to short-circuit the join if the regexp is not set. The log parsing case is probably simpler: the log parsing code in general is a little gnarly due to different prefix handling, but the user name could be massaged in the one place I linked for all of them, I think. I have not looked at the other role associations (the third bullet point), but I think this should be similar to what we would need to do for statements handling. Does this make sense? /cc @seanlinsley if you have any thoughts |
Hi @uhoh-itsmaciek, thanks for your response and additional info on where the changes would be needed.
That's a good question. As things stand, no - the default username template used by Vault for dynamic roles truncates the role to 8 chars in the generated role, although that should be pretty simple for us to change - we'll investigate. Do you think it makes sense to do an initial refactoring of the collector to introduce a type to represent the role, rather than just a string, so that operations on it (eg to sanitise it to the canonical role) can be centralised? I'm not too familiar with the collector code to know whether that would be too big a task and/or worth it. Thanks. |
Regarding the refactoring, my gut feeling is that if we go with the separate mapping step, that's probably easier as a shared helper function rather than a first-class type for roles. It could take a database connection and a list of role OIDs, and return the normalized OID. I'm certainly open to other options if you see a better approach. |
Hi @lfittl,
As per our call earlier this week (cc @robharrop), I'm opening an issue for your comment/approval on our proposed collector change, to allow processing/transformation of DB role names in the collector, before sending to the
pganalyze
server.Our specific use case is to handle Vault Dynamic Secrets, as the dynamically generated suffixes that Vault adds to the DB roles it creates cause
pganalyze
to treat statements that we would like to be grouped together as separate, so there is effectively no useful history for each statement. The proposed solution (using regex) supports our requirement, but I believe would also be generic enough to be useful for other contexts, even just literal string replacements.Detail
The DB role names generated by Vault Dynamic Secrets have two dynamically generated suffixes that we would like to trim and ignore.
For example, for role
v-aws-some-service-3tce39Yn1P2lV7RSQnMN-1628677987
, we want to ignore the two suffix groups, and just sendv-aws-some-service
topganalyze
.The proposed solution is to add some collector config options that accept a Sed-like search/replace pattern, eg.
It seems the built-in Go regex package supports the required regex syntax.
If you are OK in principle with this idea, we can look at fleshing out a bit more detail on this, eg a simple design (exactly what config options to provide and how), so please let us know. Also could you please confirm whether there's any server-side change that would be required to support this, or if it could be handled completely within the collector?
Thanks,
Joe.
The text was updated successfully, but these errors were encountered: