Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deparser incorrectly deparses DEFAULT expression in CREATE TABLE query #188

Open
mpokryva opened this issue Apr 25, 2023 · 5 comments
Open

Comments

@mpokryva
Copy link

mpokryva commented Apr 25, 2023

The following query (taken from the Postgres regression tests) unfortunately produces an invalid query when ran through the deparser:
CREATE TABLE error_tbl (b1 bool DEFAULT (1 IN (1, 2))) produces CREATE TABLE error_tbl (b1 bool DEFAULT 1 IN (1, 2)) (note the missing parentheses).

When I run the above in psql, I get ERROR: syntax error at or near "IN".

Let me know what I can do to help to fix the issue! This library has been super helpful, really appreciate all the work y'all have put in.

@lfittl
Copy link
Member

lfittl commented Apr 25, 2023

@mpokryva Thanks for the report!

It looks like we've omitted constraints.sql from the list of filenames for the regression tests used by the deparser, thus this wasn't noticed before.

I think that may have happened because of the intentional parsing error that gets triggered in the regression tests (essentially to cover the case you're describing), which we didn't used to handle correctly, but we now should be able to ignore.

To fix, I think we should do two things:

  1. Add constraints.sql to the list of regression tests the deparser tests against
  2. Figure out a way to add the parenthesis, without always adding them

On the second point: We could of course just explicitly add parenthesis here around the deparseExpr call here: https://github.com/pganalyze/libpg_query/blob/15-latest/src/pg_query_deparse.c#L4612 -- but that would then lead to odd statements like "DEFAULT (1)" instead of "DEFAULT 1".

I think what we may need to do is pass an extra flag to deparseExpr (e.g. isBExpr) that then sets the context for deparseAExpr to DEPARSE_NODE_CONTEXT_A_EXPR, thus adding the outer parens (see https://github.com/pganalyze/libpg_query/blob/15-latest/src/pg_query_deparse.c#L2888).

Do you want to try making a PR for this?

lelit added a commit to lelit/pglast that referenced this issue Apr 26, 2023
@mpokryva
Copy link
Author

I'll give it a shot within the next week 🤞. In crunch time right now:)

@rjuju
Copy link
Contributor

rjuju commented Sep 21, 2024

Hi,

as far as I can see this was fixed in b234708. However this patch was never backported, so the pg15 branch doesn't correctly handle NULLS NOT DISTINCT, and pg14 and pg15 branches don't correctly handle COMPRESSION. Any chance of backporting the fixes?

@lfittl
Copy link
Member

lfittl commented Sep 21, 2024

as far as I can see this was fixed in b234708. However this patch was never backported, so the pg15 branch doesn't correctly handle NULLS NOT DISTINCT, and pg14 and pg15 branches don't correctly handle COMPRESSION. Any chance of backporting the fixes?

Yep, I think you're right that this should be taken care of now for the 16 branch.

For backporting, its possible, but definitely a bit of work since we had a few changes on 16 beyond that commit (i.e. not sure how many conflicts there will be).

I'm actually wondering if this would be easier if we maintained the deparser in a version-independent way (i.e. with version if-defs), since we're about to branch for 17.

What's the use case where you need older versions of the deparser - are you using this code in a context within a running Postgres server (i.e. and as such don't control the base Postgres version), with a raw parse tree?

@rjuju
Copy link
Contributor

rjuju commented Sep 22, 2024

I'm actually wondering if this would be easier if we maintained the deparser in a version-independent way (i.e. with version if-defs), since we're about to branch for 17.

in my experience that simplifies the most common workflows (definitively easier to send a single PR, forces you to think about backward compatibility and so on), but can make the code harder to read. I don't think that there's really a lot of compatibility issues in that area though (on top of my head I only remember about the partition method representation that changed at some point), so it shouldn't be much of a problem. In any case I'm fine with both approaches.

What's the use case where you need older versions of the deparser - are you using this code in a context within a running Postgres server (i.e. and as such don't control the base Postgres version), with a raw parse tree?

yes exactly. I'm working on an extension to distribute DDL on multiple instances, and one requirement for that is to have fully deterministic query strings. It's definitely easier to e.g. qualify objects using the parsetree and then generate a new sql query from that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants