Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for reading in JSON-based nodes format (to run pg_query_deparse) #178

Open
jgoux opened this issue Feb 7, 2023 · 7 comments

Comments

@jgoux
Copy link

jgoux commented Feb 7, 2023

Hello 👋,

Would it be possible to expose a pg_query_deparse which would directly work on the result of pg_query_parse (so skipping the protobuf layer)?

My goal would be to have the deparsing capability of libpg_query exposed to the JavaScript world through libpg-query-node.

I'm not a C programmer, so I'm not sure about the possibility/difficulty of the task. 😅

@lfittl
Copy link
Member

lfittl commented Feb 7, 2023

@jgoux Can you share more about the use case?

Generally the idea is that you'd parse, modify the parse tree, and then deparse - and protobuf is the way the parsetree gets passed back and forth.

@jgoux
Copy link
Author

jgoux commented Feb 7, 2023

@lfittl libpg-query-node seems to rely on pg_query_parse.

Reading the tests of libpg_query it doesn't seem possible to pass the result of pg_query_parse to pg_query_deparse_protobuf right?

So I'm wondering why is there a pg_query_parse + pg_query_parse_protobuf but not a pg_query_deparse + pg_query_deparse_protobuf?

Does it mean that libpg-query-node should rely on pg_query_parse_protobuf instead?

My use case is really just to be able to pass what I obtain from query.parseQuery(sql)/parseQuerySync from libpg-query-node (after modifying the parsetree) to a deparser function. 😄

@jgoux
Copy link
Author

jgoux commented Feb 7, 2023

I guess what I want is the equivalent of pg_query_parse for pg_query_deparse, working with json instead of protobuf to pass the parsetree:

tree_json = pg_query_nodes_to_json(parsetree_and_error.tree);

@lfittl
Copy link
Member

lfittl commented Feb 8, 2023

I guess what I want is the equivalent of pg_query_parse for pg_query_deparse, working with json instead of protobuf to pass the parsetree:

tree_json = pg_query_nodes_to_json(parsetree_and_error.tree);

Ah, thanks for clarifying!

There is currently no support for reading in the JSON format (to turn it into C structs again, which are used by the deparser) - that was part of the motivation for the Protobuf work, since its a bit more straightforward to read than JSON.

Its not impossible to add this, but its a lot more work than just exposing a function, as what would be needed is essentially pg_query_json_to_nodes, similar to the pg_query_protobuf_to_nodes function: https://github.com/pganalyze/libpg_query/blob/15-latest/src/pg_query_readfuncs_protobuf.c#L154

Note that matters are complicated a bit by the fact that we currently don't have a JSON parser bundled in the C library.

This may be something we can add in a future version, as Postgres 16 had major rework of how the out/read/copy funcs code works, and it may get easier to add extra formats with that work in place.

But for now this is not a priority on our end, though a PR would of course be welcome.

@lfittl lfittl changed the title Expose a pg_query_deparse function without protobuf Add support for reading in JSON-based nodes format (to run pg_query_deparse) Feb 8, 2023
@jgoux
Copy link
Author

jgoux commented Feb 8, 2023

Just so I'm sure I'm understanding how the protobuf work impacts the consumption of libpg_query.

Is it an implementation detail?

For example, if I would like to expose pg_query_parse_protobuf and pg_query_deparse_protobuf to the JavaScript/TypeScript world, would I have to deal with protobuf in my own code, or would it be possible to use it like:

import { parse, deparse } from "libpg-query";

const tree = parse("SELECT 1;");

// tree here would be a normal JS object
console.log(tree.stmts[0].stmt.SelectStmt);

// we pass the normal JS object
const query = deparse(tree);

// output "select 1"
console.log(query)

Or is the JSON-based nodes format necessary to have such an API from the JS/TS world?

@lfittl
Copy link
Member

lfittl commented Feb 8, 2023

You would have to have a step that turns the Protobuf into Javascript objects (e.g. using a library like protobufjs), and then does the same in reverse before passing things back to libpg_query.

For reference, the protobuf definition can be found here: https://github.com/pganalyze/libpg_query/blob/15-latest/protobuf/pg_query.proto#L17

And you can e.g. look at the Ruby library for how it does that: parse, deparse (note how the Ruby object "tree" gets passed to the auto-generated Protobuf message PgQuery::ParseResult which can decode/encode the Ruby tree into a Protobuf message)

@jgoux
Copy link
Author

jgoux commented Feb 8, 2023

Excellent, exactly what I was looking for, thanks for the pointer @lfittl! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants