Add `Result#columns` to store field types #124

ipc103 · 2023-10-03T20:49:58Z

Closes #99

This PR adds a new `Result#columns` method which returns an array of `Result::Column` objects representing column information such as name, length, and type. At some point this could replace the `fields` method, but for now it is added as a non-breaking change.

We added a new `Columns` class as well. When instantiating a new `Result` in the C extension, we assign a new instance of our `Columns` class, which carries a pointer to a struct. After populating the properties of the result, we cache the `column_info` in that struct on the `Columns` value and fill in the information (including converting some integers into symbol values). We can then lazily load the data out of the struct and populate the Ruby array when `#columns` is called.

Full disclosure: I've never written a C extension (or any C for that matter) before. Any and all feedback welcome - I added a few inline comments in the PR.

CC @composerinteralia @ywenc

ipc103 · 2023-10-03T20:50:42Z

contrib/ruby/ext/trilogy-ruby/cext.c

+#undef XX
+};
+
+static char *downcase(const char *str)


This is to convert the charset/type name into a lowercase symbol. I'm sure there is a better way to do this.
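One possible alternative to downcasing at runtime, sketched here under assumptions: since the name tables are already generated with an `XX` macro, the lowercase spelling could be carried as an extra macro argument so the table holds the final strings directly. Everything named `example_*` or `EXAMPLE_*` below is hypothetical and only illustrates the pattern; the real type and charset lists live in the library and may be shaped differently.

```c
#include <stddef.h>

/* Hypothetical subset of column types: (enum name, numeric code, lowercase name). */
#define EXAMPLE_TYPES(XX)                 \
    XX(EXAMPLE_TYPE_LONG, 0, "long")      \
    XX(EXAMPLE_TYPE_VARCHAR, 1, "varchar") \
    XX(EXAMPLE_TYPE_BLOB, 2, "blob")

/* First expansion: the enum constants. */
enum example_type {
#define XX(name, code, lower) name = code,
    EXAMPLE_TYPES(XX)
#undef XX
};

/* Second expansion: a lookup table indexed by code, already lowercase. */
static const char *const example_type_names[] = {
#define XX(name, code, lower) [code] = lower,
    EXAMPLE_TYPES(XX)
#undef XX
};

/* Returns the lowercase name for a code, or NULL if the code is unknown. */
static const char *example_type_name(unsigned code)
{
    size_t count = sizeof(example_type_names) / sizeof(example_type_names[0]);
    if (code >= count || example_type_names[code] == NULL) {
        return NULL;
    }
    return example_type_names[code];
}
```

With a table like this the strings are static, so no copy or `downcase` call would be needed before interning them.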

ipc103 · 2023-10-03T20:51:41Z

contrib/ruby/ext/trilogy-ruby/cext.c

+            Trilogy_Result_Column, rb_intern("new"), 6, trilogy_result_columns_ctx->column_info[i].name,
+            rb_id2sym(rb_intern(downcase(trilogy_type_names[trilogy_result_columns_ctx->column_info[i].type]))),
+            rb_int_new(trilogy_result_columns_ctx->column_info[i].len), rb_int_new(trilogy_result_columns_ctx->column_info[i].flags),
+            rb_id2sym(rb_intern(downcase(trilogy_charset_names[trilogy_result_columns_ctx->column_info[i].charset]))),


Wrapping rb_id2sym(rb_intern(...)) seemed to be a way to get the actual string-like symbol value i.e. :long. There is probably a better way to do this but I couldn't find one 🤷

dbussink · 2023-10-26

Does this leak memory? Afaik rb_intern doesn't take ownership of the allocated memory but ends up copying it?

ipc103 · 2023-11-01

Thanks - given the fact that this already felt sub-optimal, I wonder if it makes more sense to just set an int here and then cast to the symbol on the Ruby class.
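If the lowercase conversion stays in C, one way to sidestep the ownership question is to write into a caller-provided buffer instead of returning freshly allocated memory. The following is a minimal sketch, not the code in this PR; `downcase_into` is a hypothetical name.

```c
#include <ctype.h>
#include <stddef.h>

/* Copies src into dst in lowercase, truncating if dst is too small.
 * dst is always NUL-terminated when dst_size > 0.
 * Returns the number of characters written, excluding the terminator. */
static size_t downcase_into(char *dst, size_t dst_size, const char *src)
{
    size_t i = 0;

    if (dst_size == 0) {
        return 0;
    }
    for (; src[i] != '\0' && i + 1 < dst_size; i++) {
        /* Cast through unsigned char: tolower() is undefined for negative values. */
        dst[i] = (char)tolower((unsigned char)src[i]);
    }
    dst[i] = '\0';
    return i;
}
```

A caller could declare a small stack buffer, fill it with `downcase_into`, and pass it to `rb_intern`. Assuming, as suggested above, that `rb_intern` does not keep the pointer it is given, there would be nothing to free afterwards.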

composerinteralia · 2023-10-13T15:14:33Z

contrib/ruby/lib/trilogy/result.rb

+
+
+    class Columns
+      include Enumerable


Do we need this to be Enumerable? It doesn't look like we return a `Columns` object anywhere (it's an intermediate so we have a place to hold the raw column data); rather, we call `all` on it and return an Array.

ipc103 · 2023-10-13

Good call - you are correct that we don't need this. Removed and updated the previous commit.

(As an aside, not sure what the preference is for commit history here. Happy to squash these three down to a single commit if we prefer)

Towards trilogy-libraries#99, this exposes additional field data on the query result via a `Result#columns` method. We were already using this data in the C extension to determine how to cast the column values. We now add a pointer to the `column_info` struct onto the result and then lazily add the values to the `Result` instance when called. This method could eventually replace the `#fields` method which at the moment only includes the column name.

Co-authored-by: Daniel Colson <[email protected]>
Co-authored-by: Charlotte Wen <[email protected]>

We want to set some of these values as symbols instead of ints on the Column to make it easier to work with. In this commit, we created lookups for int to string using macros, then changed the type to a symbol via `rb_intern` + `rb_id2sym`. It's possible there's a single function to do this, but we couldn't find one. I also added a downcase function to make sure we get the values in lowercase instead of uppercase. I'm sure there is a better way to do this. I have no idea how to write C.

Co-authored-by: Daniel Colson <[email protected]>

Previously, we had added a pointer to our `column_info` struct onto our `Result` for lazy loading the data. This resulted in changing the type of `Result` to `T_DATA`, which could have negative performance implications. Instead, we can create a separate `Columns` class to store the pointer to our struct. A reference to `columns` now gets eagerly loaded onto the result, but we only populate the actual information for the columns when calling `Result#columns`. This seems like a good trade-off for lazy loading (vs. putting the pointer directly on `Result`). Having said that, it's probably worth asking if the lazy loading is worth this extra complexity at all, or if we should simply populate `@columns` with an already loaded array for the `Result` immediately.

byroot · 2023-11-28T20:09:03Z

contrib/ruby/ext/trilogy-ruby/cext.c

+struct trilogy_result_columns_ctx {
+    struct column_info *column_info;
+    uint64_t column_count;
+};


Suggested change

-struct trilogy_result_columns_ctx {
-    struct column_info *column_info;
-    uint64_t column_count;
-};
+struct trilogy_result_columns_ctx {
+    uint64_t column_count;
+    struct column_info column_info[];
+};

You could use a flexible array member to allocate the whole list at once and avoid pointer chasing and malloc churn.
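To make the suggestion concrete, here is a rough sketch of allocating such a struct in a single call. The `example_*` types and the allocator function are hypothetical stand-ins for illustration; the extension itself would presumably use its own `column_info` definition and Ruby's allocation helpers rather than plain `calloc`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the extension's per-column data. */
struct example_column_info {
    uint32_t type;
    uint32_t len;
};

struct example_columns_ctx {
    uint64_t column_count;
    struct example_column_info column_info[]; /* flexible array member, must be last */
};

/* Allocates the header and all column slots in one zeroed block.
 * Returns NULL on overflow or allocation failure; release with free(). */
static struct example_columns_ctx *example_columns_ctx_alloc(uint64_t column_count)
{
    size_t header = sizeof(struct example_columns_ctx);
    size_t slot = sizeof(struct example_column_info);
    struct example_columns_ctx *ctx;

    /* Reject counts whose total size would not fit in size_t. */
    if (column_count > (SIZE_MAX - header) / slot) {
        return NULL;
    }
    ctx = calloc(1, header + (size_t)column_count * slot);
    if (ctx == NULL) {
        return NULL;
    }
    ctx->column_count = column_count;
    return ctx;
}
```

Because the columns live in the same block as the count, a single `free` releases everything and reading `ctx->column_info[i]` needs no second pointer dereference.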

ipc103 commented Oct 3, 2023

composerinteralia reviewed Oct 13, 2023

ipc103 force-pushed the add-field-types branch from 401a44f to d27a2c6 Compare October 13, 2023 17:09

ipc103 and others added 3 commits October 23, 2023 11:10

ipc103 force-pushed the add-field-types branch from d27a2c6 to 1131f77 Compare October 23, 2023 15:10

byroot reviewed Nov 28, 2023

ipc103 commented Oct 3, 2023

ipc103 Oct 3, 2023

ipc103 Oct 3, 2023

dbussink Oct 26, 2023

ipc103 Nov 1, 2023

composerinteralia Oct 13, 2023

ipc103 Oct 13, 2023

byroot Nov 28, 2023
