Add schema visitor. #2548

Merged

Conversation

liurenjie1024
Collaborator

This is the second part of introducing kudo. For a more complete PR, please refer to #2532.

This PR introduces a post-order visitor for schemas and host column vectors.
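
For readers unfamiliar with the pattern, here is a minimal sketch of a post-order traversal with a list previsit. It is not the code from this PR: `Node`, `SketchVisitor`, and `PostOrder` are hypothetical names used only for illustration, and the real visitor operates on schema and `HostColumnVectorCore` objects rather than this simplified tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified schema node; not the PR's actual types.
final class Node {
    enum Kind { PRIMITIVE, LIST, STRUCT }

    final String name;
    final Kind kind;
    final List<Node> children;

    Node(String name, Kind kind, Node... children) {
        this.name = name;
        this.kind = kind;
        this.children = List.of(children);
    }
}

// Hypothetical visitor: T is the per-node result, P is the list previsit result.
interface SketchVisitor<T, P> {
    T visitPrimitive(Node node);

    P preVisitList(Node list);

    T visitList(Node list, P preVisitResult, T childResult);

    T visitStruct(Node struct, List<T> childResults);
}

final class PostOrder {
    // Children are visited before their parent; a list gets a previsit first.
    static <T, P> T visit(Node node, SketchVisitor<T, P> visitor) {
        switch (node.kind) {
            case PRIMITIVE:
                return visitor.visitPrimitive(node);
            case LIST: {
                P pre = visitor.preVisitList(node);
                T child = visit(node.children.get(0), visitor);
                return visitor.visitList(node, pre, child);
            }
            default: {
                // STRUCT: collect the results of all children, then visit the struct.
                List<T> results = new ArrayList<>();
                for (Node child : node.children) {
                    results.add(visit(child, visitor));
                }
                return visitor.visitStruct(node, results);
            }
        }
    }
}
```

In this sketch a struct only sees its children's results, while a list additionally receives whatever its previsit returned.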

Signed-off-by: liurenjie1024 <[email protected]>
@liurenjie1024
Collaborator Author

build

Signed-off-by: liurenjie1024 <[email protected]>
@liurenjie1024
Collaborator Author

build

jlowe
jlowe previously approved these changes Oct 30, 2024
Member

@jlowe jlowe left a comment

Overall lgtm. Minor documentation comment and a question on asymmetric previsits.

* <li> Visit primitive field b1</li>
* <li> Visit list field B with results from b1 and previsit result. </li>
* <li> Visit primitive field c1</li>
* <li> Visit with results from columns A, B, and C</li>
Member

This just says "Visit with ...", which raises the question: visit what? Is this visitTopSchema?

Collaborator Author

Fixed.

T visitStruct(HostColumnVectorCore col, List<T> children);

/**
* Visit a list column before actually visiting its child.
Member

It's a bit odd and asymmetric that lists get a previsit before children but structs do not. Curious why this is needed?

Collaborator Author

@liurenjie1024 liurenjie1024 Oct 31, 2024

Because before visiting a list's child, we first need to know the child's offset and length, and those come from the list's offset buffer. For example, for list<int>, we may want to serialize rows [3, 9) of the list column, but the corresponding offset and length of its child are unknown until we actually read the list's offset buffer.
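
To make the dependency concrete, here is a minimal sketch of how a child range could be derived from a list's offsets. It assumes an Arrow-style layout in which the offsets array has one more entry than the list has rows, and it uses a plain `int[]` in place of a real offset buffer. `ListSlice` and `childRange` are hypothetical names, not part of this PR.

```java
final class ListSlice {
    // Returns {childOffset, childLength} for list rows [rowStart, rowEnd).
    // Assumes offsets.length == numRows + 1 (an assumption of this sketch).
    static int[] childRange(int[] offsets, int rowStart, int rowEnd) {
        int childStart = offsets[rowStart]; // first child element of row rowStart
        int childEnd = offsets[rowEnd];     // one past the last child element of row rowEnd - 1
        return new int[] { childStart, childEnd - childStart };
    }
}
```

With hypothetical offsets `{0, 2, 2, 5, 6, 10, 11, 11, 14, 15, 20}`, rows [3, 9) map to child offset 5 and child length 10. The child range cannot be known from the row range alone, which is why the list needs a step before its child is visited.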

@liurenjie1024
Collaborator Author

build

Collaborator

@GaryShen2008 GaryShen2008 left a comment

Approved the comment fix made after Jason's approval.

@liurenjie1024 liurenjie1024 merged commit 7ee7b1c into NVIDIA:branch-24.12 Oct 31, 2024
3 checks passed
@liurenjie1024 liurenjie1024 deleted the ray/kudo-schema-utils branch November 1, 2024 02:48