[enhance]: use byte array for packed kv meta to reduce meta size #146
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: shaoting-huang
Please change the title of the PR to a more readable one. Preferred format: [enhance/fix/feature]: [action: change/add/fix ...] to [result: improve/reduce/accelerate ...]
kv_metadata_->Append("row_group_sizes", value.substr(0, value.length() - 1));
std::vector<uint8_t> byteArray(row_group_sizes_.size() * sizeof(size_t));
std::memcpy(byteArray.data(), row_group_sizes_.data(), byteArray.size());
kv_metadata_->Append("row_group_sizes", std::string(byteArray.begin(), byteArray.end()));
nit: make "row_group_sizes" a global constant.
cpp/src/packed/reader.cpp
Outdated
sizes.push_back(std::stoll(token));
}
return sizes;
auto parse_size = [&](const std::string& input) -> std::vector<size_t> {
nit: the code that serializes and deserializes the row group sizes could be put together in a utility file.
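One way to act on this nit is sketched below. It is only an illustration, not code from the PR: the names `kRowGroupSizesKey`, `SerializeRowGroupSizes`, and `DeserializeRowGroupSizes` are hypothetical. The sketch also stores each size as a fixed-width `uint64_t` rather than a raw `size_t`, which is an assumption added here so that the byte layout does not depend on `sizeof(size_t)`. Bytes are still written in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical shared constant for the metadata key (not taken from the PR).
constexpr char kRowGroupSizesKey[] = "row_group_sizes";

// Packs the sizes into a byte string, 8 bytes per entry, in host byte order.
inline std::string SerializeRowGroupSizes(const std::vector<size_t>& sizes) {
  std::string out(sizes.size() * sizeof(uint64_t), '\0');
  for (size_t i = 0; i < sizes.size(); ++i) {
    const uint64_t value = static_cast<uint64_t>(sizes[i]);
    std::memcpy(&out[i * sizeof(uint64_t)], &value, sizeof(uint64_t));
  }
  return out;
}

// Inverse of SerializeRowGroupSizes. Trailing bytes that do not form a
// complete 8-byte entry are ignored.
inline std::vector<size_t> DeserializeRowGroupSizes(const std::string& input) {
  std::vector<size_t> sizes(input.size() / sizeof(uint64_t));
  for (size_t i = 0; i < sizes.size(); ++i) {
    uint64_t value = 0;
    std::memcpy(&value, input.data() + i * sizeof(uint64_t), sizeof(uint64_t));
    sizes[i] = static_cast<size_t>(value);
  }
  return sizes;
}
```

With helpers like these, the writer and the reader would both go through one place, for example `kv_metadata_->Append(kRowGroupSizesKey, SerializeRowGroupSizes(row_group_sizes_))` on the write side.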
cpp/src/packed/reader.cpp
Outdated
std::vector<uint8_t> byteArray = std::vector<uint8_t>(input.begin(), input.end());
std::vector<size_t> vec(byteArray.size() / sizeof(size_t));
std::memcpy(vec.data(), byteArray.data(), byteArray.size());
return vec;
};
for (int i = 0; i < file_readers_.size(); ++i) {
row_group_sizes_.push_back(
->value(0): don't do that, the key_value_metadata field may have multiple entries.
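A minimal sketch of the alternative, looking the entry up by key instead of by position. To stay compilable without Arrow, it uses a plain vector of pairs as a stand-in for the metadata; `KeyValuePairs` and `FindValue` are hypothetical names. In the real code, `arrow::KeyValueMetadata` should offer key-based lookup (`Get(key)` and `FindKey(key)`, as far as I know), which would play the same role.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for key-value metadata: an ordered list of (key, value) entries.
using KeyValuePairs = std::vector<std::pair<std::string, std::string>>;

// Returns the value of the first entry whose key matches, or std::nullopt
// when the key is absent. It does not assume the entry is at index 0.
inline std::optional<std::string> FindValue(const KeyValuePairs& metadata,
                                            const std::string& key) {
  for (const auto& [entry_key, entry_value] : metadata) {
    if (entry_key == key) {
      return entry_value;
    }
  }
  return std::nullopt;
}
```

The caller can then handle the missing-key case explicitly rather than silently parsing whatever happens to be the first entry.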
Signed-off-by: shaoting-huang <[email protected]>
b6526dc to 65c1505
/lgtm
issue: #127