importinto/lightning: check max row size when parsing csv to avoid OOM #58592
Codecov Report
All modified and coverable lines are covered by tests ✅

Coverage diff, master vs. #58592:

              master      #58592      +/-
  Coverage    73.5283%    74.3149%    +0.7865%
  Files       1680        1710        +30
  Lines       464577      478215      +13638
  Hits        341596      355385      +13789
  Misses      102128      101300      -828
  Partials    20853       21530       +677
@@ -398,7 +398,7 @@ func (parser *CSVParser) readUntil(chars *byteSet) ([]byte, byte, error) {
 	var buf []byte
 	for {
 		buf = append(buf, parser.buf...)
-		if len(buf) > LargestEntryLimit {
+		if parser.checkRowLen && parser.pos-parser.rowStartPos+int64(len(buf)) > int64(LargestEntryLimit) {
Is parser.checkRowLen always true when execution reaches this point?
Only the public APIs check it, i.e. ReadRow, ReadColumns, and ReadUntilTerminator.
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: GMHDBJD, lance6716
What problem does this PR solve?
Issue Number: close #58590
Problem Summary:
What changed and how does it work?
Check List
Tests
Manual test: imported a file with 20 fields, each about 10+ MB, for a total row size of about 266 MB.
Side effects
Documentation
Release note