parsers: lines: support multiple occurrence of blocks to parse #423
rmilecki commented Oct 16, 2022
Thanks for this contribution. One suggestion, though: it would be nice to have some kind of feedback in the logger when there is no match on the line start or line end regex. (This was also unavailable before this PR.)
@rmilecki Can you fix the conflicts?
402c23a to e366025
@bosd: rebased & added warnings for parsing problems
e366025 to 64783bd
@bosd: I see you rebased this again. My first thought was that you want to merge or approve this. Can I ask if you had a chance to review those changes?
@rmilecki I'm about to give my approval on this one.
LGTM
So far the lines parser looked for only one block, defined by the "start" and "end" regexes. Some invoices have lines of the same set spread over multiple blocks, which can be separated by arbitrary content or by a page footer and header. To support such cases, use "start" and "end" to find as many blocks to parse as possible.

This is (hopefully) cleanly implemented by:
1. Renaming parse() to parse_block() and making it work with a single block (already extracted from the invoice content)
2. Making the new parse() find blocks one by one

This feature has been requested as a way of dealing with some multi-page invoices.

Signed-off-by: Rafał Miłecki <[email protected]>
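For readers skimming the thread, here is a minimal sketch of the approach the commit message describes. It is not the code from this PR: only the names parse() and parse_block() come from the commit message, while find_blocks(), the function signatures, and the sample regexes are assumptions made for illustration. The warnings loosely mirror the logger feedback requested earlier in the conversation.

```python
import logging
import re

logger = logging.getLogger(__name__)


def find_blocks(content, start, end):
    """Hypothetical helper: return the text between every start/end pair."""
    start_re = re.compile(start)
    end_re = re.compile(end)
    blocks = []
    pos = 0
    while pos < len(content):
        start_match = start_re.search(content, pos)
        if start_match is None:
            break
        end_match = end_re.search(content, start_match.end())
        if end_match is None:
            logger.warning("No match for end pattern %r", end)
            break
        blocks.append(content[start_match.end():end_match.start()])
        # Always move forward, even if a pattern matched an empty string.
        pos = max(end_match.end(), pos + 1)
    if not blocks:
        logger.warning("No block found between %r and %r", start, end)
    return blocks


def parse_block(block, line):
    """Parse one already-extracted block, one line at a time."""
    line_re = re.compile(line)
    rows = []
    for text in block.splitlines():
        match = line_re.search(text)
        if match:
            rows.append(match.groupdict())
    return rows


def parse(content, start, end, line):
    """Find blocks one by one and collect the rows parsed from each."""
    rows = []
    for block in find_blocks(content, start, end):
        rows.extend(parse_block(block, line))
    return rows
```

With a two-page invoice whose item table is interrupted by a footer and a repeated header, parse() in this sketch would return the rows from both tables in document order, skipping whatever sits between an "end" match and the next "start" match.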
64783bd to ce652c5
Please do! Waiting for it!