Commit

Sort the PDFMiner text objects along the x axis before applying the grouping algorithm, to avoid missing columns
ollynowell authored and bosd committed Aug 13, 2024
1 parent 3068cac commit 2effecd
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions camelot/parsers/stream.py
@@ -129,6 +129,7 @@ def _group_rows(text, row_tol=2):
         rows = []
         temp = []

+        text.sort(key=lambda x: (-x.y0, x.x0))
         for t in text:
             # is checking for upright necessary?
             # if t.get_text().strip() and all([obj.upright for obj in t._objs if
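
A minimal sketch of why the pre-sort can matter, assuming the grouping loop starts a new row whenever `y0` moves by more than `row_tol`. `Box` and `group_rows` are hypothetical stand-ins written for illustration, not camelot's actual code; only the sort key is taken from the diff above.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in for a PDFMiner text object (x0, y0, text only)."""
    x0: float
    y0: float
    text: str


def group_rows(boxes, row_tol=2, presort=True):
    """Simplified row grouping: open a new row when y0 moves by more than row_tol."""
    boxes = list(boxes)  # copy so the caller's list is not mutated
    if presort:
        # Same key as the line added in this commit:
        # top-to-bottom (descending y0), then left-to-right (ascending x0).
        boxes.sort(key=lambda o: (-o.y0, o.x0))
    rows, temp, row_y = [], [], None
    for b in boxes:
        if not b.text.strip():
            continue  # ignore whitespace-only objects
        if row_y is None or not math.isclose(row_y, b.y0, abs_tol=row_tol):
            if temp:
                rows.append(sorted(temp, key=lambda o: o.x0))
            temp = []
            row_y = b.y0
        temp.append(b)
    if temp:
        rows.append(sorted(temp, key=lambda o: o.x0))
    return [[o.text for o in row] for row in rows]


# Objects arrive column by column, as can happen with multi-column layouts.
boxes = [
    Box(10, 100, "a1"),
    Box(10, 80, "a2"),
    Box(50, 100, "b1"),
    Box(50, 80, "b2"),
]

unsorted_rows = group_rows(boxes, presort=False)
sorted_rows = group_rows(boxes, presort=True)
print(unsorted_rows)  # four single-cell rows
print(sorted_rows)    # two rows, each with two cells
```

Without the sort, every change in `y0` between consecutive objects opens a new row, so objects on the same line that arrive out of order end up in separate rows and the second column is lost. With the sort, objects sharing a `y0` are adjacent before the loop runs.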
