This repository has been archived by the owner on Sep 11, 2024. It is now read-only.

Commit

Merge pull request #34 from climatepolicyradar/fix/text-blocks-should-always-be-sequence

Fix bug leading to None sequences of text blocks
joel-wright authored Aug 23, 2023
2 parents 591171d + ae44fea commit 8c168ee
Showing 1 changed file with 2 additions and 3 deletions.
5 changes: 2 additions & 3 deletions src/cpr_data_access/parser_models.py
@@ -214,13 +214,12 @@ def check_html_pdf_metadata(cls, values):

         return values

-    def get_text_blocks(self, including_invalid_html=False):
+    def get_text_blocks(self, including_invalid_html=False) -> Sequence[TextBlock]:
         """A method for getting text blocks with the option to include invalid html."""
         if self.document_content_type == CONTENT_TYPE_HTML and self.html_data:
             if not including_invalid_html and not self.html_data.has_valid_text:
                 return []
-            else:
-                return self.text_blocks
+        return self.text_blocks

     @property
     def text_blocks(self) -> Sequence[TextBlock]:
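
The behaviour change can be reproduced in isolation. The sketch below is not the library's code: `Doc`, `HtmlData`, the `_before`/`_after` method names, and the constant values are stand-ins invented for illustration, and plain strings replace `TextBlock`. It assumes the old method reached the end of its body without a `return` whenever the outer `if` was false (for example, for a non-HTML document), so Python returned `None` implicitly.

```python
from dataclasses import dataclass, field
from typing import Optional, Sequence

# Assumed values for illustration; the real constants live in the library.
CONTENT_TYPE_HTML = "text/html"
CONTENT_TYPE_PDF = "application/pdf"


@dataclass
class HtmlData:
    """Hypothetical stand-in for the document's HTML metadata."""
    has_valid_text: bool


@dataclass
class Doc:
    """Hypothetical stand-in for the parser output model."""
    document_content_type: str
    text_blocks: Sequence[str] = field(default_factory=list)
    html_data: Optional[HtmlData] = None

    def get_text_blocks_before(self, including_invalid_html=False):
        # Old shape: the only returns are inside the HTML branch.
        if self.document_content_type == CONTENT_TYPE_HTML and self.html_data:
            if not including_invalid_html and not self.html_data.has_valid_text:
                return []
            else:
                return self.text_blocks
        # Falling off the end here returns None implicitly.

    def get_text_blocks_after(self, including_invalid_html=False) -> Sequence[str]:
        # New shape: the early return handles invalid HTML,
        # and every other path returns the text blocks.
        if self.document_content_type == CONTENT_TYPE_HTML and self.html_data:
            if not including_invalid_html and not self.html_data.has_valid_text:
                return []
        return self.text_blocks


pdf_doc = Doc(CONTENT_TYPE_PDF, text_blocks=["a", "b"])
bad_html = Doc(CONTENT_TYPE_HTML, text_blocks=["x"], html_data=HtmlData(False))

print(pdf_doc.get_text_blocks_before())  # None
print(pdf_doc.get_text_blocks_after())   # ['a', 'b']
print(bad_html.get_text_blocks_after())  # []
```

With the trailing `return` moved out of the `else`, callers can iterate over the result without a `None` check, which is what the added `Sequence[TextBlock]` annotation promises.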