This repository has been archived by the owner on Nov 17, 2022. It is now read-only.

The script cannot handle table DOM #73

Open
giraffeAlpaca opened this issue Nov 2, 2020 · 0 comments
Labels
bug Something isn't working

Comments

@giraffeAlpaca

https://warsier.gitbooks.io/new_master_rule/content/3/32/322/3222.html
Traceback (most recent call last):
  File "gitbook.py", line 309, in <module>
    Gitbook2PDF(url).run()
  File "gitbook.py", line 195, in run
    loop.run_until_complete(self.crawl_main_content(content_urls))
  File "/home/ilab/.conda/envs/gitbook/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "gitbook.py", line 217, in crawl_main_content
    await asyncio.gather(*tasks)
  File "gitbook.py", line 238, in gettext
    text = ChapterParser(metatext, title, level, ).parser()
  File "gitbook.py", line 102, in parser
    return html.unescape(ET.tostring(context).decode())
  File "src/lxml/etree.pyx", line 3351, in lxml.etree.tostring
  File "src/lxml/serializer.pxi", line 139, in lxml.etree._tostring
  File "src/lxml/serializer.pxi", line 199, in lxml.etree._raiseSerialisationError
lxml.etree.SerialisationError: IO_ENCODER

The script seems to fail while serialising the chapter when the page contains a table.
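
One way to check whether the table is really the trigger is to serialise each element of the parsed page on its own and record which ones fail. The sketch below is only an illustration under stated assumptions: `find_failing_elements` is a hypothetical helper that does not exist in gitbook.py, the sample markup is made up, and the standard-library `xml.etree.ElementTree` stands in for lxml so the snippet runs without extra dependencies.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in; lxml.etree also offers iter() and tostring()


def find_failing_elements(root, serialise):
    """Return (tag, error) pairs for every element that `serialise` rejects.

    Hypothetical diagnostic helper; it is not part of gitbook.py.
    """
    failing = []
    for element in root.iter():  # document order, starting with the root
        try:
            serialise(element)
        except Exception as exc:  # with lxml this would be SerialisationError
            failing.append((element.tag, f"{type(exc).__name__}: {exc}"))
    return failing


# Made-up sample page containing a table; not taken from the linked gitbook.
sample = ET.fromstring(
    "<div><p>intro</p><table><tr><td>cell</td></tr></table></div>"
)

# An empty list means every element serialised without raising.
print(find_failing_elements(sample, ET.tostring))
```

To try this against the real failure, one would presumably pass `lxml.etree.tostring` as `serialise` together with the tree that the traceback calls `context`. Because serialising an element also serialises its descendants, the most deeply nested entry in the result is the smallest subtree that reproduces the error.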

@fuergaosi233 fuergaosi233 added the bug Something isn't working label Dec 13, 2020