This repository has been archived by the owner on Nov 17, 2022. It is now read-only.
Python version: 3.6.5. Command used:
python gitbook.py https://wizardforcel.gitbooks.io/python-quant-uqer/content/
Based on the crawl log, I located the relevant code and made one optimization: I added a sleep.
async def gettext(self, index, url, level, title):
    '''
    return path's html
    '''
However, after crawling for a while, the disconnect error still occurs.
done : https://wizardforcel.gitbooks.io/python-quant-uqer/content/81.html
Traceback (most recent call last):
  File "gitbook.py", line 5, in <module>
    Gitbook2PDF(url).run()
  File "E:\code\pythonCode\thirdparty\gitbook2pdf-master\gitbook2pdf\gitbook2pdf.py", line 202, in run
    loop.run_until_complete(self.crawl_main_content(content_urls))
  File "d:\ProgramData\Anaconda3\envs\python36\lib\asyncio\base_events.py", line 468, in run_until_complete
    return future.result()
  File "E:\code\pythonCode\thirdparty\gitbook2pdf-master\gitbook2pdf\gitbook2pdf.py", line 224, in crawl_main_content
    await asyncio.gather(*tasks)
  File "E:\code\pythonCode\thirdparty\gitbook2pdf-master\gitbook2pdf\gitbook2pdf.py", line 246, in gettext
    metatext = await request(url, self.headers)
  File "E:\code\pythonCode\thirdparty\gitbook2pdf-master\gitbook2pdf\gitbook2pdf.py", line 21, in request
    async with session.get(url, headers=headers, timeout=timeout) as resp:
  File "d:\ProgramData\Anaconda3\envs\python36\lib\site-packages\aiohttp\client.py", line 1005, in __aenter__
    self._resp = await self._coro
  File "d:\ProgramData\Anaconda3\envs\python36\lib\site-packages\aiohttp\client.py", line 497, in _request
    await resp.start(conn)
  File "d:\ProgramData\Anaconda3\envs\python36\lib\site-packages\aiohttp\client_reqrep.py", line 844, in start
  File "d:\ProgramData\Anaconda3\envs\python36\lib\site-packages\aiohttp\streams.py", line 588, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: None
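
The traceback shows the error coming out of `asyncio.gather(*tasks)`, so a single disconnected request aborts the whole crawl. One possible workaround, sketched here and not taken from gitbook2pdf itself, is to cap the number of in-flight requests with a semaphore and retry a failed request after a short backoff. The names `fetch_with_retry`, `FakeDisconnect`, and `flaky_fetch`, the limit of 5, and the delays are all hypothetical. The sketch uses a stand-in exception so it runs without network access; in the real crawler the `retry_on` argument would presumably be `aiohttp.client_exceptions.ServerDisconnectedError` and `fetch` would wrap the existing `request` call.

```python
import asyncio
import random


async def fetch_with_retry(fetch, url, semaphore, retry_on, retries=3, base_delay=0.5):
    """Await fetch(url) under a concurrency limit, retrying on `retry_on`."""
    for attempt in range(retries + 1):
        try:
            async with semaphore:  # cap the number of requests in flight
                return await fetch(url)
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the original error
            # Exponential backoff plus a little jitter before the next try.
            await asyncio.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))


class FakeDisconnect(Exception):
    """Hypothetical stand-in for aiohttp's ServerDisconnectedError."""


async def demo():
    calls = {}

    async def flaky_fetch(url):
        # Simulated server: drops the first two requests for each URL.
        calls[url] = calls.get(url, 0) + 1
        if calls[url] < 3:
            raise FakeDisconnect()
        return "html for " + url

    semaphore = asyncio.Semaphore(5)  # assumed limit; tune for the target site
    urls = ["page-%d" % i for i in range(3)]
    tasks = [
        fetch_with_retry(flaky_fetch, u, semaphore, FakeDisconnect, base_delay=0.01)
        for u in urls
    ]
    results = await asyncio.gather(*tasks)
    return results, calls


if __name__ == "__main__":
    loop = asyncio.new_event_loop()  # works on Python 3.6 as well as newer versions
    try:
        results, calls = loop.run_until_complete(demo())
        print(results)
    finally:
        loop.close()
```

Whether this helps depends on why the server closes the connection. If the site is rate-limiting, a lower semaphore limit is likely to matter more than the retries.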