simple Spider

python 3.4+ · coverage 37% · build passing

     _              _       ___       _    _
 ___<_>._ _ _  ___ | | ___ / __> ___ <_> _| | ___  _ _
<_-<| || ' ' || . \| |/ ._>\__ \| . \| |/ . |/ ._>| '_>
/__/|_||_|_|_||  _/|_|\___.<___/|  _/|_|\___|\___.|_|
              |_|               |_|

English

Overview

A simple web crawler framework. Detailed documentation

Getting Started

pip install sspider

Create a project.py that suits your needs:

   >>> from sspider import Spider, Request
   >>> # Create a Request object
   >>> request = Request('get', 'https://movie.douban.com/subject/27202819/reviews')
   >>> # Create a Spider object
   >>> spider = Spider()
   >>> # Run the spider
   >>> spider.run(request)
   ...
   >>> # Save the crawl results
   >>> spider.write('test.txt')

python project.py

Press Ctrl-C to stop.

Related documentation

Dependencies

  • Uses requests as the HTML downloader
  • Uses lxml as the default HTML parser
  • Uses csv to export results as CSV files
  • Uses xlwt to export results as Excel (.xls) files
  • Uses xlsxwriter to export results as Excel (.xlsx) files
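
To make the CSV export role concrete, here is a minimal sketch using only the standard-library csv module. It is not sspider's actual implementation: the function name export_csv, the sample records, and the column names are hypothetical and chosen only for illustration. It assumes crawl results are available as a list of dicts.

```python
import csv
import io


def export_csv(rows, fileobj, fieldnames=None):
    """Write an iterable of dicts to an open text file object as CSV.

    Hypothetical helper for illustration; not part of sspider.
    Returns the list of column names that was written as the header.
    """
    rows = list(rows)
    if fieldnames is None:
        # Collect column names in first-seen order across all rows.
        fieldnames = []
        for row in rows:
            for key in row:
                if key not in fieldnames:
                    fieldnames.append(key)
    # restval fills in columns that a given row does not have.
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Made-up records standing in for parsed crawl results.
sample = [
    {"title": "First review", "rating": 4},
    {"title": "Second review", "rating": 5, "author": "someone"},
]
buffer = io.StringIO()
export_csv(sample, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with open(path, "w", newline="", encoding="utf-8"); the newline="" argument is what the csv module documentation recommends to avoid extra blank lines on Windows.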

Project structure

License

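This project is released as open source under the license agreement. If you modify it, please keep it open source and give additional credit to the original author. Thank you for your respect.

If you need to use this project for commercial purposes, please contact the author ( @pengr ) separately to obtain a commercial license.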