simple Spider

     _              _       ___       _    _
 ___<_>._ _ _  ___ | | ___ / __> ___ <_> _| | ___  _ _
<_-<| || ' ' || . \| |/ ._>\__ \| . \| |/ . |/ ._>| '_>
/__/|_||_|_|_||  _/|_|\___.<___/|  _/|_|\___|\___.|_|
              |_|               |_|

中文

Overview

A simple web crawling framework. Document

Getting Started

pip install sspider

Write a project.py to suit your needs:

   >>> from sspider import Spider, Request
   >>> # Create the request object
   >>> request = Request('get', 'https://movie.douban.com/subject/27202819/reviews')
   >>> # Create the spider object
   >>> spider = Spider()
   >>> # Run the spider
   >>> spider.run(request)
   ...
   >>> # Save the crawl results
   >>> spider.write('test.txt')

python project.py

Press Ctrl-C to stop the spider.

Referenced Document

Referenced Libraries

Uses requests as the htmlDownloader
Uses lxml as the default htmlParser
Uses csv to export results as CSV files
Uses xlwt to export results as Excel files
Uses xlsxwriter to export results as Excel (.xlsx) files
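To give a feel for what the CSV export step involves, here is a minimal sketch using only the standard-library csv module. It is not sspider's own implementation: the `rows_to_csv` helper, the field names, and the sample rows are all hypothetical and exist only for illustration.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row.

    Hypothetical helper for illustration; not part of sspider.
    """
    buffer = io.StringIO()
    # lineterminator="\n" keeps the output identical across platforms
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up crawl results standing in for scraped review data
reviews = [
    {"title": "Great film", "rating": "5"},
    {"title": "Too long, but fine", "rating": "3"},
]

csv_text = rows_to_csv(reviews, ["title", "rating"])
print(csv_text)
```

The csv module quotes the second title automatically because it contains a comma, which is the main reason to prefer it over joining fields by hand.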

Project structure

License

This project is published as open source under a license agreement. Please keep any modified version open source and credit the original author. Thank you for your respect.

If you want to use this project for commercial purposes, please contact me (@pengr) separately to obtain commercial authorization.
