Scrapy is an open-source, collaborative framework for extracting the data you need from web pages.
To install Scrapy:
Open a command prompt as administrator.
Type the following at the command line:
pip install scrapy
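To confirm that the installation worked, one option is to ask Python whether it can locate the package. The following is a minimal sketch, not part of Scrapy itself; the helper name is_installed is hypothetical and chosen only for illustration. It uses the standard library and does not import Scrapy, so it also runs when the installation failed.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the current interpreter can locate the named top-level package.

    Hypothetical helper for illustration; it only looks the package up
    and does not import or run it.
    """
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed("scrapy"):
        print("Scrapy is available to this Python interpreter.")
    else:
        print("Scrapy was not found; try running: pip install scrapy")
```

If the check reports that Scrapy was not found even though pip finished without errors, pip may have installed it for a different Python interpreter than the one running the script.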
Create a project
In the command prompt, navigate to the directory where you want your web crawler project to be stored. Then enter:
scrapy startproject tutorial
This will create a tutorial directory with the following elements:
tutorial/
    scrapy.cfg            # deploy configuration file
    tutorial/             # project's Python module, you'll import your code from here
        __init__.py
        items.py          # project items definition file
        middlewares.py    # project middlewares file
        pipelines.py      # project pipelines file
        settings.py       # project settings file
        spiders/          # a directory where you'll later put your spiders
            __init__.py
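As a quick sanity check, the layout listed above can be compared against what is actually on disk. The sketch below assumes the project was created with the name tutorial in the current directory; the names EXPECTED_FILES and missing_project_files are hypothetical and used only for illustration. It relies solely on the standard library.

```python
from pathlib import Path

# Files expected after `scrapy startproject tutorial`, relative to the
# outer tutorial/ directory. The list is copied from the layout above.
EXPECTED_FILES = [
    "scrapy.cfg",
    "tutorial/__init__.py",
    "tutorial/items.py",
    "tutorial/middlewares.py",
    "tutorial/pipelines.py",
    "tutorial/settings.py",
    "tutorial/spiders/__init__.py",
]


def missing_project_files(project_root):
    """Return the expected files that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumption: the script is run from the directory that contains tutorial/.
    missing = missing_project_files("tutorial")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected project files are present.")
```

An empty result means every file from the listing was found. Any names that are reported may point to a different Scrapy version or to running the check from the wrong directory.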