Doctoralia Web Crawler

A web crawler and scraper that extracts data from the Doctoralia site.

The information collected is:

How to Install and Run

After activating your Python virtual environment (venv), run the command below to install the dependencies:

pip install -r requirements.txt

chromedriver.exe - The web driver Selenium uses to control Chrome. The bundled executable is for Windows x64. If you would rather not run this .exe file, or you use another operating system, you can download the correct version at Selenium Chrome webdriver
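As a rough illustration of the driver note above, the sketch below picks a driver file name by operating system and passes it to Selenium. The helper names `chromedriver_path` and `start_chrome` are hypothetical and are not taken from DoctoraliaWebCrawler.py. The Selenium call assumes Selenium 4, where the path is wrapped in a `Service` object; Selenium 3 accepted `webdriver.Chrome(executable_path=...)` instead, so check the version pinned in requirements.txt.

```python
import platform
from pathlib import Path


def chromedriver_path(base_dir=".", system=None):
    """Return the expected driver location for the given OS name.

    Hypothetical helper: assumes the driver sits in base_dir.
    """
    system = system or platform.system()
    # Windows binaries carry the .exe suffix; Linux and macOS builds do not.
    filename = "chromedriver.exe" if system == "Windows" else "chromedriver"
    return Path(base_dir) / filename


def start_chrome(driver_path):
    """Start Chrome through Selenium 4 with an explicit driver path."""
    # Imported lazily so the path helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=str(driver_path)))
```

Calling `start_chrome(chromedriver_path())` would then open a browser session, provided the driver version matches the installed Chrome.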

python DoctoraliaWebCrawler.py

Pagination is limited to 100 pages, with a fixed 20 doctors per page.
A doctor can have multiple addresses. This project extracts only the first address and the telephone numbers for that address.
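The two limits above can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the project's actual code: the constant names, the function names, and the dictionary layout of a doctor record (`addresses`, `phones`) are all hypothetical. If both limits apply to a single run, the crawler would collect at most 100 × 20 = 2000 doctors.

```python
MAX_PAGES = 100        # page limit stated above
DOCTORS_PER_PAGE = 20  # fixed page size stated above


def max_doctors():
    """Upper bound on doctors collected in one run, given both limits."""
    return MAX_PAGES * DOCTORS_PER_PAGE


def pages_to_visit(requested_pages):
    """Clamp a requested page count to the limit and return the page numbers."""
    return range(1, min(requested_pages, MAX_PAGES) + 1)


def first_address(doctor):
    """Return only the first address entry (with its phones), or None.

    Assumes a hypothetical record shape: {"addresses": [{"phones": [...]}, ...]}.
    """
    addresses = doctor.get("addresses") or []
    return addresses[0] if addresses else None


# Made-up sample record, for illustration only.
sample_doctor = {
    "name": "Example Doctor",
    "addresses": [
        {"street": "First St", "phones": ["111", "222"]},
        {"street": "Second St", "phones": ["333"]},
    ],
}
```

With this sample, `first_address(sample_doctor)` keeps the "First St" entry and its two phone numbers, and the second address is ignored.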
