Edmunds-Crawler :

A crawler for Edmunds.com (http://www.edmunds.com/)

Edmunds is a leading customer review and opinion website for automobiles. This crawler extracts customer reviews for selected car models.

Language: Java

Library: JSoup

Note:

  1. Keep the URL files in the urls folder: a single car model can have multiple URLs that need to be crawled.
     * Sample URL files for Nissan Altima and Honda_Civic are available.
  2. Create a Reviews folder: this is where crawled reviews are stored. Each review file has the same name as the corresponding file in the urls folder.
     * Sample review files will be available in this folder.
  3. If a proxy authentication issue comes up while running EdmundsCrawler, set the proxy in the main method of EdmundsCrawler.java.
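
The notes above can be sketched roughly as follows. This is an illustration, not the code in EdmundsCrawler.java: the class name `CrawlerLayoutSketch` and its methods are hypothetical, the URL files are assumed to hold one URL per line, and the proxy setup assumes the crawler's HTTP requests go through Java's standard networking stack, which honors the `http.proxyHost` family of system properties and the default `Authenticator`. Only the Java standard library is used, so the JSoup fetching step is left out.

```java
import java.io.IOException;
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper class; none of these names come from the repository.
final class CrawlerLayoutSketch {

    // Reads a URL file, assuming one URL per line; blank lines are skipped.
    static List<String> readUrls(Path urlFile) throws IOException {
        List<String> urls = new ArrayList<>();
        for (String line : Files.readAllLines(urlFile, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                urls.add(trimmed);
            }
        }
        return urls;
    }

    // Maps urls/<name> to Reviews/<name>, creating the Reviews folder if needed.
    static Path reviewFileFor(Path baseDir, Path urlFile) throws IOException {
        Path reviewsDir = baseDir.resolve("Reviews");
        Files.createDirectories(reviewsDir);
        return reviewsDir.resolve(urlFile.getFileName().toString());
    }

    // Sets JVM-wide proxy properties and registers credentials for the proxy.
    // Call this at the start of main, before any request is made.
    static void configureProxy(String host, int port, String user, String password) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", String.valueOf(port));
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", String.valueOf(port));
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                return new PasswordAuthentication(user, password.toCharArray());
            }
        });
    }

    public static void main(String[] args) throws IOException {
        Path baseDir = Paths.get(".");
        Path urlsDir = baseDir.resolve("urls");
        if (!Files.isDirectory(urlsDir)) {
            System.out.println("No urls folder found in " + baseDir.toAbsolutePath());
            return;
        }
        // List each URL file and show where its reviews would be written.
        try (Stream<Path> files = Files.list(urlsDir)) {
            for (Path urlFile : (Iterable<Path>) files::iterator) {
                List<String> urls = readUrls(urlFile);
                Path out = reviewFileFor(baseDir, urlFile);
                System.out.println(urlFile.getFileName() + ": " + urls.size()
                        + " URL(s) -> " + out);
            }
        }
    }
}
```

If a proxy is needed, a call such as `configureProxy("proxy.example.com", 8080, "user", "secret")` (placeholder values) would go at the top of the main method, before the first page is fetched.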