Web Scraping

Overview: Scrape questions, answers, comments, and metadata from StackOverflow, specifically questions about R, starting at the first page of the search results.

Start at the first page of the search results on StackOverflow.
Get the links to the questions on this page.
Get the next page of results and the links to the questions on that page.
Process the questions on the first 3 pages and the last page of the search results, fetching 50 questions per page.
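
The steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the approach taken in `web-scraping.Rmd`. The base URL, the `/questions/tagged/r` path, the `tab`, `page`, and `pagesize` query parameters, and the `s-link` class used to spot question links are all assumptions that should be checked against the live page source.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

BASE_URL = "https://stackoverflow.com"  # assumption: site root
SEARCH_PATH = "/questions/tagged/r"     # assumption: listing of R questions


def page_url(page, page_size=50):
    """Build the URL of one results page (parameter names are assumed)."""
    query = urlencode({"tab": "newest", "page": page, "pagesize": page_size})
    return f"{BASE_URL}{SEARCH_PATH}?{query}"


def pages_to_process(last_page, first_n=3):
    """Return the first `first_n` page numbers plus the last page, no duplicates."""
    if last_page < 1:
        return []
    pages = list(range(1, min(first_n, last_page) + 1))
    if last_page not in pages:
        pages.append(last_page)
    return pages


class QuestionLinkParser(HTMLParser):
    """Collect hrefs of <a> tags that carry an assumed marker class."""

    def __init__(self, link_class="s-link"):  # hypothetical class name
        super().__init__()
        self.link_class = link_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        href = attr.get("href") or ""
        # Keep only links that look like question pages.
        if self.link_class in classes and href.startswith("/questions/"):
            self.links.append(urljoin(BASE_URL, href))


def question_links(html):
    """Return absolute question URLs found in one results page."""
    parser = QuestionLinkParser()
    parser.feed(html)
    return parser.links
```

The page HTML would be fetched separately (for example with `urllib.request`), and the number of the last page would be read from the pagination controls on the first page before calling `pages_to_process`.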

Task

The goal is to explore the source of the HTML pages for the search results to find HTML structures identifying the elements of interest described below:

the number of views of the question
the number of votes
the text of the question
the tags for the question
when the question was posted
the user/display name of the person posting the question, their reputation, and how many gold, silver, and bronze badges they have
who edited the question and when
for each answer/comment, find: the text, the person who posted, when they posted, and their reputation and badge information
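
Once the relevant HTML structures are located, the extracted strings still need to be turned into usable values. The sketch below shows one possible record layout for the fields listed above, plus two helpers for display strings. It assumes that counts may appear abbreviated (such as `12.3k`) and that badge counts appear in text like `3 gold badges`. The class and function names are hypothetical, and both text formats should be confirmed against the page source.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

_SUFFIX = {"k": 1_000, "m": 1_000_000}


def parse_count(text):
    """Convert display strings such as '1,234', '12.3k' or '-2' to an int."""
    cleaned = text.strip().lower().replace(",", "")
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)([km]?)", cleaned)
    if match is None:
        raise ValueError(f"unrecognised count: {text!r}")
    number, suffix = match.groups()
    return round(float(number) * _SUFFIX.get(suffix, 1))


def parse_badges(text):
    """Read gold, silver and bronze counts from text like '3 gold badges'."""
    counts = {"gold": 0, "silver": 0, "bronze": 0}
    pattern = r"(\d[\d,]*)\s+(gold|silver|bronze)\s+badges?"
    for number, colour in re.findall(pattern, text.lower()):
        counts[colour] = int(number.replace(",", ""))
    return counts


@dataclass
class Author:
    name: str
    reputation: int
    badges: dict


@dataclass
class Post:
    """An answer or comment: its text, who posted it, and when."""
    text: str
    author: Author
    posted_at: str


@dataclass
class Question:
    views: int
    votes: int
    text: str
    tags: List[str]
    posted_at: str
    author: Author
    edited_by: Optional[str] = None  # None when the question was never edited
    edited_at: Optional[str] = None
    answers: List[Post] = field(default_factory=list)
    comments: List[Post] = field(default_factory=list)
```

Keeping the editor fields optional allows unedited questions to be stored in the same structure as edited ones.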
