Wikipedia. Extracts external citation URLs from the References section of one or more Wikipedia article URLs. Use read_urls to scrape content from those URLs.

Usage

fetch_wiki_refs(url, n = NULL)

Arguments

url

Character vector of full Wikipedia article URLs (e.g. from fetch_wiki_urls).

n

Maximum number of citation URLs to return per source page. The default, NULL, returns all of them; supply a number (e.g. 10) to limit the result.

Value

For a single URL, a data.table with columns source_url, ref_id, and ref_url. For multiple URLs, a named list of such data.tables, named by Wikipedia article title; the element for a page with no references is NULL.
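
Because the return type depends on how many URLs are supplied, downstream code may want one uniform table. The following is a minimal sketch of one way to do that, not part of the package: as_refs_table is a hypothetical helper, and refs_list is mock data built to match the shape described above (the page titles, URLs, and the integer type of ref_id are assumptions made for illustration).

```r
library(data.table)

# Mock stand-in for fetch_wiki_refs() called on three URLs.
# Titles, URLs, and the integer ref_id are illustrative assumptions.
refs_list <- list(
  "Page A" = data.table(
    source_url = "https://en.wikipedia.org/wiki/Page_A",
    ref_id     = 1:2,
    ref_url    = c("https://example.com/a1", "https://example.com/a2")
  ),
  "Page B" = NULL,  # a page with no references
  "Page C" = data.table(
    source_url = "https://en.wikipedia.org/wiki/Page_C",
    ref_id     = 1L,
    ref_url    = "https://example.net/c1"
  )
)

# Hypothetical helper: accept either return shape and give back one data.table.
as_refs_table <- function(x) {
  # Single-URL case: already a data.table, return it unchanged
  if (is.data.table(x)) return(x)
  # Multi-URL case: drop NULL elements, then stack the rest;
  # idcol stores each list name (the article title) in a new first column
  non_empty <- Filter(Negate(is.null), x)
  rbindlist(non_empty, idcol = "article")
}

all_refs <- as_refs_table(refs_list)
print(all_refs)
```

In the single-URL case the helper returns the table as-is, so it has no article column; add one explicitly if the two shapes need identical columns.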

Examples

if (FALSE) { # \dontrun{
wiki_urls <- fetch_wiki_urls("January 6 Capitol attack")
refs_dt <- fetch_wiki_refs(wiki_urls[1])           # single URL: data.table
refs_list <- fetch_wiki_refs(wiki_urls[1:3])      # multiple: named list
articles <- read_urls(refs_dt$ref_url)
} # }
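
Before passing ref_url to read_urls, it can be worth tidying the vector so the same page is not scraped twice. This is a hedged sketch in base R using made-up URLs. Nothing here says whether fetch_wiki_refs already removes duplicates, missing values, or non-HTTP links, so treat each step as an optional precaution.

```r
# Mock ref_url column, standing in for refs_dt$ref_url (illustrative values only)
ref_urls <- c(
  "https://example.com/report",
  "https://example.com/report",      # same source cited twice
  NA,                                # missing URL
  "ftp://example.com/archive.zip",   # not a web page
  "https://example.org/news/story"
)

# Drop missing values, then duplicates (first occurrence is kept)
clean_urls <- unique(ref_urls[!is.na(ref_urls)])

# Keep only http(s) links
clean_urls <- clean_urls[grepl("^https?://", clean_urls)]

print(clean_urls)
```

The cleaned vector could then be passed on in place of refs_dt$ref_url.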