News Homepage Archive
Creators:
Jones, Nick
Publication Date:
2019
Data Category:
Dataset Description:
This project aims to provide a visual representation of how different media organizations cover various topics. Screenshots of the homepages of five news organizations are taken once per hour and made public afterward. For each website, this amounts to 24 screenshots per day, or approximately 8,760 per year. Hourly screenshots are available from January 1, 2019 onward. The size of the dataset is 1.8 MB. The websites currently tracked are:
nytimes.com;
washingtonpost.com;
cnn.com;
wsj.com;
foxnews.com.
By capturing hourly screenshots, this dataset offers a unique visual chronicle of news presentation, allowing for analysis of editorial choices, headline prominence, and the evolution of news stories across different media outlets. The dataset is organized hierarchically based on the website name and timestamp of each screenshot. Each sub-dataset corresponds to a specific news website, containing a chronological collection of its homepage screenshots. This structure facilitates targeted analysis of individual news outlets over time.
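The hourly cadence and the per-website, per-timestamp organization described above can be illustrated with a minimal Python sketch. The record does not document the actual file naming scheme, so the `<site>/<YYYY>/<MM>/<DD>/<HH>.png` layout, the `screenshot_path` helper, and the use of timezone-naive timestamps are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta

# The five websites listed in the dataset description.
SITES = ["nytimes.com", "washingtonpost.com", "cnn.com", "wsj.com", "foxnews.com"]


def hourly_timestamps(start, end):
    """Yield one datetime per hour in the half-open interval [start, end)."""
    current = start
    while current < end:
        yield current
        current += timedelta(hours=1)


def screenshot_path(site, ts):
    """Build a path for one screenshot.

    Hypothetical layout (not confirmed by the dataset record):
    <site>/<YYYY>/<MM>/<DD>/<HH>.png
    """
    return f"{site}/{ts:%Y/%m/%d/%H}.png"


# One non-leap year of hourly captures, starting January 1, 2019.
stamps = list(hourly_timestamps(datetime(2019, 1, 1), datetime(2020, 1, 1)))

# One chronological sub-collection per website.
paths_by_site = {
    site: [screenshot_path(site, ts) for ts in stamps] for site in SITES
}

print(len(stamps))                   # 8760 (24 per day x 365 days)
print(paths_by_site["cnn.com"][0])   # cnn.com/2019/01/01/00.png
```

Grouping paths by site first mirrors the description of one sub-dataset per news website, so a single outlet can be analyzed over time without touching the others.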
Variables:
Details: