Huge Collection of Reddit Votes

Creators:
Leake, Joseph
Publication Date:
2020
Data Category:
Dataset Description:
The dataset contains over 44 million upvotes and downvotes cast by Reddit users between 2007 and 2020. It includes only votes from users who have opted in to making their voting history public, in keeping with their privacy preferences. The record lists the dataset size as 21.9 kB. Structurally, the dataset is organized into two main components:
  1. Votes Data: A tab-delimited file where each row represents a vote with the following fields:

    • submission_id: Identifier of the Reddit submission that received the vote.

    • subreddit: Name of the subreddit where the submission was posted.

    • created_time: Epoch timestamp indicating when the vote was cast.

    • username: Reddit username of the voter.

    • vote: Type of vote, either 'upvote' or 'downvote'.

  2. Submissions Data: A separate file containing information about the submissions that received votes, including details such as submission titles, authors, and timestamps.
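As an illustration of how the votes file described above might be read, here is a minimal Python sketch. It assumes the columns appear in the order listed, that the file has no header row, that created_time is in seconds since the Unix epoch, and that the vote column holds the literal strings 'upvote' and 'downvote'; none of these details are confirmed by this record. The helper names read_votes and votes_per_subreddit and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone

# Column order follows the field list above. Assumption: no header row.
FIELDS = ["submission_id", "subreddit", "created_time", "username", "vote"]


def read_votes(handle):
    """Yield one dict per vote from a tab-delimited file-like object."""
    reader = csv.DictReader(
        handle, fieldnames=FIELDS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        # Assumption: created_time is seconds since the Unix epoch (UTC).
        row["created_time"] = datetime.fromtimestamp(
            int(row["created_time"]), tz=timezone.utc
        )
        yield row


def votes_per_subreddit(rows):
    """Count votes keyed by (subreddit, vote type)."""
    counts = Counter()
    for row in rows:
        counts[(row["subreddit"], row["vote"])] += 1
    return counts


# Made-up sample rows standing in for the real file.
sample = io.StringIO(
    "abc123\tpython\t1577836800\talice\tupvote\n"
    "abc123\tpython\t1577836860\tbob\tdownvote\n"
    "def456\tdatasets\t1577923200\talice\tupvote\n"
)
counts = votes_per_subreddit(read_votes(sample))
```

Because read_votes is a generator, the same code could be pointed at an opened file (for example a hypothetical votes.tsv) and would process it row by row rather than loading all 44 million records into memory.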

Variables:
Details:
