Goodreads Datasets

Creators:
Wan, Mengting; McAuley, Julian
Publication Date:
2017
Data Category:
Dataset Description:
The Goodreads Datasets are a large-scale collection of book-related data suited to studying reading behavior, book popularity, and recommendation systems. They have three primary components: book metadata, user-book interactions, and book reviews. The book metadata covers 2,360,655 books, with fields such as title, author, publication date, genre, and average rating. The user-book interactions component comprises 228,648,342 interactions from 876,145 users, capturing explicit preferences such as shelf assignments ("read," "to-read") and ratings. The book reviews component contains the free-text reviews written by users, which offer insight into reader sentiment and opinion. The combined datasets total approximately 11 GB and are available in JSON and CSV formats.
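Because the interactions component is far too large to load into memory at once, a streaming read is one reasonable way to work with it. The sketch below is illustrative only: it assumes the JSON files hold one object per line (JSON Lines), optionally gzip-compressed, and the field names `user_id`, `book_id`, `is_read`, and `rating` are hypothetical stand-ins rather than a documented schema. A small in-memory sample replaces the real file.

```python
import gzip
import io
import json
from collections import Counter
from typing import IO, Iterable, Iterator


def open_maybe_gzip(path: str) -> IO[str]:
    """Open a plain or gzip-compressed text file, chosen by extension."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def iter_records(stream: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSON Lines text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def rating_histogram(records: Iterable[dict]) -> Counter:
    """Count how many interaction records carry each rating value."""
    return Counter(rec["rating"] for rec in records if "rating" in rec)


# Tiny in-memory sample standing in for a real interactions file.
# Field names here are assumptions, not the dataset's documented schema.
sample = io.StringIO(
    '{"user_id": "u1", "book_id": "b1", "is_read": true, "rating": 5}\n'
    '{"user_id": "u1", "book_id": "b2", "is_read": false, "rating": 0}\n'
    '{"user_id": "u2", "book_id": "b1", "is_read": true, "rating": 5}\n'
)
hist = rating_histogram(iter_records(sample))
```

To run the same count over a downloaded file, the stream from `open_maybe_gzip(path)` could be passed to `iter_records` in place of `sample`; only one record is held in memory at a time.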
Variables:
Details:
