Innovation Idea Generation for a Bookstore's Holiday Promotion

Creators: Claus Hegmann-Napp, Tijmen P. J. Jansen
Publication Date: 2025-11-18

This dataset contains 460 observations from an experimental study focused on idea generation. The goal of the study was likely to examine how individual differences and attitudes toward artificial intelligence (AI) influence the ideas produced for a specific task (holiday promotions for a bookstore).

The dataset has 460 rows (one per observation) and 9 variables, including the text of the generated ideas and participant measures such as personality scores, self-reported task metrics, and ratings of opinions about AI.

Behavioral Insights from Risk, Social, and Consumer Experiments

Creators: Andrew Ellis, London School of Economics; David J Freeman, Simon Fraser University
Publication Date: 2024-08-08

This dataset collection compiles microdata from a series of behavioral experiments conducted both online and in-person, designed to examine risk preferences, social decision-making, and consumer behavior. The studies capture how individuals react to uncertainty, fairness, and purchasing choices under varying conditions. The datasets include variables such as participant age, gender, treatment group, risk decisions, social payoff allocations, and product preference choices. These variables support analyses on the impact of experimental treatments on behavioral outcomes. The collection enables comparative research across experimental contexts and provides valuable insights into psychological and economic responses to incentives, pricing, and social dynamics.

Clubhouse data

Creators: Kaggle
Publication Date: 2024-10-23

The Clubhouse dataset on Kaggle provides data related to the social audio app Clubhouse. It contains information such as user demographics, room activity, user engagement metrics, and discussions held on the platform. The dataset is approximately 9.7 MB in size and comprises 1,300,515 user profiles. Each profile represents an individual observation, offering a substantial sample for analysis. It is particularly valuable for analyzing user demographics, social networking patterns, and engagement metrics within the platform. Structurally, the dataset is organized as a CSV file, with each row corresponding to a user’s profile and columns representing various attributes. The key variables included are:

  • user_ID: A unique identifier for each user on Clubhouse.

  • name: The display name of the user.

  • photo_url: URL of the user’s profile photo.

  • username: The username chosen by the user on Clubhouse.

  • Twitter: The user’s Twitter handle or linked Twitter account.

  • Instagram: The user’s Instagram handle or linked Instagram account.

  • num_followers: The number of followers the user has on Clubhouse.

  • num_following: The number of accounts the user is following on Clubhouse.

  • time_created: The date and time when the user’s account was created.

  • invited_by_user_profile: Profile information of the user who invited this user to Clubhouse.
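
As a rough illustration of working with the columns listed above, the following sketch reads profile rows from CSV text and computes a followers-to-following ratio per user. The sample rows and value formats are made up for the example, and the real file's header may differ from the names used here.

```python
import csv
import io

# Hypothetical sample rows; column names follow the list above,
# but the values and timestamp format are invented for illustration.
SAMPLE = """user_ID,name,username,num_followers,num_following,time_created
1,Ada,ada,120,30,2021-01-05T10:00:00
2,Bob,bob,0,15,2021-02-11T08:30:00
"""


def load_profiles(text):
    """Parse CSV text into a list of dicts, one per user profile."""
    return list(csv.DictReader(io.StringIO(text)))


def follower_ratio(row):
    """Return followers per followed account, or None if the user follows nobody."""
    followers = int(row["num_followers"])
    following = int(row["num_following"])
    return followers / following if following else None


profiles = load_profiles(SAMPLE)
ratios = {row["username"]: follower_ratio(row) for row in profiles}
print(ratios)
```

For the full file, the same functions could be applied to rows streamed from disk with `csv.DictReader` rather than loading everything into a list.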

Raw Bay Area Craigslist Rental Housing Posts

Creators: Pennington, Kate
Publication Date: 2018

Like many cities, San Francisco doesn’t track rents. The Bay Area Craigslist Rental Housing Posts dataset comprises rental housing listings from the San Francisco Bay Area, spanning 2000 to 2018. Each entry includes attributes such as posting date, neighborhood, price, number of bedrooms and bathrooms, square footage, and geographic coordinates, which supports in-depth analysis of housing trends. The dataset documents 200,796 individual rental listings and has a total size of 16.3 kB. It is organized into three main components:

  1. Raw Data (2000-2012): This subset includes 167,090 entries with fields for posting date, title, and neighborhood.

  2. Raw Data (2013-2018): Comprising 58,551 entries, this subset offers more detailed information, including post ID, date, neighborhood, price, square footage, number of bedrooms, address, latitude, longitude, description, title, and details.

  3. Cleaned Data (2000-2018): This consolidated and processed dataset contains 200,796 entries with variables such as post ID, date, year, neighborhood, city, county, price, number of bedrooms, number of bathrooms, square footage, room type indicator, address, latitude, longitude, title, description, and details.
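
A typical first step with the cleaned data is a rent trend by year. The sketch below computes the median listed price per year while skipping rows with no price. The column names (`post_id`, `year`, `nhood`, `price`, `beds`) and the sample rows are assumptions made for the example, not the file's confirmed header.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Hypothetical sample; column names and values are assumptions for illustration.
SAMPLE = """post_id,year,nhood,price,beds
a1,2012,mission,2000,1
a2,2012,soma,3000,2
a3,2013,mission,2500,1
a4,2013,soma,,2
"""


def median_price_by_year(rows):
    """Group listings by year and return the median price, ignoring blank prices."""
    prices = defaultdict(list)
    for row in rows:
        if row["price"]:  # skip listings with a missing price
            prices[int(row["year"])].append(float(row["price"]))
    return {year: median(values) for year, values in sorted(prices.items())}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
medians = median_price_by_year(rows)
print(medians)
```

The median is used here rather than the mean because scraped listings often contain extreme or mistyped prices.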

Amazon Brand and Exclusives

Creators: Jeffries, Adrianne; Yin, Leon
Publication Date: 2021

My co-author Adrianne Jeffries and I found that Amazon gave its own branded products an advantage over better-rated competitors in search results. This repository contains code to reproduce the findings featured in our stories “Amazon Puts Its Own ‘Brands’ First Above Better-Rated Products” and “When Amazon Takes the Buy Box, it Doesn’t Give it up” from our series Amazon’s Advantage.

Each product in this dataset is identified by its unique Amazon Standard Identification Number (ASIN), which allows precise tracking and analysis. Products are categorized by their association with Amazon, distinguishing between Amazon-owned brands, exclusive partnerships, and proprietary electronics. The data was obtained by scraping Amazon’s product listings, primarily in early 2021: search results were gathered in January 2021 and product pages in February 2021. In total, the dataset comprises 137,428 products, each represented by a unique ASIN, and has a size of 221.0 kB. It is organized into several sub-datasets, each serving a specific analytical purpose:

  1. Amazon Private Label Products: Contains detailed information on 137,428 products identified as Amazon brands, exclusives, or proprietary electronics.

  2. Search Results: Includes parsed search result pages from top and generic searches, totaling 187,534 product positions.

  3. Product Pages: Comprises parsed product pages corresponding to the search results, encompassing 157,405 product pages.

  4. Training Set: Provides metadata used to train and evaluate machine learning models, with feature engineering conducted in associated Jupyter notebooks.

  5. Trademarks: Contains a dataset of trademarked brands registered by Amazon, collected from USPTO.gov and Amazon.

Google Restaurants

Creators: Yan, An; He, Zhankui; Li, Jiacheng; Zhang, Tianyang; McAuley, Julian
Publication Date: 2022

This is a multi-modal dataset of restaurants from Google Local (Google Maps). The data includes images and reviews posted by users, as well as other metadata for each restaurant. The combination of textual reviews, numerical ratings, and visual content gives a holistic view of user experiences and restaurant characteristics. Such a dataset is particularly valuable for developing and testing recommendation systems, conducting sentiment analysis, and exploring the relationships between visual content and user perceptions of dining establishments. The total size of the dataset is approximately 120 GB, and it is structured into:

  • Restaurant Metadata: Information such as restaurant names, locations, contact details, and operational hours.

  • User Reviews: Textual feedback and numerical ratings provided by users.

  • Images: Photographs uploaded by users, showcasing various aspects of the restaurants.

TripAdvisor European restaurants

Creators: Leone, Stefano
Publication Date: 2021

TripAdvisor is a widely used travel website that stores data for a large number of restaurants, including locations (with latitude and longitude coordinates), restaurant descriptions, user ratings and reviews, and many other attributes. The dataset is 0.68 GB in size.

The TripAdvisor dataset includes 1,083,397 restaurants with attributes such as location data, average rating, number of reviews, open hours, cuisine types, awards, etc.

The dataset combines restaurants from the main European countries. The data was scraped in early May 2021.

The dataset is structured with various variables for each restaurant, such as:

  • restaurant_link: Unique TripAdvisor restaurant link.
  • restaurant_name: Name of the restaurant on TripAdvisor.
  • original_location: Original location displayed on TripAdvisor.
  • country: Country name retrieved from original_location.
  • region: Region name retrieved from original_location.
  • province: Province name retrieved from original_location.
  • city: City name retrieved from original_location.
  • address: Address displayed on TripAdvisor.
  • latitude: Latitude coordinate.
  • longitude: Longitude coordinate.
  • claimed: Indicates if the restaurant business is claimed on TripAdvisor.
  • awards: Award names.
  • popularity_detailed: Detailed popularity ranking.
  • popularity_generic: Generic popularity ranking (among all places to eat in the area).
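
To show how the variables above might be used, here is a minimal sketch that counts restaurants per country and optionally keeps only rows that have both coordinates. The sample rows are invented, and blank strings are assumed to mark missing coordinates, which may not match how the real file encodes them.

```python
import csv
import io
from collections import Counter

# Hypothetical sample using a subset of the variables listed above;
# the links, names, and coordinates are invented for illustration.
SAMPLE = """restaurant_link,restaurant_name,country,city,latitude,longitude
g1-d1,Chez A,France,Paris,48.85,2.35
g1-d2,Trattoria B,Italy,Rome,41.90,12.49
g1-d3,Bistro C,France,Lyon,,
"""


def count_by_country(rows, require_coordinates=False):
    """Count restaurants per country, optionally skipping rows without coordinates."""
    counts = Counter()
    for row in rows:
        has_coords = bool(row["latitude"]) and bool(row["longitude"])
        if require_coordinates and not has_coords:
            continue
        counts[row["country"]] += 1
    return dict(counts)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
all_counts = count_by_country(rows)
mapped_counts = count_by_country(rows, require_coordinates=True)
print(all_counts, mapped_counts)
```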

Amazon product co-purchasing network metadata

Creators: Leskovec, Jure
Publication Date: 2006

The data was collected by crawling the Amazon website and contains product metadata and review information about 548,552 different products (Books, music CDs, DVDs and VHS video tapes). It is valuable for analyzing product relationships, customer behavior, and the dynamics of product co-purchasing networks. For each product the following information is available:

  • Title

  • Sales rank

  • List of similar products (that get co-purchased with the current product)

  • Detailed product categorization

  • Product reviews: time, customer, rating, number of votes, number of people that found the review helpful

The data was collected in summer 2006. It has a size of 201 MB and is structured into:

  • Product Metadata: Information such as product ID, ASIN, title, group, sales rank, similar products, and categories.

  • Product Reviews: Details including review time, customer ID, rating, number of votes, and helpfulness votes.
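
The list of similar products per item can be treated as a directed co-purchasing graph. The sketch below builds that graph from already-parsed records and counts how often each product is listed as similar by others. The record layout (`asin`, `group`, `similar` keys) and the placeholder identifiers are assumptions for illustration; parsing the raw metadata file into this shape is not shown.

```python
# Hypothetical parsed records; "A", "B", "C" stand in for real ASINs.
products = [
    {"asin": "A", "group": "Book", "similar": ["B", "C"]},
    {"asin": "B", "group": "Book", "similar": ["A"]},
    {"asin": "C", "group": "DVD", "similar": []},
]


def build_copurchase_graph(records):
    """Map each product's ASIN to the set of ASINs listed as similar to it."""
    return {record["asin"]: set(record["similar"]) for record in records}


def in_degree(graph):
    """Count, for each product, how many other products list it as similar."""
    counts = {asin: 0 for asin in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


graph = build_copurchase_graph(products)
degrees = in_degree(graph)
print(degrees)
```

Products with a high in-degree are those that many other items point to, which is one simple way to start exploring the network structure.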
