online communities

Showing 1-2 of 2 results

Amazon product co-purchasing network metadata

Creators: Leskovec, Jure

Publication Date: 2006

The data was collected by crawling the Amazon website and contains product metadata and review information about 548,552 different products (Books, music CDs, DVDs and VHS video tapes). It is valuable for analyzing product relationships, customer behavior, and the dynamics of product co-purchasing networks. For each product the following information is available:

Title
Salesrank
List of similar products (that get co-purchased with the current product)
Detailed product categorization
Product reviews: time, customer, rating, number of votes, number of people that found the review helpful.

The data was collected in summer 2006. It is 201 MB in size and is structured into:

Product Metadata: Information such as product ID, ASIN, title, group, sales rank, similar products, and categories.
Product Reviews: Details including review time, customer ID, rating, number of votes, and helpfulness votes.
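The fields listed above map naturally onto a small record type, and the "similar products" list gives the edges of the co-purchasing network. The following is a minimal sketch under stated assumptions, not the dataset's official loader: the class names, field names, and helper functions are hypothetical, and parsing of the raw file is left out because its exact layout is not described here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Review:
    # Field names are illustrative; they mirror the review details listed above.
    time: str
    customer_id: str
    rating: int
    votes: int
    helpful: int


@dataclass
class Product:
    # Field names are illustrative; they mirror the product metadata listed above.
    product_id: int
    asin: str
    title: str
    group: str
    salesrank: int
    similar: List[str] = field(default_factory=list)      # ASINs of co-purchased products
    categories: List[str] = field(default_factory=list)
    reviews: List[Review] = field(default_factory=list)


def copurchase_edges(products: List[Product]) -> List[Tuple[str, str]]:
    """Return directed (asin, similar_asin) pairs, keeping only targets present in the data."""
    known = {p.asin for p in products}
    return [(p.asin, s) for p in products for s in p.similar if s in known]


def helpfulness_ratio(review: Review) -> Optional[float]:
    """Share of votes that marked the review helpful, or None if nobody voted."""
    return review.helpful / review.votes if review.votes else None
```

Given a list of `Product` records, `copurchase_edges(products)` yields an edge list that can be passed to a graph library of your choice; edges pointing at products outside the crawl are dropped.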

Food.com Recipe & Review Data

Creators: Majumder, Bodhisattwa P.; Li, Shuyang; Ni, Jianmo; McAuley, Julian

Publication Date: 2019

This dataset consists of 180K+ recipes and 700K+ recipe reviews covering 18 years of user interactions and uploads on Food.com (formerly GeniusKitchen), an online recipe aggregator. This extensive collection allows for in-depth analysis of culinary trends, user preferences, and recipe characteristics over nearly two decades.

The dataset is 0.85 GB in size and contains three sets of data from Food.com:

Interaction splits

interactions_test.csv
interactions_validation.csv
interactions_train.csv

Preprocessed data for result reproduction

In this format, the recipe text metadata is tokenized with the GPT subword tokenizer, with special tokens (start-of-step, etc.) added.

PP_recipes.csv
PP_users.csv

To convert these files into the pickle format required to run our code off the shelf, you can load each CSV with pandas.read_csv and write it back out with pandas.to_pickle.
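The CSV-to-pickle conversion described above might look like the following sketch. The helper name `csv_to_pickle` and the `.pkl` output naming are assumptions made for illustration; the authors' code may expect different file names, so check its documentation before relying on this.

```python
from pathlib import Path

import pandas as pd


def csv_to_pickle(csv_path, pickle_path=None):
    """Read a CSV with pandas and write the resulting DataFrame as a pickle.

    If no output path is given, the CSV's suffix is swapped for ".pkl"
    (an assumed naming convention, not one stated by the dataset authors).
    """
    csv_path = Path(csv_path)
    if pickle_path is None:
        pickle_path = csv_path.with_suffix(".pkl")
    df = pd.read_csv(csv_path)
    pd.to_pickle(df, pickle_path)
    return Path(pickle_path)
```

For example, calling `csv_to_pickle("PP_recipes.csv")` and `csv_to_pickle("PP_users.csv")` would write `PP_recipes.pkl` and `PP_users.pkl` next to the source files.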

© 2024 openBIGdata.org - a service of BERD@NFDI
