Youtube social network and ground-truth communities

Creators: Yang, Jaewon; Leskovec, Jure
Publication Date: 2012

YouTube is a video-sharing website that includes a social network. In the YouTube social network, users form friendships and can create groups that other users can join. We treat these user-defined groups as ground-truth communities. The data was provided by Alan Mislove et al. We regard each connected component in a group as a separate ground-truth community and remove communities with fewer than 3 nodes. We also provide the top 5,000 communities with the highest quality, as described in our paper. For the network, we provide the largest connected component.

The dataset, published in 2012, is useful for studying community structure, information diffusion, and network dynamics on large-scale social platforms. It has a size of 0.001 GB and comprises 1,134,890 nodes (users) and 2,987,624 edges (friendship links). It identifies 8,385 ground-truth communities.

Structurally, the dataset includes:

  • Network Data: An undirected graph where nodes represent users and edges denote mutual friendships. The graph is the largest connected component of the YouTube user network.

  • Community Data: Ground-truth communities derived from user-defined groups. Each connected component within these groups is considered a separate community, with only those containing at least three nodes included. For enhanced analysis, the dataset also provides the top 5,000 communities with the highest quality, as detailed in the accompanying research paper.
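
The community construction described above (split each group into connected components, then drop components with fewer than three nodes) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `build_adjacency` and `split_group`, the `min_size` parameter, and the toy edges and group are all hypothetical, and the sketch assumes "connected component in a group" means a component of the subgraph induced by the group's members.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def split_group(adj, members, min_size=3):
    """Split one group into connected components of the subgraph induced
    by its members, keeping only components with at least min_size nodes."""
    members = set(members)
    seen = set()
    communities = []
    for start in sorted(members):
        if start in seen:
            continue
        # Breadth-first search restricted to nodes inside the group.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj.get(node, ()):
                if nbr in members and nbr not in seen:
                    seen.add(nbr)
                    component.add(nbr)
                    queue.append(nbr)
        if len(component) >= min_size:
            communities.append(sorted(component))
    return communities


# Toy data (hypothetical): one group whose members fall into four components.
edges = [(1, 2), (2, 3), (4, 5), (6, 7), (7, 8), (8, 6)]
group = [1, 2, 3, 4, 5, 6, 7, 8, 9]
toy_communities = split_group(build_adjacency(edges), group)
print(toy_communities)
```

In the toy data, the pair {4, 5} and the isolated member 9 are discarded because they fall below the three-node threshold, leaving two communities.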

Amazon product co-purchasing network and ground-truth communities

Creators: Yang, Jaewon; Leskovec, Jure
Publication Date: 2012

This dataset describes product relationships on Amazon, based on the “Customers Who Bought This Item Also Bought” feature. Products are represented as nodes: if a product i is frequently co-purchased with product j, the graph contains an undirected edge between i and j. Each product category provided by Amazon defines a ground-truth community. We regard each connected component in a product category as a separate ground-truth community and remove communities with fewer than 3 nodes. We also provide the top 5,000 communities with the highest quality, as described in our paper. The dataset has a size of 0.01 GB and encompasses 334,863 nodes (products) and 925,872 edges (co-purchasing relationships).

The dataset is structured into:

  • Network Data: An undirected graph where nodes represent products, and edges indicate co-purchasing relationships.

  • Ground-Truth Communities: Each product category defined by Amazon serves as a ground-truth community. Connected components within these categories are treated as separate communities, excluding those with fewer than three nodes. Additionally, the dataset provides the top 5,000 communities with the highest quality, as detailed in the associated research paper.
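
To illustrate what "undirected" means for the co-purchasing graph, the sketch below collapses ordered product pairs into undirected edges and counts nodes and edges. It is only an illustration under stated assumptions: the helper names `undirected_edges` and `graph_size` and the toy product IDs are hypothetical, and dropping self-loops is an assumption of this sketch rather than something the dataset description states.

```python
def undirected_edges(pairs):
    """Collapse ordered (i, j) pairs into a set of undirected edges.

    (i, j) and (j, i) map to the same edge. Self-loops are dropped,
    which is an assumption of this sketch.
    """
    edges = set()
    for i, j in pairs:
        if i != j:
            edges.add((min(i, j), max(i, j)))
    return edges


def graph_size(edges):
    """Return (number of nodes, number of edges) for an edge set."""
    nodes = {n for edge in edges for n in edge}
    return len(nodes), len(edges)


# Toy co-purchase pairs (hypothetical product IDs).
pairs = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "C")]
toy_edges = undirected_edges(pairs)
toy_nodes, toy_edge_count = graph_size(toy_edges)
print(toy_nodes, toy_edge_count)
```

Applied to the real edge list, the same counting would be expected to reproduce the node and edge totals quoted in the description, provided the file is parsed into pairs first.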
