AudioSet dataset
Creators:
Google
Publication Date:
2017
Data Category:
Dataset Description:
AudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos. The ontology is specified as a hierarchical graph of event categories, covering a wide range of human and animal sounds, musical instruments and genres, and common everyday environmental sounds. The dataset has a size of 19.0 kB and is divided into three primary subsets:
- Evaluation Set: Contains 20,383 segments from distinct videos, ensuring at least 59 examples for each of the 527 sound classes used.
- Balanced Training Set: Consists of 22,176 segments from distinct videos, selected to provide a balanced representation with at least 59 examples per class.
- Unbalanced Training Set: Includes 2,042,985 segments from distinct videos, representing the remainder of the dataset.
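
The per-class minimum stated for the evaluation and balanced training sets can be checked by counting how many segments carry each label. The following is a minimal sketch, not part of the dataset's own tooling. It assumes each subset is distributed as a comma-separated segment list with one row per clip (video ID, start time, end time, quoted list of label IDs) and comment lines starting with `#`. The function name `count_examples_per_class`, the sample rows, and the label IDs are hypothetical and used only for illustration.

```python
import csv
import io
from collections import Counter


def count_examples_per_class(lines):
    """Count how many segments carry each label ID.

    Assumed row layout (not confirmed by this record):
    video_id, start_seconds, end_seconds, "label_1,label_2,..."
    """
    counts = Counter()
    # Drop comment lines, then let csv handle the quoted label field.
    rows = csv.reader(
        (line for line in lines if not line.startswith("#")),
        skipinitialspace=True,
    )
    for row in rows:
        if len(row) < 4:
            continue  # skip blank or malformed rows
        labels = (label.strip() for label in row[3].split(","))
        counts.update(label for label in labels if label)
    return counts


# Hypothetical sample data; real segment lists are far larger.
SAMPLE = '''# hypothetical header line
vid_a, 30.000, 40.000, "/m/aaa,/m/bbb"
vid_b, 0.000, 10.000, "/m/aaa"
vid_c, 10.000, 20.000, "/m/ccc"
'''

counts = count_examples_per_class(io.StringIO(SAMPLE))

# The record states a minimum of 59 per class; 2 is used here so the
# tiny sample produces a visible result.
min_required = 2
under = sorted(c for c, n in counts.items() if n < min_required)
print(counts)
print("classes below minimum:", under)
```

To run the same check on a real subset, pass an open file handle instead of the `io.StringIO` sample and set `min_required` to 59.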
Variables:
Details: