AudioSet dataset

Creators:
Google
Publication Date:
2017
Data Category:
Dataset Description:
AudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos. The ontology is specified as a hierarchical graph of event categories, covering a wide range of human and animal sounds, musical instruments and genres, and common everyday environmental sounds. The dataset has a size of 19.0 kB and is divided into three primary subsets:
  • Evaluation Set: Contains 20,383 segments from distinct videos, ensuring at least 59 examples for each of the 527 sound classes used.

  • Balanced Training Set: Consists of 22,176 segments from distinct videos, selected to provide a balanced representation with at least 59 examples per class.

  • Unbalanced Training Set: Includes 2,042,985 segments from distinct videos, representing the remainder of the dataset.
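
As a rough illustration of how the three subsets listed above compare in size, the following minimal Python sketch tallies the segment counts and prints each subset's share. The dictionary keys and the helper name `subset_shares` are hypothetical labels chosen for this example; only the counts come from the description.

```python
# Segment counts copied from the subset list above.
# The key names are illustrative labels, not official identifiers.
SUBSET_SIZES = {
    "evaluation": 20_383,
    "balanced_train": 22_176,
    "unbalanced_train": 2_042_985,
}


def subset_shares(sizes):
    """Return (total, {name: fraction of total}) for a mapping of subset sizes."""
    total = sum(sizes.values())
    shares = {name: count / total for name, count in sizes.items()}
    return total, shares


total, shares = subset_shares(SUBSET_SIZES)
for name, share in shares.items():
    print(f"{name:>16}: {SUBSET_SIZES[name]:>9,} segments ({share:.2%})")
print(f"{'total':>16}: {total:>9,} segments")
```

The three listed counts sum to 2,085,544 segments, slightly more than the 2,084,320 clips quoted in the description, so the shares are computed against the sum of the listed subsets rather than the headline figure.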

Variables:
Details:
