
A large-scale dataset of manually annotated audio events.
AudioSet is a large-scale dataset of manually annotated audio events, designed to provide a common evaluation task for audio event detection and a starting point for a comprehensive vocabulary of sound events. It consists of an expanding ontology of 632 audio event classes and a collection of over 2 million human-labeled 10-second sound clips drawn from YouTube videos. The ontology is structured as a hierarchical graph of event categories, covering a wide range of human and animal sounds, musical instruments, and common environmental sounds. The data collection process involves human annotators verifying the presence of sounds within YouTube segments nominated based on metadata and content-based search. Machine-extracted features are available for download alongside the dataset, facilitating machine learning model training and evaluation.
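Because the ontology is a hierarchical graph, a common first step is to expand a broad category into all of its more specific descendants. The sketch below illustrates that traversal on a tiny in-memory excerpt. The record layout (`id`, `name`, `child_ids`), the identifiers, and the `descendants` helper are assumptions made for illustration, not the dataset's confirmed schema, and the excerpt is not real AudioSet data.

```python
from collections import deque

# Hypothetical excerpt of a hierarchical sound-event ontology.
# The ids and the field names are illustrative assumptions.
ONTOLOGY = [
    {"id": "human_sounds", "name": "Human sounds", "child_ids": ["human_voice"]},
    {"id": "human_voice", "name": "Human voice", "child_ids": ["speech", "shout"]},
    {"id": "speech", "name": "Speech", "child_ids": []},
    {"id": "shout", "name": "Shout", "child_ids": []},
    {"id": "music", "name": "Music", "child_ids": ["musical_instrument"]},
    {"id": "musical_instrument", "name": "Musical instrument", "child_ids": []},
]


def descendants(ontology, root_id):
    """Return the names of all classes below root_id, in breadth-first order."""
    by_id = {node["id"]: node for node in ontology}
    names = []
    visited = {root_id}  # guards against classes reachable via several parents
    queue = deque(by_id[root_id]["child_ids"])
    while queue:
        node_id = queue.popleft()
        if node_id in visited:
            continue
        visited.add(node_id)
        names.append(by_id[node_id]["name"])
        queue.extend(by_id[node_id]["child_ids"])
    return names


if __name__ == "__main__":
    print(descendants(ONTOLOGY, "human_sounds"))
```

The `visited` set matters because the structure is described as a graph rather than a strict tree, so a class may be reachable through more than one parent.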