
TechRxiv
A preprint server for health sciences.

A large conversational telephone speech corpus for speech recognition and speaker identification research.

Switchboard-1 Release 2 is a corpus of approximately 260 hours of conversational telephone speech developed by Texas Instruments and distributed by the Linguistic Data Consortium (LDC). It comprises around 2,400 two-sided telephone conversations among 543 speakers from across the United States. The data was collected using a computer-driven robot operator system that introduced topics for discussion and recorded each participant's speech on a separate channel. This release corrects errors in the original NIST publication (Release 1) and modifies the NIST Sphere headers for consistency. An update of the phonetic transcriptions developed by the International Computer Science Institute (ICSI), together with corrected word alignments, is available from ISIP. Researchers have used the corpus for annotation projects, discourse analysis, part-of-speech tagging, and phonetic transcription. The corpus also includes speaker attribution tables, updated file lists, and documentation.
LDC audited and corrected speaker attributions to resolve problems identified by corpus users, ensuring accurate speaker metadata.
Modifications were made to the NIST Sphere headers of all speech files to identify each file as being part of the new release and to show the sample_count header field consistent with standard Sphere usage.
The corpus includes the ISIP update of the phonetic transcriptions developed by the International Computer Science Institute (ICSI) and corrected word alignments.
The Switchboard Dialog Act Corpus contains labels for conversations using a shallow discourse tagset based on the SWBD-DAMSL labels.
The release corrects known errors in the speech files as originally published.
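To make the header changes described above more concrete, here is a minimal Python sketch of reading a NIST Sphere header. It relies on the general Sphere layout: an ASCII header that starts with `NIST_1A`, gives the header size on the second line, lists `name -type value` fields, and ends with `end_head`. The function names (`parse_sphere_header`, `build_example_header`, `duration_seconds`) are hypothetical, the example field values are illustrative rather than copied from a corpus file, and the duration calculation assumes that `sample_count` counts samples per channel.

```python
from typing import Dict, Tuple, Union

Value = Union[int, float, str]


def parse_sphere_header(data: bytes) -> Tuple[int, Dict[str, Value]]:
    """Parse the ASCII header at the start of a NIST Sphere file.

    Returns the header size in bytes and a dict of header fields.
    """
    parts = data.split(b"\n", 2)
    if len(parts) < 3 or parts[0].strip() != b"NIST_1A":
        raise ValueError("data does not start with a NIST_1A Sphere header")
    header_size = int(parts[1].decode("ascii").strip())

    text = data[:header_size].decode("ascii", errors="replace")
    fields: Dict[str, Value] = {}
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        tokens = line.split(None, 2)
        if len(tokens) != 3:
            continue  # skip lines that are not "name -type value"
        name, type_code, value = tokens
        if type_code == "-i":
            fields[name] = int(value)
        elif type_code == "-r":
            fields[name] = float(value)
        else:  # "-sN": string of length N
            fields[name] = value
    return header_size, fields


def build_example_header() -> bytes:
    """Build a synthetic 1024-byte header (illustrative values only)."""
    body = (
        "NIST_1A\n"
        "   1024\n"
        "sample_count -i 16000\n"
        "sample_rate -i 8000\n"
        "channel_count -i 2\n"
        "sample_coding -s4 ulaw\n"
        "sample_n_bytes -i 1\n"
        "end_head\n"
    )
    return body.encode("ascii").ljust(1024, b" ")


def duration_seconds(fields: Dict[str, Value]) -> float:
    """Duration, ASSUMING sample_count is the number of samples per channel."""
    return fields["sample_count"] / fields["sample_rate"]


if __name__ == "__main__":
    size, fields = parse_sphere_header(build_example_header())
    print(size, fields, duration_seconds(fields))
```

For a real file, the same function could be applied to the first bytes read from disk; the audio payload would then start at the returned header size.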
Obtain LDC membership or purchase the corpus.
Download the corpus files from the LDC website.
Unpack the compressed files.
Read the corpus documentation to understand the file structure and data format.
Use a speech processing toolkit (e.g., Kaldi, HTK) to access and process the audio data.
Load the provided transcripts and annotations for analysis.
Check that the Sphere header fields, such as sample_count, are read correctly.
Extract acoustic features and use them to train or evaluate acoustic models.
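As an illustration of the audio-processing steps above, the following Python sketch separates the two sides of a conversation and decodes them to linear PCM. It is a sketch under stated assumptions, not the corpus's documented procedure: it assumes the audio payload is uncompressed 8-bit mu-law with the two channels interleaved sample by sample, which should be confirmed against the corpus documentation and file headers. The function names `ulaw_to_linear` and `split_channels` are hypothetical. In practice, a toolkit such as Kaldi or HTK, or a dedicated Sphere conversion tool, would normally handle this step.

```python
from typing import List


def ulaw_to_linear(byte: int) -> int:
    """Decode one 8-bit G.711 mu-law value to a signed 16-bit PCM sample."""
    u = ~byte & 0xFF              # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def split_channels(payload: bytes, channel_count: int = 2) -> List[List[int]]:
    """De-interleave a mu-law payload into one PCM sample list per channel.

    ASSUMPTION: samples are interleaved as ch0, ch1, ch0, ch1, ...
    """
    return [
        [ulaw_to_linear(b) for b in payload[ch::channel_count]]
        for ch in range(channel_count)
    ]


if __name__ == "__main__":
    # Four synthetic bytes: two samples for each of the two channels.
    side_a, side_b = split_channels(bytes([0xFF, 0x00, 0xFF, 0x80]))
    print(side_a, side_b)
```

Keeping the two sides separate matches the way the corpus was recorded, with each participant on a separate channel, and lets each side be processed independently.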
Verified feedback from other users.
"Researchers praise Switchboard-1 Release 2 for its high-quality conversational speech data and its utility in training speech recognition and speaker identification systems."
