xcorrSound

Improve Your Digital Audio Recordings

What is xcorrSound?

xcorrSound consists of three top-level tools and an audio search engine component:

  • overlap-analysis detects overlap in two audio files
  • waveform-compare compares two audio files and outputs the similarity
  • sound-match finds occurrences of an audio clip in an audio file
  • ismir is a sound search engine.

Ismir consists of three tools:

  • ismir_build_index, which creates an index file from a wav file
  • ismir_merge, which can merge two index files (to keep down the number of index files)
  • ismir_query, which takes a wav file and looks for it in the index files

What Can xcorrSound Do For Me?

Automating manual processes offers an important performance improvement. xcorrSound brings the following benefits:

  • precision in the overlap analysis
  • automated processes
  • resource efficiency
  • open source: freely available
  • easy to install and integrate into a workflow (command line tool)
  • leads to an improved and optimised end user experience

xcorrSound Can Be Used By

  • Institutions disseminating audio content
  • Institutions preserving audio collections

Examples

The State and University Library in Denmark holds a large collection of digitised audio recordings, originally recorded on two-hour tapes, with overlaps from tape to tape. To enhance the user experience, the library wanted to eliminate the overlaps and make the broadcast a continuous stream. This was done by using xcorrSound overlap-analysis.

xcorrSound overlap-analysis uses cross correlation to compare the sound waves of two recordings. This made it possible to run an automated overlap analysis of the audio recordings. The library could then trim the overlaps and join the resulting files into 24-hour blocks, which improved the end users' listening experience.

Algorithms

All the implemented algorithms rely on the Cross Correlation procedure.

Waveform compare

The input is two wav files of the same length (n), sample-rate, bit-rate and so on. The output is a real value between 0 and 1 indicating how similar the two files are (content-wise) where 0 indicates no similarity and 1 indicates they are identical.

The algorithm splits the two files f and g into blocks f_1, f_2, ..., f_{n/B} and g_1, g_2, ..., g_{n/B}, all of the same length B. That is, the first block consists of the first B samples of the respective file, the second block of the following B samples, and so on. Cross correlation is then applied to each pair of corresponding blocks, f_i and g_i. The position of the peak of the cross correlation tells how much to shift one block in time to achieve the best match; we call this shift the offset of the block, and the peak value its match value. If the offset of any block is more than 500 samples away from the offset of the first block, an error is reported and f and g are deemed different. Otherwise the minimum match value among the blocks is reported, together with the offset of that block.

This algorithm has a low memory use that is proportional to the size of the blocks.
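
The block-wise procedure described above can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation: the function names, the `max_lag` search window, the brute-force correlation loop and the normalisation of the match value to the range 0 to 1 are all assumptions made for the example. Only the 500-sample drift limit is taken from the description.

```python
import math


def best_offset(a, b, max_lag):
    """Return (lag, score) maximising the normalised cross correlation.

    a[i] is compared with b[i + lag]. Brute force, for illustration only.
    """
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if norm == 0.0:
        # Assumption: two silent blocks count as identical.
        both_silent = not any(a) and not any(b)
        return 0, (1.0 if both_silent else 0.0)
    best_lag, best_score = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        start, stop = max(0, -lag), min(len(a), len(b) - lag)
        score = sum(a[i] * b[i + lag] for i in range(start, stop)) / norm
        if score > best_score:
            best_lag, best_score = lag, score
    # Negative correlation is reported as "no similarity".
    return best_lag, max(0.0, best_score)


def compare_waveforms(f, g, block_size, max_lag=500, max_drift=500):
    """Return (match_value, offset) of the worst block, or None if the
    offsets drift by more than max_drift samples from the first block."""
    if len(f) != len(g) or not f:
        raise ValueError("inputs must be non-empty and of equal length")
    first_offset = None
    worst = None
    for start in range(0, len(f), block_size):
        offset, score = best_offset(
            f[start:start + block_size], g[start:start + block_size], max_lag
        )
        if first_offset is None:
            first_offset = offset
        elif abs(offset - first_offset) > max_drift:
            return None  # f and g are deemed different
        if worst is None or score < worst[0]:
            worst = (score, offset)
    return worst
```

Only one pair of blocks is held in the correlation step at a time, which reflects the block-proportional memory use noted above.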

Overlap analysis

The input is two wav files of the same length (n), sample rate, bit rate and so on, such that the last part of the first file appears, content-wise, as the first part of the second file. The length of the shared part does not need to be known in advance. The output is the length of the overlap and a real value between 0 and 1 indicating how good the match is.

The algorithm performs a single cross correlation computation over the two input files and outputs the position and value of the peak. Memory usage is therefore proportional to the size of the input files, which is considerably more than the block-based waveform compare needs. The intended use case is finding small overlaps, e.g. a few minutes.
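
As a rough illustration of the idea, the sketch below tries every candidate overlap length, correlates the tail of the first signal with the head of the second, and keeps the best one. The function name, the `min_overlap` guard and the per-overlap normalisation are assumptions made for this example. It evaluates the candidates by brute force, whereas the tool performs one cross correlation over the whole input.

```python
import math


def find_overlap(first, second, min_overlap=16):
    """Return (overlap_length, score) for the best tail/head alignment.

    score is the normalised correlation of the last `overlap_length`
    samples of `first` with the first `overlap_length` samples of `second`.
    """
    best_len, best_score = 0, 0.0
    longest = min(len(first), len(second))
    for length in range(min_overlap, longest + 1):
        tail = first[len(first) - length:]
        head = second[:length]
        norm = math.sqrt(sum(x * x for x in tail) * sum(y * y for y in head))
        if norm == 0.0:
            continue  # silent candidates carry no information
        score = sum(x * y for x, y in zip(tail, head)) / norm
        if score > best_score:
            best_len, best_score = length, score
    return best_len, best_score
```

The `min_overlap` guard is there because very short candidates can correlate perfectly by chance; with per-overlap normalisation a one-sample overlap always scores 1 or -1.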

Ismir

The audio index tools were developed to search for a short sound bite (a jingle) in a large audio archive. The implementation is derived from the paper "A Highly Robust Audio Fingerprinting System", Jaap Haitsma and Ton Kalker, ISMIR 2002.
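
For readers unfamiliar with the paper, the sketch below shows the core of its fingerprinting scheme: each frame yields a sub-fingerprint whose bits are the signs of energy differences between neighbouring frequency bands and consecutive frames, and two fingerprints are compared by their bit error rate. It assumes the band energies have already been computed. The function names and data layout are made up for this example and are not taken from the ismir tools.

```python
def sub_fingerprints(energies):
    """energies[n][m] is the energy of band m in frame n.

    Returns one integer per frame after the first; each integer packs
    (number of bands - 1) bits, most significant bit first.
    """
    prints = []
    for n in range(1, len(energies)):
        bits = 0
        for m in range(len(energies[n]) - 1):
            diff = (energies[n][m] - energies[n][m + 1]) - (
                energies[n - 1][m] - energies[n - 1][m + 1]
            )
            bits = (bits << 1) | (1 if diff > 0 else 0)
        prints.append(bits)
    return prints


def bit_error_rate(a, b, bits_per_print):
    """Fraction of differing bits between two equally long fingerprints."""
    if len(a) != len(b) or not a:
        raise ValueError("fingerprints must be non-empty and of equal length")
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (len(a) * bits_per_print)
```

A low bit error rate between a query fingerprint and a stretch of the index suggests a match; the paper uses 33 bands, giving 32 bits per frame.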

Publications

Leaflet

Conference paper

Blog posts

SlideShare

Vimeo

Components

Credits

  • This work was partially supported by the SCAPE project. The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137)
  • XCORRSOUND is copyright 2012 State and University Library, Denmark, and is released under GPLv2; see ./COPYING or http://www.gnu.org/licenses/gpl-2.0.html