Improve Your Digital Audio Recordings
xcorrSound consists of three top-level tools and an audio search engine component:
- overlap-analysis detects overlap in two audio files
- waveform-compare compares two audio files and outputs the similarity
- sound-match finds occurrences of an audio clip in an audio file
- ismir is a sound search engine.
ismir consists of three tools:
- ismir_build_index, which creates an index file from a wav file
- ismir_merge, which can merge two index files (to keep down the number of index files)
- ismir_query, which takes a wav file and looks for it in the index files
Automating manual processes offers a significant performance improvement. xcorrSound brings the following benefits:
- precision in the overlap analysis
- automated processes
- resource efficiency
- open source: freely available
- easy to install and integrate into a workflow (command line tool)
- leads to an improved and optimised end user experience
- Institutions disseminating audio content
- Institutions preserving audio collections
The State and University Library in Denmark holds a large collection of digitised audio recordings, originally recorded on two-hour tapes, with overlaps from tape to tape. To enhance the user experience, the library wanted to eliminate the overlaps and make the broadcast a continuous stream. This was done by using xcorrSound overlap-analysis.
xcorrSound overlap-analysis uses cross correlation to compare the sound waves, which made it possible to run an automated overlap analysis of the audio recordings. The library could then cut the files and join the trimmed results into 24-hour blocks, improving the end users' listening experience.
All the implemented algorithms rely on the Cross Correlation procedure.
The input is two wav files of the same length (n), sample-rate, bit-rate and so on. The output is a real value between 0 and 1 indicating how similar the two files are (content-wise) where 0 indicates no similarity and 1 indicates they are identical.
The algorithm splits the two files f and g into blocks f_1, f_2, ..., f_{n/B} and g_1, g_2, ..., g_{n/B}, all with the same length B. That is, the first block consists of the first B samples, the second block of the following B samples from the respective files, and so on. Then cross correlation is applied to all corresponding blocks, f_i and g_i. The position of the cross correlation peak tells how much to shift one block in time to achieve the best match value -- we call this the offset of the block. If there is a block where the offset is more than 500 samples away from the offset of the first block, then an error is reported and f and g are deemed different. Otherwise the minimum match value among the blocks is reported, as well as the offset in that block.
This algorithm has a low memory use that is proportional to the size of the blocks.
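The block-wise comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the function names, the `max_lag` search window, the normalisation of the match value, and the use of `None` to signal "deemed different" are all assumptions made for the example, and the correlation is evaluated lag by lag rather than with an FFT.

```python
import math

def best_offset(f_block, g_block, max_lag):
    """Return (offset, match) for one pair of blocks.

    offset is the lag for which g_block[i + offset] best matches f_block[i];
    match is the normalised correlation at that lag, clamped to [0, 1].
    The normalisation and the max_lag window are assumptions of this sketch.
    """
    energy = math.sqrt(sum(x * x for x in f_block) * sum(y * y for y in g_block))
    if energy == 0.0:
        # Silent block(s): identical if both are silent, otherwise no match.
        return 0, (1.0 if list(f_block) == list(g_block) else 0.0)
    best_lag, best_value = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i, x in enumerate(f_block):
            j = i + lag
            if 0 <= j < len(g_block):
                total += x * g_block[j]
        value = total / energy
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag, min(1.0, max(0.0, best_value))

def compare(f, g, block_size, max_lag, tolerance=500):
    """Return (min_match, offset_of_that_block), or None if f and g are deemed different."""
    if len(f) != len(g):
        raise ValueError("both inputs must have the same number of samples")
    first_offset = None
    worst = None
    for start in range(0, len(f) - block_size + 1, block_size):
        offset, match = best_offset(f[start:start + block_size],
                                    g[start:start + block_size], max_lag)
        if first_offset is None:
            first_offset = offset
        elif abs(offset - first_offset) > tolerance:
            return None  # offset drifted too far from the first block
        if worst is None or match < worst[0]:
            worst = (match, offset)
    return worst
```

Only one pair of blocks is held in memory at a time, which is why the memory use depends on the block size rather than on the file length.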
The input is two wav files of the same length (n), sample-rate, bit-rate and so on, such that the last part (unknown how much) of the first appears as the first part (also unknown how much) of the second file -- content-wise. The output is a length and a real value between 0 and 1 indicating how good the match is.
The algorithm does one cross correlation computation of the two input files and outputs the peak position and value. This means the memory usage is proportional to the size of the input files, which is considerably more than a block-based approach needs. The use case for this tool is to find small overlaps, e.g. a few minutes.
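As a rough illustration of the idea, the sketch below looks for the overlap length at which the tail of the first signal best correlates with the head of the second. It is a simplified stand-in rather than the tool's code: `find_overlap`, the per-overlap normalisation and the `min_overlap` parameter (needed because very short overlaps always correlate perfectly under this normalisation) are assumptions, and each lag is evaluated directly in plain Python, which is far slower than an FFT-based cross correlation.

```python
import math

def find_overlap(first, second, min_overlap=16):
    """Return (overlap_length, match) for the best tail/head alignment.

    overlap_length is the number of samples at the end of `first` that
    best match the start of `second`; match is the normalised correlation
    of those two segments (assumed normalisation, see lead-in).
    """
    best_length, best_match = 0, 0.0
    longest = min(len(first), len(second))
    for length in range(min_overlap, longest + 1):
        tail = first[len(first) - length:]
        head = second[:length]
        norm = math.sqrt(sum(x * x for x in tail) * sum(y * y for y in head))
        if norm == 0.0:
            continue  # a silent segment carries no information
        match = sum(x * y for x, y in zip(tail, head)) / norm
        if match > best_match:
            best_length, best_match = length, match
    return best_length, min(1.0, best_match)
```

Both signals are held in memory in full here as well, which mirrors the memory behaviour described above.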
The audio index tools were developed to search for a short sound bite (a jingle) in a large audio archive. The implementation is derived from the paper "A Highly Robust Audio Fingerprinting System", Jaap Haitsma and Ton Kalker, ISMIR 2002.
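The core idea of the cited paper can be sketched as follows: each frame of audio is reduced to a short bit string derived from the sign of energy differences between neighbouring frequency bands and consecutive frames, and two clips are compared by the fraction of differing bits. The code below is only an illustration of that scheme under stated assumptions; it takes precomputed band energies as input, the function names are hypothetical, and it does not claim to reflect how the ismir tools build or query their index files.

```python
def sub_fingerprints(band_energies):
    """Turn per-frame band energies into one integer bit string per frame.

    band_energies[n][m] is the energy of frequency band m in frame n
    (how these energies are computed is outside this sketch).
    Bit m of frame n is 1 when
    (E[n][m] - E[n][m+1]) - (E[n-1][m] - E[n-1][m+1]) > 0.
    """
    prints = []
    for prev, cur in zip(band_energies, band_energies[1:]):
        bits = 0
        for m in range(len(cur) - 1):
            diff = (cur[m] - cur[m + 1]) - (prev[m] - prev[m + 1])
            bits = (bits << 1) | (1 if diff > 0 else 0)
        prints.append(bits)
    return prints

def bit_error_rate(a, b, bits_per_print):
    """Fraction of differing bits between two equally long fingerprint sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("fingerprint sequences must be non-empty and equally long")
    errors = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return errors / (len(a) * bits_per_print)
```

A low bit error rate between a query clip and a stretch of the archive suggests a match; the threshold to use is a tuning decision not covered here.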
- Bolette Ammitzbøll Jurik and Jesper Sindahl Nielsen: Audio Quality Assurance: An Application of Cross Correlation. In: iPRES 2012 Proceedings of the 9th International Conference on Preservation of Digital Objects. Toronto 2012, 144-149. ISBN 978-0-9917997-0-1
- xcorrSound: waveform-compare New Audio Quality Assurance Tool
- Sound Challenge: And the Easter Egg goes to ...
- Developing an Audio QA workflow using Hadoop: Part I
- Developing an Audio QA workflow using Hadoop: Part II
- Scape Demonstration: Migration of audio using xcorrSound
- Audio Quality Assurance. An application of cross correlation
- Migration of audio files using Hadoop - and Taverna - and xcorrSound waveform-compare
- This work was partially supported by the SCAPE project. The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137)
- XCORRSOUND is copyright 2012 State and University Library, Denmark released under GPLv2, see ./COPYING or http://www.gnu.org/licenses/gpl-2.0.html