This blog summarizes the work that I have done in the field of imaging science.
Trained musicians transcribe music into written notation all the time, but even for professionals, transcribing complex music can be an arduous task. Performing this task with a computer would make music transcription much easier, which is why it is a topic of active research. It is also a challenging problem: music is very complex, and performing computational analysis with the accuracy of the human ear has yet to be accomplished.
The same image processing techniques used to analyze satellite images can, incidentally, be used to analyze musical pieces. An image is a 3-dimensional representation of the information in a scene: color, length, and width. Music, however, is a 2-dimensional representation of information: pitch and time. To make music analogous to an image, we can wrap the temporal dimension to synthesize another dimension.
Imagine drawing a line through each pixel, starting at the top left and proceeding down in a backwards “S” motion. Laying the time steps of the music along that line is how I can create an image from a musical selection.
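As a rough illustration of that wrapping, here is a minimal Python sketch. It is not the code used in this project: the function name `serpentine_wrap`, the choice of row width, and the zero padding of a short final row are all assumptions made for the example. Each item in the input stands for one time step (for instance, one analysis frame), and every other row is reversed so that neighbors in time stay neighbors in the image.

```python
def serpentine_wrap(samples, width):
    """Lay a 1-D sequence into rows of `width`, reversing every other row.

    Hypothetical helper for illustration: the path starts at the top left,
    runs left to right, drops down a row, runs right to left, and so on.
    """
    rows = []
    for start in range(0, len(samples), width):
        row = list(samples[start:start + width])
        # Assumption: pad a short final row with zeros to keep the image rectangular.
        row += [0] * (width - len(row))
        # Odd-numbered rows are traversed right to left.
        if (start // width) % 2 == 1:
            row.reverse()
        rows.append(row)
    return rows


# Ten time steps wrapped into an image four pixels wide.
image = serpentine_wrap(list(range(10)), width=4)
for row in image:
    print(row)
```

The second printed row comes out in reverse order, so the last time step of one row sits directly above or below the first time step of the next.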
Now that I have music represented as an image, I can use an image analysis technique called spectral unmixing. This algorithm estimates the percent abundance of every candidate material that could be present in each pixel. Applying this method to music tells us which instruments and which notes are played at each instant in time, which would allow the musical selection to be transcribed by a computer. So far, I have attempted to perform the analysis using the Short-Time Fourier Transform (STFT) in MATLAB. In the future, other methods could be explored, possibly the wavelet transform or ENVI, a hyperspectral image processing software package.
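To make the idea of unmixing concrete, here is a small Python sketch under stated assumptions. It treats one "pixel" as the magnitude spectrum of a single STFT frame and assumes that a library of reference spectra (one per note or instrument) is already known. The names `unmix`, `note_a`, and `note_b`, the toy four-bin spectra, and the use of multiplicative updates for the non-negative least-squares fit are illustrative choices, not the method used in this project.

```python
def unmix(pixel, endmembers, iterations=200):
    """Estimate non-negative abundances a[k] so that
    sum_k a[k] * endmembers[k] approximates `pixel`.

    Illustrative sketch: uses multiplicative updates, which keep the
    abundances non-negative when all inputs are non-negative
    (as magnitude spectra are).
    """
    n = len(endmembers)
    bins = range(len(pixel))
    abundances = [1.0 / n] * n   # start from an even mix
    eps = 1e-12                  # guards against division by zero
    for _ in range(iterations):
        # Spectrum predicted by the current abundance estimate.
        model = [sum(abundances[k] * endmembers[k][f] for k in range(n))
                 for f in bins]
        for k in range(n):
            num = sum(endmembers[k][f] * pixel[f] for f in bins)
            den = sum(endmembers[k][f] * model[f] for f in bins)
            abundances[k] *= num / (den + eps)
    return abundances


# Toy reference spectra for two hypothetical notes (four frequency bins each).
note_a = [1.0, 0.5, 0.25, 0.0]
note_b = [0.0, 0.5, 0.0, 1.0]

# A frame containing 70% of note A and 30% of note B.
frame = [0.7 * a + 0.3 * b for a, b in zip(note_a, note_b)]

abundances = unmix(frame, [note_a, note_b])
total = sum(abundances)
percent = [100.0 * a / total for a in abundances]
print([round(p, 1) for p in percent])  # should be close to 70 and 30
```

Repeating this for every frame would give an abundance for each note at each instant in time, which is the information a transcription needs. Real recordings are far messier than this toy case, since the reference spectra are not known exactly and overlap heavily.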