Literature Review

Aumit Leon edited this page Dec 8, 2017 · 5 revisions

In preparing our approach to the Million Song Dataset, we reviewed the literature related to the dataset.

One of the challenges in applying machine learning to the analysis of music is the lack of available data. The following papers highlight different aspects of, and approaches to, this data.


The Million Song Dataset

This paper was published alongside the original release of the dataset several years ago. Before the release of this dataset, Music Information Retrieval (MIR) was a difficult task, primarily because of licensing and the proprietary nature of the music industry. The paper summarizes some of the approaches taken in compiling the dataset, as well as some of the experiments that researchers can run. Some of these suggestions have informed our project trajectory. http://ismir2011.ismir.net/papers/OS6-1.pdf

Music Genre Classification with the Million Song Dataset

This paper describes an interesting approach to genre classification and provides insight into the different features that might be useful for the task. In particular, the authors conclude that timbre features work, that bag-of-words lyric features work, and that lyric data gives different information from audio data. They also conclude that combining audio and lyric features, along with tempo and loudness, is an effective mechanism for genre classification. http://www.ee.columbia.edu/~dliang/files/FINAL.pdf
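
To make the idea of combining audio and lyric features concrete, here is a minimal sketch that joins a bag-of-words lyric vector with an averaged timbre vector, tempo, and loudness into a single feature vector. This is only an illustration under our own assumptions, not the paper's implementation: the function names, the vocabulary, and all values are hypothetical, and the averaging of timbre segments is a simplification we chose for brevity.

```python
from collections import Counter


def bag_of_words(lyrics, vocabulary):
    """Count how often each vocabulary word occurs in the lyrics."""
    counts = Counter(lyrics.lower().split())
    return [float(counts[word]) for word in vocabulary]


def mean_timbre(segments):
    """Average per-segment timbre vectors into one track-level vector."""
    n = len(segments)
    return [sum(column) / n for column in zip(*segments)]


def combine_features(lyrics, vocabulary, segments, tempo, loudness):
    """Concatenate lyric, timbre, tempo, and loudness features."""
    return (
        bag_of_words(lyrics, vocabulary)
        + mean_timbre(segments)
        + [tempo, loudness]
    )


# Toy, made-up inputs (hypothetical, not taken from the dataset).
vocabulary = ["love", "night", "road"]
segments = [[1.0, 2.0], [3.0, 4.0]]  # two segments, two timbre dimensions
features = combine_features(
    "Love love all night", vocabulary, segments, tempo=120.0, loudness=-7.5
)
print(features)
```

A vector like this could then be passed to any standard classifier; the choice of classifier is a separate question from how the features are combined.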

Deep Content Based Recommendations

http://papers.nips.cc/paper/5004-deep-content-based-music-recommendation.pdf