
Dealing with the Music of the World: Indexing Content-Based Music Similarity Models for Fast Retrieval in Massive Databases

Author: Dr. Dominik Schnitzer
Publisher: CreateSpace Independent Publishing Platform

Book Details
ISBN / ASIN: 1477494154
ISBN-13: 9781477494158
Availability: Usually ships in 24 hours
Sales Rank: 6,989,377
Marketplace: United States 🇺🇸

Description

This thesis shows how to develop an automatic, large-scale music recommendation system. To achieve this goal, we solve three problems that prevent the currently top-performing class of content-based music similarity algorithms from being used as a recommendation engine in huge databases with millions of songs:

  • First, we show how to correctly use the non-vectorial music similarity features with their non-metric divergences in centroid-computing algorithms. All previous approaches had to artificially vectorize the data before they were able to work with the features.
  • Second, we show how the problem of 'hubs' can be alleviated. Hubs are objects in a recommendation system that are retrieved as nearest neighbors unusually often. The examined music recommendation methods are especially prone to hubs, which significantly decreases their retrieval quality. We also identify hubs as a general problem in machine learning and show the beneficial effects of our method on a large number of general, publicly available machine learning collections.
  • Third, we present a new method to speed up music recommendation queries. The method uses a filter-and-refine system layout. It achieves very high retrieval accuracy and speeds up queries by a factor of 10 to 40 compared to a linear scan. The method enables us to use the music similarity methods with very large databases.
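
To make the notion of a hub concrete, the sketch below counts how often each item shows up in the k-nearest-neighbor lists of all other items. It is only an illustration under stated assumptions: the random vectors and the Euclidean distance are stand-ins chosen for brevity (the thesis works with non-vectorial features and non-metric divergences), and the names `k_occurrence` and `euclidean` are hypothetical, not taken from the thesis.

```python
import random
from collections import Counter


def euclidean(a, b):
    # Stand-in distance (assumption); the thesis uses non-metric divergences.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def k_occurrence(points, k, dist):
    """Count how often each item appears in other items' k-nearest-neighbor lists."""
    counts = Counter({i: 0 for i in range(len(points))})
    for i, p in enumerate(points):
        # Rank all other items by distance to the current item.
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: dist(p, points[j]))
        for j in others[:k]:
            counts[j] += 1
    return counts


# Synthetic data (assumption): 200 random points in 50 dimensions.
rng = random.Random(0)
points = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(200)]

counts = k_occurrence(points, k=5, dist=euclidean)
top_item, top_count = counts.most_common(1)[0]
print(f"most frequent neighbor: item {top_item}, retrieved {top_count} times")
```

Every item appears in k neighbor lists on average, so an item whose count lies far above k would be a candidate hub in this simplified picture.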

Finally, we merge all three methods into a large-scale, high-quality music recommendation prototype: the system computes (i) a natural clustering of the music similarity features in order to (ii) apply the introduced hub-reducing method and (iii) use the filter-and-refine method for fast retrieval. The prototype, called 'Wolperdinger', operates on a collection of 2.3 million songs and answers recommendation queries in a fraction of a second. It is the largest content-based music recommendation system published to date.
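
The general filter-and-refine idea mentioned in the description can be sketched as follows. This is a generic illustration, not the method from the thesis: the pivot-based filter, the Euclidean stand-in for the expensive similarity, and all names (`expensive_dist`, `build_index`, `query`, `n_candidates`) are assumptions made for the example.

```python
import random


def expensive_dist(a, b):
    # Stand-in for a costly similarity computation (assumption).
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def build_index(items, pivots, dist):
    # Precompute each item's distances to a small set of pivot items.
    return [[dist(x, p) for p in pivots] for x in items]


def query(q, items, index, pivots, dist, k, n_candidates):
    # Map the query into the same pivot space (one expensive call per pivot).
    q_vec = [dist(q, p) for p in pivots]

    def cheap(i):
        # Cheap proxy: largest difference between pivot distances.
        return max(abs(a - b) for a, b in zip(q_vec, index[i]))

    # Filter: keep only the most promising candidates under the cheap proxy.
    candidates = sorted(range(len(items)), key=cheap)[:n_candidates]
    # Refine: rerank the candidates with the expensive distance.
    candidates.sort(key=lambda i: dist(q, items[i]))
    return candidates[:k]


# Synthetic data (assumption): 500 random points in 20 dimensions.
rng = random.Random(1)
items = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(500)]
pivots = items[:8]
index = build_index(items, pivots, expensive_dist)

result = query(items[0], items, index, pivots, expensive_dist, k=5, n_candidates=50)
print(result)
```

In this sketch the expensive distance is evaluated only for the pivots and the `n_candidates` survivors of the filter step instead of for every item. A larger `n_candidates` trades speed for accuracy, and setting it to the collection size reduces the query to a linear scan.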