Imagine hosting a party or gathering. Choosing music that suits every attendee is tedious 👯. Should I go to each guest, ask about their music preferences, write them down, and then search for a playlist?
That got me working on a music recommendation algorithm for large groups of people, using Spotify as a music provider.
- People attending a party or gathering are able to share music with the host.
- The party host can set specific preferences (e.g. a preference for dance music).
- Using unsupervised ML techniques, the algorithm identifies the distinct music tastes present in the group and provides relevant tracks so that all attendees enjoy their time.
- Web scraping to create a large music artist graph (with the connections between artists).
- Node2Vec to create artist embeddings from that graph.
- Auto-adjusted DBSCAN.
- t-SNE for dimensionality reduction.
- k-NN with cosine similarity between track representation vectors.
⭕ 👤 → 🎶 Music recommendation for individuals has long been available on major platforms such as Spotify and YouTube.
✅ 👥 → 🎶 The issue at hand is how to recommend music for a dynamic group, where people can enter or leave at any time.
The foundation of the algorithm is the track representation vector.
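Artist embeddings are created by web scraping Wikipedia. For each artist, the scraper records which other artists they have collaborated with, whom they have shared a concert with, and which artists are simply mentioned on the same page. These connections form the artist graph from which Node2Vec learns the embeddings.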
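Node2Vec learns embeddings from random walks over a graph, which are then fed to a skip-gram model. The sketch below shows only the walk-generation step, and it uses uniform transitions (equivalent to Node2Vec with `p = q = 1`), not the biased walks of full Node2Vec. The graph, the artist names, and the function `random_walks` are hypothetical and exist only for illustration; skip-gram training is omitted.

```python
import random

# Hypothetical toy artist graph: an edge means the two artists are linked
# on a Wikipedia page (collaboration, shared concert, or plain mention).
ARTIST_GRAPH = {
    "Artist A": ["Artist B", "Artist C"],
    "Artist B": ["Artist A", "Artist C"],
    "Artist C": ["Artist A", "Artist B", "Artist D"],
    "Artist D": ["Artist C"],
}


def random_walks(graph, walk_length=5, walks_per_node=3, seed=0):
    """Generate uniform random walks starting from every node.

    Assumption: uniform transitions (Node2Vec with p = q = 1).
    """
    rng = random.Random(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = graph[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


walks = random_walks(ARTIST_GRAPH)
```

Each walk plays the role of a "sentence" of artist names; artists that co-occur in many walks end up with nearby embeddings once a skip-gram model is trained on them.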
To plot the artist embeddings in 2D space, the t-SNE dimensionality reduction algorithm is used.
Genre embeddings are created using sentence transformers with a BERT-based language model.
The group's music profile is represented by one or more vectors that are later used for recommendations.
To do that, the track representation vectors are first reduced in dimensionality with t-SNE, and the different music tastes are then identified with the DBSCAN clustering algorithm.
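As a rough illustration of the clustering step, here is a minimal pure-Python DBSCAN applied to points that are assumed to be already reduced to 2D. How the project "auto-adjusts" DBSCAN is not described here, so `estimate_eps` is purely an assumption: it picks `eps` as the median distance to the k-th nearest neighbor. The sample points and all function names are hypothetical.

```python
import math


def estimate_eps(points, k):
    """Assumed heuristic: median distance to the k-th nearest neighbor."""
    kth = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kth.append(dists[k - 1])
    kth.sort()
    return kth[len(kth) // 2]


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)

    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region(j)
            if len(j_neighbors) >= min_samples:  # j is a core point: expand
                queue.extend(j_neighbors)
    return labels


# Hypothetical 2D points: two tight groups of tastes and one outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
eps = estimate_eps(points, k=2)
labels = dbscan(points, eps, min_samples=3)
```

With this toy data the two groups come out as separate clusters and the isolated point is labeled as noise, which is the behavior that makes DBSCAN attractive when the number of music tastes in a group is not known in advance.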
Process:
- DBSCAN yields M clusters, one per music taste.
- Find the weighted centroid of each cluster based on each user's track weight. (The weight is related to how many tracks each user submitted.)
- The party/bar/club owner can set party settings to fine-tune results to specific preferences, e.g. highly danceable tracks, or only techno and deep house tracks.
- For each of the M profiles there are K candidate matches. These are filtered based on the settings, and the neighbors with the highest scores are kept.
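The process above can be sketched for a single cluster as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the weighting scheme (each track weighted by `1 / number of tracks its user submitted`, so every user has equal total influence), the catalog fields `genre` and `danceability`, and all function and parameter names are hypothetical.

```python
import math
from collections import Counter


def weighted_centroid(cluster):
    """cluster: list of (vector, user) pairs.

    Assumption: a track's weight is 1 / (tracks submitted by its user).
    """
    counts = Counter(user for _, user in cluster)
    weights = [1.0 / counts[user] for _, user in cluster]
    total = sum(weights)
    dim = len(cluster[0][0])
    return [
        sum(w * vec[d] for (vec, _), w in zip(cluster, weights)) / total
        for d in range(dim)
    ]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(profile, catalog, k, keep, allowed_genres=None, min_danceability=0.0):
    """Take the K nearest tracks, filter by party settings, keep the best."""
    ranked = sorted(
        catalog,
        key=lambda t: cosine_similarity(profile, t["vector"]),
        reverse=True,
    )
    neighbors = ranked[:k]  # K candidate matches for this profile
    filtered = [
        t for t in neighbors
        if (allowed_genres is None or t["genre"] in allowed_genres)
        and t["danceability"] >= min_danceability
    ]
    return filtered[:keep]  # already ordered by similarity score


# Hypothetical cluster: "ana" submitted two tracks, "ben" one.
cluster = [([1.0, 0.0], "ana"), ([1.0, 0.0], "ana"), ([0.0, 1.0], "ben")]
profile = weighted_centroid(cluster)

catalog = [
    {"name": "t1", "vector": [1.0, 1.0], "genre": "techno", "danceability": 0.9},
    {"name": "t2", "vector": [1.0, 0.8], "genre": "pop", "danceability": 0.8},
    {"name": "t3", "vector": [1.0, 0.0], "genre": "techno", "danceability": 0.7},
    {"name": "t4", "vector": [-1.0, -1.0], "genre": "techno", "danceability": 0.95},
]
picks = recommend(profile, catalog, k=3, keep=2,
                  allowed_genres={"techno"}, min_danceability=0.6)
```

In this toy case the centroid sits halfway between the two users even though one of them submitted twice as many tracks, and the pop track is dropped by the genre setting after the K nearest neighbors are selected.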
Because different versions of the same track exist with identical vectors, a curator has been created that operates like a DJ: it keeps the most relevant version of each track and sorts the recommended tracks by energy and BPM.
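A minimal sketch of such a curator is shown below. What counts as the "most relevant" version is not specified here, so this example assumes a hypothetical `popularity` field and keeps the highest value; the ascending sort order by energy and then BPM is likewise an assumption, as are the track names and the function `curate`.

```python
def curate(tracks):
    """Deduplicate tracks that share a vector, then order them like a DJ set.

    Assumptions: 'popularity' decides which version is kept, and the set
    is ordered by ascending energy, then ascending BPM.
    """
    best = {}
    for track in tracks:
        key = tuple(track["vector"])  # identical vectors = same track
        if key not in best or track["popularity"] > best[key]["popularity"]:
            best[key] = track
    return sorted(best.values(), key=lambda t: (t["energy"], t["bpm"]))


# Hypothetical recommendations containing two versions of the same song.
tracks = [
    {"name": "Song X (Remastered)", "vector": [0.1, 0.9],
     "popularity": 40, "energy": 0.8, "bpm": 128},
    {"name": "Song X", "vector": [0.1, 0.9],
     "popularity": 75, "energy": 0.8, "bpm": 128},
    {"name": "Song Y", "vector": [0.7, 0.2],
     "popularity": 60, "energy": 0.5, "bpm": 110},
]
playlist = curate(tracks)
```

Here the remastered duplicate is dropped, and the calmer track is queued before the more energetic one.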
A backend has been built with FastAPI that you can simply run to have it ready for your application.
All the necessary endpoints are ready for consumption.