We released a dataset of one million user-created playlists from the Spotify platform, dubbed the Million Playlist Dataset (MPD). The 1,000,000 playlists were created by users between January 2010 and October 2017. For each playlist, the dataset includes its title and the list of tracks (with track, album, and artist names), plus additional metadata such as Spotify URIs and the playlist's number of followers. The dataset has a size of 5.39 GB. It is suited to building and evaluating recommendation algorithms, studying user behavior in music consumption, and understanding how playlists evolve over time. Researchers and developers use it to train and evaluate machine learning models for music streaming applications.
Publications Citing This Dataset:
Ching-Wei Chen, Paul Lamere, Markus Schedl, and Hamed Zamani. 2018. Recsys challenge 2018: automatic music playlist continuation. In Proceedings of the 12th ACM Conference on Recommender Systems (RecSys '18). Association for Computing Machinery, New York, NY, USA, 527–528. https://doi.org/10.1145/3240323.3240342
Murciego, A.L., Jiménez-Bravo, D.M., Sales Mendes, A., López Baptista, V.F., Moreno-García, M.N. (2023). Auto-Tagger of Contextual Activity Tags for Music Tracks. In: de la Iglesia, D.H., de Paz Santana, J.F., López Rivero, A.J. (eds) New Trends in Disruptive Technologies, Tech Ethics and Artificial Intelligence. DiTTEt 2022. Advances in Intelligent Systems and Computing, vol 1430. Springer, Cham. https://doi.org/10.1007/978-3-031-14859-0_10
Gabbolini, G., Bridge, D. (2023). Predicting the Listening Contexts of Music Playlists Using Knowledge Graphs. In: , et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13980. Springer, Cham. https://doi.org/10.1007/978-3-031-28244-7_21
Schedl, M., Knees, P., McFee, B., Bogdanov, D. (2022). Music Recommendation Systems: Techniques, Use Cases, and Challenges. In: Ricci, F., Rokach, L., Shapira, B. (eds) Recommender Systems Handbook. Springer, New York, NY. https://doi.org/10.1007/978-1-0716-2197-4_24
Sunitha, M., Adilakshmi, T., Unissa, M. (2022). Hybrid Deep Learning-Based Music Recommendation System. In: Pandian, A.P., Fernando, X., Haoxiang, W. (eds) Computer Networks, Big Data and IoT. Lecture Notes on Data Engineering and Communications Technologies, vol 117. Springer, Singapore. https://doi.org/10.1007/978-981-19-0898-9_41
Giovanni Gabbolini and Derek Bridge. 2022. A User-Centered Investigation of Personal Music Tours. In Proceedings of the 16th ACM Conference on Recommender Systems (RecSys '22). Association for Computing Machinery, New York, NY, USA, 25–34. https://doi.org/10.1145/3523227.3546776
Variables:
Name            Description
name            name
collaborative   collaborative, e.g. "false"
pid             pid
modified_at     modified_at
num_albums      num_albums
num_tracks      num_tracks
num_followers   num_followers
num_edits       num_edits
duration_ms     duration_ms
num_artists     num_artists
pos             pos (for tracks)
artist_name     artist_name (for tracks)
track_uri       track_uri (for tracks)
artist_uri      artist_uri (for tracks)
track_name      track_name (for tracks)
album_uri       album_uri (for tracks)
duration_ms     duration_ms (for tracks)
album_name      album_name (for tracks)
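The variables listed above fall into two levels: playlist-level fields and fields marked "(for tracks)". The following minimal Python sketch illustrates one way to flatten such records into one row per track. It assumes the data is stored as JSON with a top-level "playlists" list and a nested "tracks" list per playlist; those two key names, the helper flatten_playlists, and every value in the sample are assumptions made for illustration and are not taken from this record.

```python
import json

# Hypothetical sample in the ASSUMED layout: a top-level "playlists" list,
# each playlist holding its own fields plus a nested "tracks" list.
# All values below are invented for illustration.
SAMPLE = """
{
  "playlists": [
    {
      "name": "road trip",
      "collaborative": "false",
      "pid": 0,
      "modified_at": 1493424000,
      "num_albums": 2,
      "num_tracks": 2,
      "num_followers": 1,
      "num_edits": 1,
      "duration_ms": 420000,
      "num_artists": 2,
      "tracks": [
        {
          "pos": 0,
          "artist_name": "Artist A",
          "track_uri": "spotify:track:aaa",
          "artist_uri": "spotify:artist:a1",
          "track_name": "Song One",
          "album_uri": "spotify:album:x1",
          "duration_ms": 200000,
          "album_name": "Album X"
        },
        {
          "pos": 1,
          "artist_name": "Artist B",
          "track_uri": "spotify:track:bbb",
          "artist_uri": "spotify:artist:b1",
          "track_name": "Song Two",
          "album_uri": "spotify:album:y1",
          "duration_ms": 220000,
          "album_name": "Album Y"
        }
      ]
    }
  ]
}
"""


def flatten_playlists(data):
    """Return one dict per (playlist, track) pair.

    The track-level duration_ms is renamed to track_duration_ms so it
    does not collide with the playlist-level field of the same name.
    """
    rows = []
    for playlist in data["playlists"]:
        for track in playlist["tracks"]:
            rows.append({
                "pid": playlist["pid"],
                "playlist_name": playlist["name"],
                "pos": track["pos"],
                "track_uri": track["track_uri"],
                "track_name": track["track_name"],
                "artist_name": track["artist_name"],
                "album_name": track["album_name"],
                "track_duration_ms": track["duration_ms"],
            })
    return rows


data = json.loads(SAMPLE)
rows = flatten_playlists(data)
for row in rows:
    print(row["pid"], row["pos"], row["artist_name"], row["track_name"])
```

A flat table of this kind is a convenient starting point for playlist-track co-occurrence counts. With real files, json.load on an open file handle would replace json.loads(SAMPLE), provided the files follow the assumed layout.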
Details:
Publisher: Association for Computing Machinery (ACM); New York, NY, United States