We released a dataset of one million user-created playlists from the Spotify platform, dubbed the Million Playlist Dataset (MPD). For each playlist, the dataset includes its title, the list of tracks (with track, album, and artist names), and additional metadata such as Spotify URIs and the playlist's number of followers. The dataset is 5.39 GB in size, and its 1,000,000 playlists were created by users on the Spotify platform between January 2010 and October 2017. It is suited to building and evaluating recommendation algorithms, studying user behavior in music consumption, and understanding how playlists evolve over time. Researchers and developers use it to improve machine learning models for music streaming applications, for example automatic playlist continuation and personalized recommendation.
Publications Citing This Dataset:
Ching-Wei Chen, Paul Lamere, Markus Schedl, and Hamed Zamani. 2018. Recsys challenge 2018: automatic music playlist continuation. In Proceedings of the 12th ACM Conference on Recommender Systems (RecSys '18). Association for Computing Machinery, New York, NY, USA, 527–528. https://doi.org/10.1145/3240323.3240342
Murciego, A.L., Jiménez-Bravo, D.M., Sales Mendes, A., López Baptista, V.F., Moreno-García, M.N. (2023). Auto-Tagger of Contextual Activity Tags for Music Tracks. In: de la Iglesia, D.H., de Paz Santana, J.F., López Rivero, A.J. (eds) New Trends in Disruptive Technologies, Tech Ethics and Artificial Intelligence. DiTTEt 2022. Advances in Intelligent Systems and Computing, vol 1430. Springer, Cham. https://doi.org/10.1007/978-3-031-14859-0_10
Gabbolini, G., Bridge, D. (2023). Predicting the Listening Contexts of Music Playlists Using Knowledge Graphs. In: , et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13980. Springer, Cham. https://doi.org/10.1007/978-3-031-28244-7_21
Schedl, M., Knees, P., McFee, B., Bogdanov, D. (2022). Music Recommendation Systems: Techniques, Use Cases, and Challenges. In: Ricci, F., Rokach, L., Shapira, B. (eds) Recommender Systems Handbook. Springer, New York, NY. https://doi.org/10.1007/978-1-0716-2197-4_24
Sunitha, M., Adilakshmi, T., Unissa, M. (2022). Hybrid Deep Learning-Based Music Recommendation System. In: Pandian, A.P., Fernando, X., Haoxiang, W. (eds) Computer Networks, Big Data and IoT. Lecture Notes on Data Engineering and Communications Technologies, vol 117. Springer, Singapore. https://doi.org/10.1007/978-981-19-0898-9_41
Giovanni Gabbolini and Derek Bridge. 2022. A User-Centered Investigation of Personal Music Tours. In Proceedings of the 16th ACM Conference on Recommender Systems (RecSys '22). Association for Computing Machinery, New York, NY, USA, 25–34. https://doi.org/10.1145/3523227.3546776
Variables:
Name            Description
name            name
collaborative   collaborative, e.g. "false"
pid             pid
modified_at     modified_at
num_albums      num_albums
num_tracks      num_tracks
num_followers   num_followers
num_edits       num_edits
duration_ms     duration_ms
num_artists     num_artists
pos             pos (for tracks)
artist_name     artist_name (for tracks)
track_uri       track_uri (for tracks)
artist_uri      artist_uri (for tracks)
track_name      track_name (for tracks)
album_uri       album_uri (for tracks)
duration_ms     duration_ms (for tracks)
album_name      album_name (for tracks)
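The variables above fall into two groups: playlist-level fields and per-track fields (those marked "for tracks"). The following Python sketch shows one way such a record could be flattened into one row per track and checked for internal consistency. It is an illustration under stated assumptions, not the dataset's official tooling: the nesting of track fields under a "tracks" key, the helper names flatten_playlist and check_counts, the reading of num_artists and num_albums as counts of distinct URIs, and every sample value are assumptions made for this example.

```python
import json

# Hypothetical sample record. All values are made up for illustration, and
# the "tracks" key that nests the per-track fields is an assumption.
SAMPLE_JSON = """
{
  "name": "road trip",
  "collaborative": "false",
  "pid": 0,
  "modified_at": 1500000000,
  "num_albums": 2,
  "num_tracks": 2,
  "num_followers": 1,
  "num_edits": 1,
  "duration_ms": 400000,
  "num_artists": 2,
  "tracks": [
    {"pos": 1, "artist_name": "Artist B", "track_uri": "spotify:track:bbb",
     "artist_uri": "spotify:artist:bbb", "track_name": "Song B",
     "album_uri": "spotify:album:bbb", "duration_ms": 220000,
     "album_name": "Album B"},
    {"pos": 0, "artist_name": "Artist A", "track_uri": "spotify:track:aaa",
     "artist_uri": "spotify:artist:aaa", "track_name": "Song A",
     "album_uri": "spotify:album:aaa", "duration_ms": 180000,
     "album_name": "Album A"}
  ]
}
"""


def flatten_playlist(playlist):
    """Return one dict per track, ordered by pos, tagged with pid and name."""
    rows = []
    for track in sorted(playlist["tracks"], key=lambda t: t["pos"]):
        row = {"pid": playlist["pid"], "playlist_name": playlist["name"]}
        row.update(track)  # copies the track-level fields, incl. duration_ms
        rows.append(row)
    return rows


def check_counts(playlist):
    """Compare playlist-level totals with values derived from the tracks.

    Assumes num_artists / num_albums count distinct URIs and that the
    playlist duration_ms is the sum of the track durations.
    """
    tracks = playlist["tracks"]
    return {
        "num_tracks": playlist["num_tracks"] == len(tracks),
        "num_artists": playlist["num_artists"]
        == len({t["artist_uri"] for t in tracks}),
        "num_albums": playlist["num_albums"]
        == len({t["album_uri"] for t in tracks}),
        "duration_ms": playlist["duration_ms"]
        == sum(t["duration_ms"] for t in tracks),
    }


playlist = json.loads(SAMPLE_JSON)
rows = flatten_playlist(playlist)
checks = check_counts(playlist)

if __name__ == "__main__":
    for row in rows:
        print(row["pid"], row["pos"], row["artist_name"], row["track_name"])
    print(checks)
```

Flattening to one row per track makes the data easy to load into a table or data frame, while the consistency check is a cheap guard against truncated or malformed records before any modeling is done.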
Details:
Publisher: Association for Computing Machinery (ACM); New York, NY, United States