<< Back
Last.fm Dataset - 360K users
DOWNLOAD lastfm-dataset-360K.tar.gz (~543Mb)
====== README ====== Version 1.2, March 2010 . What is this? This dataset contains <user, artist, plays> tuples (for ~360,000 users) collected from Last.fm API, using the user.getTopArtists() method. . Files: usersha1-artmbid-artname-plays.tsv (MD5: be672526eb7c69495c27ad27803148f1) usersha1-profile.tsv (MD5: 51159d4edf6a92cb96f87768aa2be678) mbox_sha1sum.py (MD5: feb3485eace85f3ba62e324839e6ab39) . Data Statistics: File usersha1-artmbid-artname-plays.tsv: Total Lines: 17,559,530 Unique Users: 359,347 Artists with MBID: 186,642 Artists without MBID: 107,373 . Data Format: The data is formatted one entry per line as follows (tab separated "\t"): File usersha1-artmbid-artname-plays.tsv: user-mboxsha1 \t musicbrainz-artist-id \t artist-name \t plays File usersha1-profile.tsv: user-mboxsha1 \t gender (m|f|empty) \t age (int|empty) \t country (str|empty) \t signup (date|empty) . Example: usersha1-artmbid-artname-plays.tsv: 000063d3fe1cf2ba248b9e3c3f0334845a27a6be \t a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432 \t u2 \t 31 ... usersha1-profile.tsv 000063d3fe1cf2ba248b9e3c3f0334845a27a6be \t m \t 19 \t Mexico \t Apr 28, 2008 ... . License: The data contained in lastfm-dataset-360K.tar.gz is distributed with permission of Last.fm. The data is made available for non-commercial use. Those interested in using the data or web services in a commercial context should contact: partners [at] last [dot] fm For more information see Last.fm terms of service . Acknowledgements: Thanks to Last.fm for providing the access to this data via their web services. Special thanks to Norman Casagrande. . References: When using this dataset you must reference the Last.fm webpage. Optionally (not mandatory at all!), you can cite Chapter 3 of this book @book{Celma:Springer2010, author = {Celma, O.}, title = {{Music Recommendation and Discovery in the Long Tail}}, publisher = {Springer}, year = {2010} } . Contact: This data was collected by Òscar Celma @ MTG/UPF, during Fall 2008 and cleaned sometime during 2009