Last.fm Dataset - 360K users
DOWNLOAD lastfm-dataset-360K.tar.gz (~543Mb)
======
README
======
Version 1.2, March 2010
. What is this?
This dataset contains <user, artist, plays> tuples (for ~360,000 users) collected from the Last.fm API,
using the user.getTopArtists() method.
. Files:
usersha1-artmbid-artname-plays.tsv (MD5: be672526eb7c69495c27ad27803148f1)
usersha1-profile.tsv (MD5: 51159d4edf6a92cb96f87768aa2be678)
mbox_sha1sum.py (MD5: feb3485eace85f3ba62e324839e6ab39)
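The MD5 values listed above can be used to check the extracted files. The following is a minimal sketch, assuming the files sit in the current directory; md5_of_file and matches_expected are hypothetical helper names, not part of the dataset.

```python
import hashlib

# MD5 values copied from the file list in this README.
EXPECTED_MD5 = {
    "usersha1-artmbid-artname-plays.tsv": "be672526eb7c69495c27ad27803148f1",
    "usersha1-profile.tsv": "51159d4edf6a92cb96f87768aa2be678",
    "mbox_sha1sum.py": "feb3485eace85f3ba62e324839e6ab39",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for the large plays file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's MD5 equals the expected hex string."""
    return md5_of_file(path) == expected_hex.lower()
```

For example, matches_expected("usersha1-profile.tsv", EXPECTED_MD5["usersha1-profile.tsv"]) should be True for an intact copy.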
. Data Statistics:
File usersha1-artmbid-artname-plays.tsv:
Total Lines: 17,559,530
Unique Users: 359,347
Artists with MBID: 186,642
Artists without MBID: 107,373
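These counts can be recomputed from the plays file. The following is a rough sketch, assuming an artist is identified by its MBID when one is present and by its name otherwise. The README does not say how the original counts were derived, so the results may differ; summarize_plays is a hypothetical helper.

```python
def summarize_plays(lines):
    """Count lines, unique users, and distinct artists with/without an MBID."""
    total_lines = 0
    users = set()
    artists_with_mbid = set()
    artists_without_mbid = set()
    for line in lines:
        total_lines += 1
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 4:
            continue  # assumption: skip malformed rows instead of failing
        user, mbid, name, _plays = fields
        users.add(user)
        if mbid:
            artists_with_mbid.add(mbid)
        else:
            # Assumption: artists lacking an MBID are told apart by name.
            artists_without_mbid.add(name)
    return {
        "total_lines": total_lines,
        "unique_users": len(users),
        "artists_with_mbid": len(artists_with_mbid),
        "artists_without_mbid": len(artists_without_mbid),
    }
```

The function accepts any iterable of lines, so an open file handle for usersha1-artmbid-artname-plays.tsv can be passed directly.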
. Data Format:
The data is formatted one entry per line as follows (tab separated "\t"):
File usersha1-artmbid-artname-plays.tsv:
user-mboxsha1 \t musicbrainz-artist-id \t artist-name \t plays
File usersha1-profile.tsv:
user-mboxsha1 \t gender (m|f|empty) \t age (int|empty) \t country (str|empty) \t signup (date|empty)
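As an illustration of the plays format, the following sketch parses one line of usersha1-artmbid-artname-plays.tsv. PlayRecord and parse_play_line are hypothetical names, and treating an empty MBID field as None is an assumption based on the "Artists without MBID" count above.

```python
from typing import NamedTuple, Optional


class PlayRecord(NamedTuple):
    user: str                   # user-mboxsha1
    artist_mbid: Optional[str]  # musicbrainz-artist-id, None if empty
    artist_name: str
    plays: int


def parse_play_line(line):
    """Parse one tab-separated line of the plays file."""
    # Strip only the line ending so empty trailing fields are preserved,
    # then trim stray spaces around each field.
    user, mbid, name, plays = (
        field.strip() for field in line.rstrip("\r\n").split("\t")
    )
    return PlayRecord(user, mbid or None, name, int(plays))
```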
. Example:
usersha1-artmbid-artname-plays.tsv:
000063d3fe1cf2ba248b9e3c3f0334845a27a6be \t a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432 \t u2 \t 31
...
usersha1-profile.tsv:
000063d3fe1cf2ba248b9e3c3f0334845a27a6be \t m \t 19 \t Mexico \t Apr 28, 2008
...
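A profile line such as the example above can be parsed as sketched below. The date pattern "%b %d, %Y" is inferred from the single example ("Apr 28, 2008") and is an assumption, as are the names Profile and parse_profile_line. Note that %b depends on the locale, so an English or C locale is assumed.

```python
from datetime import date, datetime
from typing import NamedTuple, Optional


class Profile(NamedTuple):
    user: str
    gender: Optional[str]    # "m", "f", or None if empty
    age: Optional[int]
    country: Optional[str]
    signup: Optional[date]


def parse_profile_line(line):
    """Parse one tab-separated line of usersha1-profile.tsv."""
    # Strip only the line ending so empty trailing fields survive the split.
    user, gender, age, country, signup = (
        field.strip() for field in line.rstrip("\r\n").split("\t")
    )
    return Profile(
        user=user,
        gender=gender or None,
        age=int(age) if age else None,
        country=country or None,
        # Assumed date format, taken from the README example.
        signup=datetime.strptime(signup, "%b %d, %Y").date() if signup else None,
    )
```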
. License:
The data contained in lastfm-dataset-360K.tar.gz is distributed with permission of Last.fm.
The data is made available for non-commercial use.
Those interested in using the data or web services in a commercial context should contact:
partners [at] last [dot] fm
For more information, see the Last.fm terms of service.
. Acknowledgements:
Thanks to Last.fm for providing access to this data via their web services.
Special thanks to Norman Casagrande.
. References:
When using this dataset you must reference the Last.fm webpage.
Optionally (not mandatory at all!), you can cite Chapter 3 of this book:
@book{Celma:Springer2010,
author = {Celma, O.},
title = {{Music Recommendation and Discovery in the Long Tail}},
publisher = {Springer},
year = {2010}
}
. Contact:
This data was collected by Òscar Celma @ MTG/UPF during Fall 2008 and cleaned sometime during 2009.