Twitter Dataset

This dataset is a collection of scraped public Twitter updates, gathered for an academic project studying the geolocation data associated with tweeting. It provides the training set and the test set used in the paper “You Are Where You Tweet: A Content-Based Approach to Geo-locating Twitter Users” (CIKM 2010). The training set contains 115,886 Twitter users and 3,844,612 updates from those users. All user locations in the training set are self-reported, located in the United States, and given at city-level granularity. The test set contains 5,136 Twitter users and 5,156,047 tweets from those users. All user locations in the test set were uploaded from the users' smartphones in the form “UT: Latitude,Longitude”.
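
The “UT: Latitude,Longitude” form used for test-set locations can be turned into numeric coordinates with a small parser. The following is a minimal Python sketch, not part of the dataset distribution: the function name `parse_ut_location`, the tolerance for extra whitespace, and the range check are all assumptions, and the sample coordinates are purely illustrative.

```python
import re
from typing import Optional, Tuple

# Assumed shape of the field: the literal prefix "UT:" followed by two
# signed decimal numbers separated by a comma. Optional whitespace is
# tolerated here, which may be more lenient than the actual data requires.
_UT_PATTERN = re.compile(
    r"^\s*UT:\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*$"
)


def parse_ut_location(field: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for a "UT: lat,lon" string, else None.

    Hypothetical helper for illustration only.
    """
    match = _UT_PATTERN.match(field)
    if match is None:
        return None  # not in the "UT: Latitude,Longitude" form
    lat, lon = float(match.group(1)), float(match.group(2))
    # Reject values outside the valid geographic ranges.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return lat, lon


if __name__ == "__main__":
    # Illustrative values, not taken from the dataset.
    print(parse_ut_location("UT: 30.627977,-96.334407"))
    print(parse_ut_location("College Station, TX"))
```

Returning `None` for anything that does not match keeps free-text location strings, such as the self-reported city names in the training set, from being mistaken for coordinates.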

Publication Date:
2010
Creators:
Cheng, Zhiyuan; Caverlee, James; Lee, Kyumin
Publisher:
Proceedings of the 19th ACM Conference on Information and Knowledge Management (CIKM), 2010
Companies:
Texas A&M University; Texas A&M University; Texas A&M University
Size:
30.0 kB
Custom License:
Creative Commons Attribution Noncommercial 3.0 United States License
Data Category:
Social Media and Online Reviews
Countries:
worldwide
