Abstract
Social media analysis is a scientific field that is rapidly gaining ground due to its numerous research challenges and practical applications, as well as the unprecedented availability of data in real time. Several of these applications have significant social and economic impact, in areas such as journalism, crisis management, and advertising. However, two issues regarding these applications have to be confronted. The first is the financial cost. Despite the abundance of information, it typically comes at a premium price, and only a fraction is provided free of charge. For example, Twitter, a predominant online social media service, grants researchers and practitioners free access to only a small proportion (1%) of its publicly available stream. The second issue is the computational cost. Even when the full stream is available, off-the-shelf approaches are unable to operate in such settings due to the real-time computational demands. Consequently, real-world applications as well as research efforts that exploit such information are limited to utilizing only a subset of the available data. In this paper, we are interested in evaluating the extent to which analytical processes are affected by the aforementioned limitation. In particular, we apply a wide range of analysis processes on two subsets of Twitter public data, obtained through the service's sampling APIs. The first is the default 1% sample, whereas the second is the Gardenhose sample that our research group has access to, returning 10% of all public data. We extensively evaluate their relative performance in numerous scenarios.
Original language | English |
---|---|
Title of host publication | Proceedings - 2014 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT 2014 |
Editors | Dominik Slezak, Hung Son Nguyen, Eugene Santos, Marek Reformat |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 103-109 |
Number of pages | 7 |
Volume | 1 |
ISBN (Electronic) | 9781479941438 |
Publication status | Published - 16 Oct 2014 |
Externally published | Yes |
Event | 2014 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT 2014 - Warsaw, Poland. Duration: 11 Aug 2014 → 14 Aug 2014 |
Conference
Conference | 2014 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT 2014 |
---|---|
Country/Territory | Poland |
City | Warsaw |
Period | 11/08/14 → 14/08/14 |