Activity and user engagement in social media such as web logs,
wikis, online forums, and social networks have been increasing at
unprecedented rates. As with social behavior in various
human activities, user activity in social media indicates the
existence of individuals that consistently drive or stimulate
'discussions' in the online world. Such individuals are
considered 'starters' of online discussions, in contrast with
'followers', who primarily engage in discussions and follow
them.
In this paper, we formalize the notions of 'starters' and
'followers' in social media. Motivated by the challenging size
of the available information related to online social behavior,
we focus on developing random sampling approaches that
achieve significant efficiency in identifying
starters and followers. In our experimental section, we use
BlogScope, our social media warehousing platform under
development at the University of Toronto. We demonstrate the
scalability and accuracy of our sampling approaches using real
data, establishing the practical utility of our techniques in a
real social media warehousing environment.