What women like: A gendered analysis of Twitter users' interests based on a Twixonomy

Stilo Giovanni;
2015-01-01

Abstract

In this paper we analyze the distribution of interests, as a function of gender, in a large population of Twitter users: the full set of 40 million users in 2009 and a sample of about 100 thousand New York users in 2014. To model interests, we associate the "topical" friends in users' friendship lists (friends that represent an interest rather than a social relation between peers) with Wikipedia categories. A word-sense disambiguation algorithm selects the appropriate wikipage for each topical friend. Starting from the set of wikipages representing the population's interests, we extract the sub-graph of Wikipedia categories connected to these pages, and we then prune cycles to induce a directed acyclic graph, which we call the Twixonomy. We use a novel method for reducing the computational requirements of cycle detection on very large graphs. For any category at any generalization level in the Twixonomy, it is then possible to estimate the gender distribution of the Twitter users interested in that category. We analyze both the population of "celebrities", i.e. male and female Twitter users with an associated wikipage, and the population of "peers", i.e. male and female users who follow celebrities.
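
The pipeline described in the abstract (category sub-graph, cycle pruning, per-category gender estimate) can be illustrated with a minimal Python sketch. It is not the paper's method: the novel cycle-detection technique is not described in this record, so a plain depth-first search that drops back edges is swapped in, and every name and data value below (`prune_cycles`, `gender_distribution`, the toy pages, categories and users) is hypothetical.

```python
from collections import defaultdict

WHITE, GREY, BLACK = 0, 1, 2  # DFS states: unvisited, on the current path, finished


def prune_cycles(edges):
    """Return the (child, parent) edges that remain after dropping DFS back edges.

    Assumption: a textbook DFS stands in for the paper's own cycle-pruning method.
    """
    graph = defaultdict(list)
    for child, parent in edges:
        graph[child].append(parent)
    colour = defaultdict(int)
    kept = []
    for root in list(graph):
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(graph[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if colour[nxt] == GREY:
                    continue  # back edge: it would close a cycle, so drop it
                kept.append((node, nxt))
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break  # descend into nxt before resuming this node
            else:
                colour[node] = BLACK  # all successors handled
                stack.pop()
    return kept


def gender_distribution(dag_edges, page_followers):
    """Count distinct female/male users per node by pushing followers up the DAG."""
    parents = defaultdict(set)
    for child, parent in dag_edges:
        parents[child].add(parent)
    users_by_node = defaultdict(dict)
    for page, followers in page_followers.items():
        seen, frontier = set(), [page]
        while frontier:
            node = frontier.pop()
            if node in seen:
                continue
            seen.add(node)
            users_by_node[node].update(followers)  # each user counted once per node
            frontier.extend(parents.get(node, ()))
    result = {}
    for node, users in users_by_node.items():
        counts = {"F": 0, "M": 0}
        for gender in users.values():
            counts[gender] += 1
        result[node] = counts
    return result


# Hypothetical toy data: (child, parent) links, including one category cycle.
edges = [
    ("page_tennis", "Tennis players"),
    ("Tennis players", "Sport"),
    ("Sport", "Tennis players"),  # cycle between two categories
    ("page_singer", "Singers"),
    ("Singers", "Music"),
]
# Hypothetical followers of each wikipage, with an assumed gender label per user.
page_followers = {
    "page_tennis": {"u1": "F", "u2": "M"},
    "page_singer": {"u1": "F", "u3": "F"},
}

dag = prune_cycles(edges)
dist = gender_distribution(dag, page_followers)
print(dist["Sport"], dist["Music"])
```

Storing users in a dictionary keyed by user id means a user who follows several pages under the same category is counted only once there, which is one plausible reading of "users interested in that category".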
ISBN: 9781577357377

Use this identifier to cite or link to this document: https://hdl.handle.net/11697/133269