Pablo Barberá

Computational Political Scientist




My academic research develops new methods for network and text analysis that improve our understanding of how exposure to political information through social media sites affects political behavior. My main methodological contribution is a set of tools to collect and analyze social media data, which give researchers the ability to measure key variables of interest in Political Science, such as ideology, issue salience, and the structure of communication networks. My work applies these methods to my main substantive interest: how the use of social networking platforms affects different aspects of democratic politics, such as political polarization and civic health.


Who Leads? Who Follows? Measuring Issue Attention and Agenda Setting by Legislators and the Mass Public Using Social Media Data.

American Political Science Review, forthcoming.
Co-authored with Andreu Casas, Jonathan Nagler, Patrick J. Egan, Richard Bonneau, John T. Jost, and Joshua Tucker

Link | Supplementary Materials | Topic visualization demo | Replication code

Are legislators responsive to the priorities of the public? Research demonstrates a strong correspondence between the issues about which the public cares and the issues addressed by politicians, but conclusive evidence about who leads whom in setting the political agenda has yet to be uncovered. We answer this question with fine-grained temporal analyses of Twitter messages by legislators and the public during the 113th US Congress. After employing an unsupervised method that classifies tweets sent by legislators and citizens into topics, we use vector autoregression models to explore whose priorities more strongly predict the relationship between citizens and politicians. We find that legislators are more likely to follow, than to lead, discussion of public issues, results that hold even after controlling for the agenda-setting effects of the media. We also find, however, that legislators are more likely to be responsive to their supporters than to the general public.

How to Use Social Media Data for Political Science Research.

Co-authored with Zachary C. Steinert-Threlkeld.
Forthcoming in Curini, L., and Franzese, R. (eds) The SAGE Handbook of Research Methods in Political Science and International Relations, London: Sage, 2019

Preprint

Citizens across the globe spend an increasing proportion of their daily lives on social media platforms. Their activities on the sites generate granular, time-stamped footprints of human behavior and personal interactions. This chapter offers an overview of existing research that uses social media data in the fields of Political Science and International Relations. We discuss two types of studies: those where social media is being used as a source of data to study, e.g. political networks or public opinion, and those focusing on how social media transforms different political phenomena, ranging from ideological polarization and misinformation to election campaigns. Our review also offers an in-depth analysis of the main challenges of this type of data, such as the different sources of bias that limit the generalizability of findings, the difficulty of connecting online and offline behavior, and concerns about reproducibility and ethics. To illustrate the opportunities and limitations of social media data, we also provide an applied example using Twitter as event data to study the dynamics of protest movements in Egypt and Bahrain in 2011.

The Consequences of Exposure to Disinformation and Propaganda in Online Settings

Chapter in joint report sponsored by the Hewlett Foundation and edited by Joshua Tucker, 2018.


The Hewlett Foundation commissioned this report to provide an overview of the current state of the literature on the relationship between social media; political polarization; and political "disinformation," a term used to encompass a wide range of types of information about politics found online, including "fake news," rumors, deliberately factually incorrect information, inadvertently factually incorrect information, politically slanted information, and "hyperpartisan" news. The full list of authors is: Joshua A. Tucker, Andrew Guess, Pablo Barberá, Cristian Vaccari, Alexandra Siegel, Sergey Sanovich, Denis Stukal, and Brendan Nyhan.

How Social Media Facilitates Political Protest: Information, Motivation, and Social Networks

Political Psychology, 2018.
Co-authored with John T. Jost, Richard Bonneau, Melanie Langer, Megan Metzger, Jonathan Nagler, Joanna Sterling, and Joshua A. Tucker


It is often claimed that social media platforms such as Facebook and Twitter are profoundly shaping political participation, especially when it comes to protest behavior. Whether or not this is the case, the analysis of “Big Data” generated by social media usage offers unprecedented opportunities to observe complex, dynamic effects associated with large-scale collective action and social movements. In this article, we summarize evidence from studies of protest movements in the United States, Spain, Turkey, and Ukraine demonstrating that: (1) Social media platforms facilitate the exchange of information that is vital to the coordination of protest activities, such as news about transportation, turnout, police presence, violence, medical services, and legal support; (2) in addition, social media platforms facilitate the exchange of emotional and motivational contents in support of and opposition to protest activity, including messages emphasizing anger, social identification, group efficacy, and concerns about fairness, justice, and deprivation as well as explicitly ideological themes; and (3) structural characteristics of online social networks, which may differ as a function of political ideology, have important implications for information exposure and the success or failure of organizational efforts. Next, we issue a brief call for future research on a topic that is understudied but fundamental to appreciating the role of social media in facilitating political participation, namely friendship. In closing, we liken the situation confronted by researchers who are harvesting vast quantities of social media data to that of systems biologists in the early days of genome sequencing.

From Liberation to Turmoil: Social Media and Democracy

Journal of Democracy, 2017.
Co-authored with Joshua Tucker, Yannis Theocharis, and Margaret Roberts.


How can one technology—social media—simultaneously give rise to hopes for liberation in authoritarian regimes, be used for repression by these same regimes, and be harnessed by antisystem actors in democracy? We present a simple framework for reconciling these contradictory developments based on two propositions: 1) that social media give voice to those previously excluded from political discussion by traditional media, and 2) that although social media democratize access to information, the platforms themselves are neither inherently democratic nor nondemocratic, but represent a tool political actors can use for a variety of goals, including, paradoxically, illiberal goals.

The New Public Address System: Why Do World Leaders Adopt Social Media?

International Studies Quarterly, 2017.
Co-authored with Thomas Zeitzoff.

Link | Preprint | Online Appendix | Replication materials

The emergence of social media, and in particular Twitter and Facebook, has led scholars to focus on its effects on mass behavior and protest. Yet an important and unanswered question is what explains the variation in the adoption and use of social media by world leaders. By the end of 2014, over 76% of world leaders had an active presence on social media, and used their accounts to communicate with domestic and international audiences. We look at several potential hypotheses that explain the adoption of social media by world leaders, including modernization, social pressure, level of democratization, and diffusion. We find strong evidence that increased political pressure from social unrest and higher levels of democratization are both associated with leader adoption of social media platforms. Although the association we identify is not causal, these findings reveal the relationship between institutional and political pressures and the political communication of country leaders.

A Bad Workman Blames His Tweets. The Consequences of Citizens' Uncivil Twitter Use when Interacting with Party Candidates

Journal of Communication, 2016.
Co-authored with Yannis Theocharis, Zoltán Fazekas, Sebastian Adrian Popa, and Olivier Parnet.

Link | Online appendix | Replication materials

Existing studies focusing on politicians' adoption of Twitter have found that they use it primarily as a broadcasting tool. We argue that citizens' impolite and/or uncivil behavior is one possible explanation for such decisions. Social media conversations are rife with harassment and politicians are a prime target. This alters the incentive structure of engaging in dialogue on social media. We use Spanish, Greek, German, and U.K. candidates' tweets sent during the run-up to the recent European Parliament elections, and rely on automated text analysis and machine learning methods to measure their level of civility. Our contribution is an actor-oriented theory of political dialogue that incorporates Twitter's specific affordances, clarifying how and why Twitter's democratic promise may be limited.

Is the Left-Right Scale a Valid Measure of Ideology? Individual-Level Variation in Associations with 'Left' and 'Right' and Left-Right Self-Placement

Political Behavior, 2016.
Co-authored with Paul Bauer, Kathrin Ackermann and Aaron Venetz.

Link | Preprint | Replication code

In order to measure ideology, political scientists heavily rely on the so-called left-right scale. Left and right are, however, abstract political concepts and may trigger different associations among respondents. If these associations vary systematically with other variables this may induce bias in the empirical study of ideology. We illustrate this problem using a unique survey that asked respondents open-ended questions regarding the meanings they attribute to the concepts "left" and "right". We assess and categorize this textual data using topic modeling techniques. Our analysis shows that variation in respondents' associations is systematically related to their self-placement on the left-right scale and also to variables such as education and respondents' cultural background (East vs. West Germany). Our findings indicate that the interpersonal comparability of the left-right scale across individuals is impaired. More generally, our study suggests that we need more research on how respondents interpret various abstract concepts that we regularly use in survey questions.

Of echo chambers and contrarian clubs: Exposure to political disagreement among German and Italian users of Twitter

Social Media + Society, 2016, 2 (3).
Co-authored with Cristian Vaccari, Augusto Valeriani, Richard Bonneau, John T. Jost, Jonathan Nagler, and Joshua Tucker.

Preprint

Scholars have debated whether social media platforms, by allowing users to select the information they are exposed to, may lead people to isolate themselves from viewpoints they disagree with, thereby serving as political “echo chambers.” We investigate hypotheses concerning the circumstances under which Twitter users who communicate about elections would engage with (a) supportive, (b) oppositional, and (c) mixed political networks. Based on online surveys of representative samples of Italian and German individuals who posted at least one Twitter message about elections in 2013, we find substantial differences in the extent to which social media facilitates exposure to similar vs. dissimilar political views. Our results suggest that exposure to supportive, oppositional, or mixed political networks on social media can be explained by broader patterns of political conversation (i.e. structure of offline networks) and specific habits in the political use of social media (i.e. the intensity of political discussion). These findings suggest that disagreement persists on social media even when ideological homophily is the modal outcome, and that scholars should pay more attention to specific situational and dispositional factors when evaluating the implications of social media for political communication.

Big data, social media, and protest: foundations for a research agenda

Chapter in "Computational Social Science", edited by Michael Alvarez, Cambridge University Press, 2016.
Co-authored with Joshua Tucker, Jonathan Nagler, Megan Metzger, Duncan Penfold-Brown, and Richard Bonneau.


The Critical Periphery in the Growth of Social Protests

PLOS ONE, 2015, 10 (11).
Co-authored with Ning Wang, Richard Bonneau, John T. Jost, Jonathan Nagler, Joshua Tucker and Sandra González-Bailón

Link | Online appendix | Replication data

Social media have provided instrumental means of communication in many recent political protests. The efficiency of online networks in disseminating timely information has been praised by many commentators; at the same time, users are often derided as “slacktivists” because of the shallow commitment involved in clicking a forwarding button. Here we consider the role of these peripheral online participants, the immense majority of users who surround the small epicenter of protests, representing layers of diminishing online activity around the committed minority. We analyze three datasets tracking protest communication in different languages and political contexts through the social media platform Twitter and employ a network decomposition technique to examine their hierarchical structure. We provide consistent evidence that peripheral participants are critical in increasing the reach of protest messages and generating online content at levels that are comparable to core participants. Although committed minorities may constitute the heart of protest movements, our results suggest that their success in maximizing the number of online citizens exposed to protest messages depends, at least in part, on activating the critical periphery. Peripheral users are less active on a per capita basis, but their power lies in their numbers: their aggregate contribution to the spread of protest messages is comparable in magnitude to that of core participants. An analysis of two other datasets unrelated to mass protests strengthens our interpretation that core-periphery dynamics are characteristically important in the context of collective action events. Theoretical models of diffusion in social networks would benefit from increased attention to the role of peripheral nodes in the propagation of information and behavior.

Tweeting from Left to Right: Is Online Political Communication More Than an Echo Chamber?

Psychological Science, 2015, 26 (10), 1531-1542.
Co-authored with John T. Jost, Jonathan Nagler, Joshua Tucker, and Richard Bonneau.

Link | Online appendix | Replication materials and data

We estimated ideological preferences of 3.8 million Twitter users and, using a dataset of 150 million tweets concerning 12 political and non-political issues, explored whether online communication resembles an "echo chamber" due to selective exposure and ideological segregation or a "national conversation." We observed that information was exchanged primarily among individuals with similar ideological preferences for political issues (e.g., presidential election, government shutdown) but not for many other current events (e.g., Boston marathon bombing, Super Bowl). Discussion of the Newtown shootings in 2012 reflected a dynamic process, beginning as a "national conversation" before being transformed into a polarized exchange. With respect to political and non-political issues, liberals were more likely than conservatives to engage in cross-ideological dissemination, highlighting an important asymmetry with respect to the structure of communication that is consistent with psychological theory and research. We conclude that previous work may have overestimated the degree of ideological segregation in social media usage.

Birds of the Same Feather Tweet Together. Bayesian Ideal Point Estimation Using Twitter Data.

Political Analysis, 2015, 23 (1), 76-91

Link | Pre-print | Online appendix | Replication materials | GitHub tutorial

Politicians and citizens increasingly engage in political conversations on social media outlets such as Twitter. In this paper I show that the structure of the social networks in which they are embedded can be a source of information about their ideological positions. Under the assumption that social networks are homophilic, I develop a Bayesian Spatial Following model that considers ideology as a latent variable, whose value can be inferred by examining which political actors each user is following. This method allows us to estimate ideology for more actors than any existing alternative, at any point in time and across many polities. I apply this method to estimate ideal points for a large sample of both elite and mass public Twitter users in the US and five European countries. The estimated positions of legislators and political parties replicate conventional measures of ideology. The method is also able to successfully classify individuals who state their political preferences publicly and a sample of users matched with their party registration records. To illustrate the potential contribution of these estimates, I examine the extent to which online behavior during the 2012 US presidential election campaign is clustered along ideological lines.

Political Expression and Action on Social Media: Exploring the Relationship Between Lower- and Higher-Threshold Political Activities Among Twitter Users in Italy

Journal of Computer-Mediated Communication, 2015.
Co-authored with Cristian Vaccari, Augusto Valeriani, Richard Bonneau, John T. Jost, Jonathan Nagler, and Joshua Tucker.


Scholars and commentators have debated whether lower-threshold forms of political engagement on social media should be treated as being conducive to higher-threshold modes of political participation or a diversion from them. Drawing on an original survey of a representative sample of Italians who discussed the 2013 election on Twitter, we demonstrate that the more respondents acquire political information via social media and express themselves politically on these platforms, the more they are likely to contact politicians via e-mail, campaign for parties and candidates using social media, and attend offline events to which they were invited online. These results suggest that lower-threshold forms of political engagement on social media do not distract from higher-threshold activities, but are strongly associated with them.

Understanding the political representativeness of Twitter users.

Social Science Computer Review, 2015, 33 (6), 712-729.
Co-authored with Gonzalo Rivero.

Link | Pre-print

In this article we analyze the structure and content of the political conversations that took place through the micro-blogging platform Twitter in the context of the 2011 Spanish legislative elections and the 2012 US presidential elections. Using a unique database of nearly 70 million tweets collected during both election campaigns, we find that Twitter replicates most of the existing inequalities in public political exchanges. Twitter users who write about politics tend to be male, to live in urban areas, and to have extreme ideological preferences. Our results have important implications for future research on the relationship between social media and politics, since they highlight the need to correct for potential biases derived from these sources of inequality.

Rooting out corruption or rooting for corruption? The Heterogeneous Electoral Consequences of Scandals

Political Science Research and Methods, 2016, 4 (2), 379-397.
Co-authored with Pablo Fernández-Vázquez and Gonzalo Rivero.

Link | Pre-print | Replication materials

Corruption scandals have been found to have significant but mild electoral effects in the comparative literature (Golden, 2006). However, most studies have assumed that voters punish all kinds of illegal practices. This article challenges this assumption by distinguishing between two types of corruption, according to the type of welfare consequences they have for the constituency. This hypothesis is tested using data from the 2011 Spanish local elections. We exploit the abundance of corruption allegations associated with the Spanish housing boom, which generated income gains for a wide segment of the electorate in the short term. We find that voters ignore corruption when there are side benefits to it, and that punishment is only administered in those cases in which they do not receive compensation.

Social Media and Political Communication: A survey of Twitter users during the 2013 Italian general election

Italian Political Science Review, 2013.
Co-authored with Cristian Vaccari, Augusto Valeriani, Richard Bonneau, John T. Jost, Jonathan Nagler, and Joshua Tucker.


Social media have become increasingly relevant in election campaigns, as both politicians and citizens have integrated them into their communication repertoires. However, little is known about which types of citizens employ these tools to discuss politics and stay informed about current affairs and how they integrate the contents and connections they encounter online with their offline repertoires of political action. In order to address these questions, we devised an innovative online survey involving a random sample representative of Italians who communicated about the 2013 general election on Twitter. Our results show that Twitter political users in Italy are disproportionately male, younger, better educated, more interested in politics, and ideologically more left-wing than the population as a whole. Moreover, there is a strong correlation between online and offline political communication, and Twitter users often relay the political contents they encounter on the web in their face-to-face conversations. Although the political users of social media are not representative of the population, their greater propensity to engage in political conversations both online and offline makes them important channels of personal communication and allows the contents that circulate on the web to diffuse among populations that are much broader than those that engage with social media. The electoral significance of these digital platforms thus reaches well beyond the immediate audiences that are exposed to political contents through them.

The electoral consequences of corruption scandals in Spain

Crime, Law and Social Change, 2013.
Co-authored with Pedro Riera, Raúl Gómez, Juan Antonio Mayoral, and José Ramón Montero.


Previous studies of the electoral consequences of corruption in Spanish local elections (Jiménez Revista de Investigaciones Políticas y Sociológicas, 6(2):43–76, 2007; Fernández-Vázquez and Rivero 2011, Consecuencias electorales de la corrupción, 2003–2007. Estudios de Progreso, Fundación Alternativas; Costas et al. European Journal of Political Economy: 28(4):469-484, 2012) have found that voters do not necessarily punish corrupt mayors. As has been pointed out in the comparative literature, the average loss of electoral support by corrupt incumbents is small and does not prevent their reelection most of the time (Jiménez and Caínzos 2006, How far and why do corruption scandals cost votes? In Garrard, J. and Newell, J. (eds.) Scandals in past and contemporary politics. Manchester: Manchester University Press). What remains unsolved, however, is the remarkable variability in this pattern. This article explores some of the micro-level variables that may mediate the effect of corruption scandals on votes. We focus on three factors: ideological closeness to the incumbent party, political sophistication, and employment status. Our results provide only partial support for our hypotheses, suggesting that the effects of corruption are much more complex than they may seem at first sight.

Los electores ante la corrupción

Informe sobre la democracia en España 2012, Fundación Alternativas, 2012.
Co-authored with Pablo Fernández-Vázquez.


Voting for Parties or for Candidates? The Trade-Off Between Party and Personal Representation in Spanish Regional and Local Elections

Revista Española de Investigaciones Sociológicas, 2010.


When voters cast their ballot, are they choosing a candidate or a party? Electoral systems have a significant impact on how this question is answered in each country. As previous literature has shown, some electoral rules foster a more personal representation, while others strengthen the intermediary role of parties. In this paper I maintain that there exists a trade-off between these two types of representation. To empirically verify its existence and how it works, I have chosen local and regional elections in Spain as a case study. Given that they take place simultaneously under similar electoral systems, they can be considered a natural experiment for the study of this trade-off, which allows me to overcome the potential problems of endogeneity present in previous studies. By measuring the significance of ideological closeness and candidate evaluations in voters’ decisions at each level, it is shown that the importance of personal representation increases in local elections at the expense of a less frequent use of ideological proximity as an informational shortcut, thus confirming the existence of the trade-off.

Naturaleza e influencia de los think tanks en el proceso político en España

Working Papers 292, Institut de Ciències Polítiques i Socials, 2010.
Co-authored with Javier Arregui.


Anàlisi de la realitat socioestructural de les Comunitats Autònomes

Chapter in Gallego, R. and Subirats, J. (eds.) Autonomies i desigualtats a Espanya: Percepcions, evolució social i polítiques de benestar. Institut d'Estudis Autonòmics, 2010.
Co-authored with Clara Riba.


Work in progress

Less is More? How Demographic Sample Weights can Improve Public Opinion Estimates Based on Twitter Data.

Working paper, April 2016

An important limitation in previous studies of political behavior using Twitter data is the lack of information about the sociodemographic characteristics of individual users. This paper addresses this challenge by developing new machine learning methods that will allow researchers to estimate the age, gender, race, party affiliation, propensity to vote, and income of any Twitter user in the U.S. with high accuracy. The training dataset for these classifiers was obtained by matching a massive dataset of 1 billion geolocated Twitter messages with voting registration records and estimates of home values across 15 different states, resulting in a sample of nearly 250,000 Twitter users whose sociodemographic traits are known. I illustrate the value of these new methods with two applications. First, I explore how attention to different candidates in the 2016 presidential primary election varies across demographic groups within a panel of randomly selected Twitter users. I argue that these covariates can be used to adjust estimates of sentiment towards political actors based on Twitter data, and provide a proof of concept using presidential approval. Second, I examine whether social media can reduce inequalities in potential exposure to political messages. In particular, I show that retweets (a proxy for inadvertent exposure) have a large equalizing effect in access to information.

How Social Media Reduces Mass Political Polarization. Evidence from Germany, Spain, and the United States

Working paper, August 2015

A growing proportion of citizens rely on social media to gather political information and to engage in political discussions within their personal networks. Existing studies argue that social media create “echo-chambers,” where individuals are primarily exposed to like-minded views. However, this literature has ignored that social media platforms facilitate exposure to messages from those with whom individuals have weak ties, which are more likely to provide novel information to which individuals would not be exposed otherwise through offline interactions. Because weak ties tend to be with people who are more politically heterogeneous than citizens' immediate personal networks, this exposure reduces political extremism. To test this hypothesis, I develop a new method to estimate dynamic ideal points for social media users. I apply this method to measure the ideological positions of millions of individuals in Germany, Spain, and the United States over time, as well as the ideological composition of their personal networks. Results from this panel design show that most social media users are embedded in ideologically diverse networks, and that exposure to political diversity has a positive effect on political moderation. This result is robust to the inclusion of covariates measuring offline political behavior, obtained by matching Twitter user profiles with publicly available voter files in several U.S. states. I also provide evidence from survey data in these three countries that bolsters these findings. Contrary to conventional wisdom, my analysis provides evidence that social media usage reduces mass political polarization.
Media coverage: New York Times, Wall Street Journal, Nieman Lab, Wired UK, Slate FR, Le Monde

Prospects of Ideological Realignment(s) in the 2014 EP elections? Analyzing the Common Multidimensional Political Space for Voters, Parties, and Legislators in Europe

Co-authored with Sebastian Adrian Popa and Hermann Schmitt

Working paper, April 2015

Given the current economic and political crisis in Europe, many argue that the 2014 EP elections shifted the electoral competition from national politics to a debate about the extent and scope of the EU level of governance. We contribute to this discussion by analyzing the European ideological space at the time of the elections. We build upon existing scaling techniques applied to social media networks and develop a new method to measure the positions of political parties and individual legislators in a multidimensional political space. We apply this method to estimate the ideological positions of candidates to the European Parliament and the sitting MPs in all 28 EU states, relying on a new dataset of social media accounts. To validate our estimates, we compare them with the aggregate perceptions of parties’ positions from the Voter Study of the European Election Study 2014. Our final goal is to analyze to what extent the 2014 EP elections brought the expected changes. We achieve this by establishing the relative importance of the left-right and European integration dimensions in each country. We also examine if and why the position of parties and candidates in EP elections differs from the position of parties and legislators in national parliaments.

Automated Text Classification of News Articles: A Practical Guide.

Co-authored with Amber Boydstun, Suzanne Linn, and Jonathan Nagler

Working paper, August 2019

Automated text analysis methods have made possible the classification of large corpora of text by measures such as topic and tone. Here, we provide a guide to help researchers navigate the consequential decisions they need to make before any measure can be produced from the text. We consider, both theoretically and empirically, the effects of such choices using as a running example efforts to measure the tone of New York Times coverage of the economy. We show that two reasonable approaches to corpus selection yield radically different corpora and we advocate for the use of keyword searches rather than pre-defined subject categories provided by news archives. We demonstrate the benefits of coding using article-segments instead of sentences as units of analysis. We show that, given a fixed number of codings, it is better to increase the number of unique documents coded rather than the number of coders for each document. Finally, we find that supervised machine learning algorithms outperform dictionaries on a number of criteria. Overall, we intend this guide to serve as a reminder to analysts that thoughtfulness and human validation are key to text-as-data methods, particularly in an age when it is all-too-easy to computationally classify texts without attending to the methodological choices therein.