Research on social media and journalism (2003-2017): a bibliometric and content review Pesquisa em mídia social e jornalismo (2003-2017) : uma análise bibliométrica e de conteúdo

This paper introduces a bibliometric review of the scientific literature on social media and journalism published by journals indexed by Journal Citation Reports until 2017 (n=213). Besides descriptive measurements, it provides a co-citation and co-word analysis. A quantitative content analysis complements the bibliometric approach. Thus, the paper offers a conceptual and structural analysis of the field of study. Results show that the number of articles on the topic has grown steadily since 2014. The United States, Australia and England stand out as the most productive countries. Studies are based mostly on data from Europe and North America. Three conceptual clusters are identified: audience participation, user-generated content and the influence of social media on journalistic professional values and practices. Most of the studies did not consider specific services but focused on the general concept of “social media”. Twitter was the most analyzed platform until recent years, when scholarly attention shifted towards Facebook. Research has favored political information to the detriment of other branches of journalism. The most frequently employed methods are content analysis and in-depth interviews. Further use of surveys and social network analysis, as well as a stronger focus on visual studies, is suggested.


Introduction
Social media have profoundly altered society and, specifically, the way citizens consume information and build their opinions (Schmidt et al., 2017). This change has deeply affected journalism and how news is constructed, distributed and financed (Nielsen; Ganter, 2018). Thus, journalism has become one of the most relevant topics in contemporary research on social media (Liu; Ho; Lu, 2017). Yet, scholarly production in this regard has not been approached through bibliometric analysis. This quantitative analysis of the bibliographic features of scientific documents (Díaz-Campo, 2016) makes it possible to trace the evolution of a research field and to detect future research fronts (Santa Soriano; Lorenzo-Álvarez; Torres-Valdés, 2018). Similar reviews have been carried out for the whole scientific production on social media; yet research on journalism and social media has not been systematically reviewed through bibliometric methods.
This article offers the first bibliometric analysis of research on social media and journalism. It expands the analysis of descriptive features such as number of documents and citations at the individual or national level (Gómez Crisóstomo; Romo-Fernández; Caldera-Serrano, 2017) with a co-citation and a co-word analysis to identify topics and emerging tendencies in the field (Galvez, 2018). This bibliometric approach is also complemented by a quantitative content analysis, which offers an additional and deeper scrutiny.
This study aims to provide a conceptual and structural review of social media and journalism as a field of study. To achieve this goal it proposes the following research questions:
(A) RQ1. What trends and features characterize research on social media and journalism? RQ1.1. How has the number of articles published on the topic evolved? RQ1.2. What are the most important sources (journals)? RQ1.3. Who are the most productive researchers on this issue?
(B) RQ2. What are the theoretical and intellectual foundations of research on social media and journalism? RQ2.1. What are the most cited works? RQ2.2. Who are the most cited authors? RQ2.3. What is the intellectual structure of the field according to co-citation analysis?
(C) RQ3. What is the topical structure of the field? RQ3.1. What are the most researched topics in the subject? RQ3.2. What specific platforms and services have been most surveyed? RQ3.3. What methods are most commonly used? RQ3.4. What are the most researched geographical scenarios?
Once these questions have been answered, the paper also aims to outline future lines of research on the subject.

Methodological Procedures
The retrieval was limited to articles published in journals indexed by Journal Citation Reports, so as to include only the highest-quality research available in Clarivate's Web of Science. Documents were retrieved using the keywords employed in other works (Van Ochs, 2014; Segado; Grandío; Fernández-Gómez, 2015): "social media" OR "social medium" OR "social networking site" OR "social network site" OR "online social network" OR "social network sites" OR "social networking sites" OR "Facebook" OR "Twitter" OR "online social networks" OR "Myspace", combined with the specific topic keyword "journalism".
The time span considered starts in 2003, the year in which the first massively successful social networking site (Myspace) was created, and ends in 2017, the last complete year available in the Web of Science database when this study was carried out (July 2018).
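The search string described above can be assembled programmatically, which makes it easier to reproduce or extend. The following is only a sketch: the keyword list comes from the text, but the `TS=` topic field tag and the helper name `build_query` are assumptions for illustration, since the text does not state which search field was used.

```python
# Keywords as listed in the text.
SOCIAL_MEDIA_TERMS = [
    "social media", "social medium", "social networking site",
    "social network site", "online social network", "social network sites",
    "social networking sites", "Facebook", "Twitter",
    "online social networks", "Myspace",
]
TOPIC_TERM = "journalism"


def build_query(terms, topic, field="TS"):
    """Join the platform terms with OR and combine them with the topic term.

    The field tag (default "TS", a topic search) is an assumption; the
    source does not say which search field was queried.
    """
    quoted = " OR ".join(f'"{term}"' for term in terms)
    return f'{field}=({quoted}) AND {field}=("{topic}")'


query = build_query(SOCIAL_MEDIA_TERMS, TOPIC_TERM)
print(query)
```

Keeping the terms in a list makes it straightforward to rerun the retrieval with additional platform names.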
A first collection of 264 documents was obtained. In a following step, the abstracts of those articles were screened. Documents which did not deal directly with both topics of the field of study (journalism and social media) were discarded. The final sample was reduced to 213 items. Additionally, the articles were submitted to a manual content analysis which considered the following variables:
(1) Methods: this variable included the following categories: quantitative content analysis, in-depth interviews, surveys, focus groups, experiments, observation, case studies, qualitative content analysis, and bibliometrics and reviews. After the pre-test, two additional categories were considered: social network analysis and use of social media metrics. Methods such as digital ethnography or interventions, which did not reach a minimum of five papers, were included in the broader category "Other". Works with no methods, such as essays and theoretical reflections, were not assigned to any of the previous categories.
(2) Geographical scenario: this variable refers to the cultural and geographical provenance of the object studied and of the gathered data. For example, if an article addressed the sourcing practices of Belgian journalists when reporting Arab Spring events, the scenario was considered to be Belgium, not Tunisia or Egypt. This variable considered all the countries and nationalities specifically mentioned in the methods section of each paper. Data were recoded according to geographical areas: Europe, North America (United States and Canada), Latin America (including Central and South America, as well as Mexico), Africa, Oceania and Asia. An additional category, "global", gathered studies whose sample or approach was not limited by geographical criteria.
(3) Platform: the specific social media platform analyzed by each study was also recorded. This variable includes the categories "Twitter", "Facebook", "Blogs", "Instagram" and "None", the latter for papers which did not consider a specific platform at all. A final category ("Other") included different sites such as particular political forums or locally and regionally oriented services such as the Chinese platform Weibo.
(4) Topic: categories were identified following the International Communication Association divisions and interest groups and the main topics in the Communication field (Montero-Díaz et al., 2018). Some categories, such as "Journalism Studies", "Communication and Technology" or "Mass Communication", were excluded because they were considered too generic. An additional category ("Professionalism") was added to cover issues related to journalistic practices, roles, values and education, which were not covered by the initial categories.
All four variables used non-exclusive categories, as one article might use more than one method or deal simultaneously with several platforms, scenarios or topics. Abstracts were read to identify the information listed above. If any of the required fields was not included in the abstract, the full text was retrieved and scanned for it.
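The coding scheme above can be illustrated with a small data structure. This is a sketch under stated assumptions: the three coded articles are hypothetical, the country-to-region table is a short illustrative subset, and the helper names `recode_regions` and `tally` are not from the study. It shows how non-exclusive categories let one article count towards several categories, and how countries are recoded into the regions described.

```python
from collections import Counter

# Illustrative subset of the country-to-region recoding described in the text.
REGION_OF = {
    "Belgium": "Europe", "Spain": "Europe",
    "United States": "North America", "Canada": "North America",
    "Mexico": "Latin America", "Brazil": "Latin America",
    "China": "Asia", "Australia": "Oceania", "Kenya": "Africa",
}


def recode_regions(countries):
    """Map the countries named in a methods section to geographical areas.

    An empty list stands for a study with no geographical restriction,
    which is coded as "global".
    """
    if not countries:
        return {"global"}
    return {REGION_OF[country] for country in countries}


# Hypothetical coded articles; sets allow several categories per article.
articles = [
    {"methods": {"quantitative content analysis", "in-depth interviews"},
     "countries": ["Belgium"], "platforms": {"Twitter"}},
    {"methods": {"surveys"},
     "countries": ["United States", "Canada"],
     "platforms": {"Twitter", "Facebook"}},
    {"methods": set(),  # an essay: no method category assigned
     "countries": [], "platforms": {"None"}},
]


def tally(coded_articles, variable):
    """Count how many articles fall into each category of a variable."""
    counts = Counter()
    for article in coded_articles:
        counts.update(article[variable])
    return counts


platform_counts = tally(articles, "platforms")
region_counts = Counter()
for article in articles:
    region_counts.update(recode_regions(article["countries"]))
```

Because categories are non-exclusive, the category totals for a variable can exceed the number of articles, which matters when reading frequency tables built this way.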

Results
Even though social media began to achieve popular success in 2003, the first papers on social media and journalism did not appear until 2009. Despite this late start, the field has experienced an annual growth rate of 52%, rising from 2 papers in 2009 to 57 articles in 2017. The journals publishing the most articles include [...] (14), Journalism (13) and New Media & Society (10).
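The growth figure reported above is consistent with a compound annual growth rate computed from the first and last yearly counts. The text does not state which formula was used, so the calculation below is an assumption offered as a check, and the function name is illustrative.

```python
def compound_annual_growth_rate(first_count, last_count, first_year, last_year):
    """Return the constant yearly growth rate linking two yearly counts."""
    periods = last_year - first_year
    return (last_count / first_count) ** (1 / periods) - 1


# Counts reported in the text: 2 papers in 2009, 57 articles in 2017.
rate = compound_annual_growth_rate(2, 57, 2009, 2017)
print(f"{rate:.0%}")  # prints 52%
```

This measure depends only on the two end points, so it says nothing about how evenly the growth was spread across the intervening years.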
The most productive researcher is A. O. Larsson (6 articles), followed by A. E. Holton and L. Molyneux (4 articles each) and P. Bakker, M. Coddington, P. English, A. Hermida and S. C. Lewis (3 articles each). According to the affiliation of each corresponding author, production is dominated by scholars from the United States (72 articles), Australia (17), England (16) and Spain (13).
Lasorsa, Lewis and Holton (2012) is the most cited paper in the collected sample and the second most cited work by the analyzed production (Table 1). Its influence is not limited to papers on social media and journalism; it is also widely cited in other fields.
Co-citation network analysis (Figure 1) shows three clusters built respectively around Hermida (2010), Lasorsa, Lewis and Holton (2012) and Domingo et al. (2008). Each node represents a document whose size is proportional to the number of citations it has received.
The first cluster, represented in dark blue, includes concepts related to audience participation or "participatory journalism" (Domingo et al., 2008) such as "Citizen journalism" (Gillmor, 2006) or "produsage" (Bruns, 2008). This cluster also incorporates sociological concepts such as the public sphere (Habermas, 1989). This cluster focuses on how citizens participate in the news processes (production, distribution) and how they discuss news on the wider societal level.
The concept of "ambient journalism" (Hermida, 2010) is central to the second cluster, depicted in green. It includes references to how user-generated content is becoming part of journalistic processes (e.g., Papacharissi; Oliveira, 2012). To a certain extent, this cluster could be understood as a subset of the dark blue cluster. The green cluster focuses on the consequences of social media and participation for journalistic narratives and content, whereas the dark blue one focuses on the civic or democratic consequences of such participation.
The third cluster, colored in light blue, is dominated by Lasorsa, Lewis and Holton (2012). It draws attention to how journalists are using social media in their work (e.g., Vis, 2013). It focuses on the influence of social media on the journalistic profession, on its practices, such as sourcing (Broersma; Graham, 2012) or news selection (Gans, 1979), and on its identity or ideology. In this connection, Deuze (2005) plays a relevant role by linking this cluster with the previous one. In fact, the three clusters are frequently linked to one another, showing a tightly intertwined field of study where concepts and constructs are commonly shared and referenced.
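The counts behind a co-citation network of the kind described above can be sketched in a few lines. Two documents are co-cited whenever they appear together in the reference list of the same citing article. The reference lists below are hypothetical, and the function names are illustrative, not the tooling used in the study.

```python
from collections import Counter
from itertools import combinations

# Hypothetical reference lists of three citing articles.
reference_lists = [
    ["Hermida 2010", "Lasorsa 2012", "Domingo 2008"],
    ["Hermida 2010", "Lasorsa 2012"],
    ["Domingo 2008", "Bruns 2008", "Hermida 2010"],
]


def cocitation_counts(ref_lists):
    """Count how often each pair of references is cited by the same article."""
    pairs = Counter()
    for refs in ref_lists:
        # Sorting makes (a, b) and (b, a) the same key.
        for pair in combinations(sorted(set(refs)), 2):
            pairs[pair] += 1
    return pairs


def citation_counts(ref_lists):
    """Count citations per reference; this drives node size in the network."""
    counts = Counter()
    for refs in ref_lists:
        counts.update(set(refs))
    return counts


edges = cocitation_counts(reference_lists)
nodes = citation_counts(reference_lists)
```

Here `edges` holds the link weights and `nodes` the citation counts. Grouping the nodes into clusters would require a separate community-detection step, which this sketch does not cover.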
Keyword co-occurrence analysis again shows three different clusters closely connected to one another (Figure 2). Each node represents a different keyword. Its size is proportional to the number of times the keyword appears in the sample.
The first cluster, depicted in dark blue, is organised around the idea of "participation" of users in the process of news production, as it includes the specific terms "user generated content" and "citizen journalism". This cluster also gathers concepts associated with democracy and the specific idea of the "public sphere". Thus, it is concerned with the effects of participation on civic life and democratic processes. This links the cluster with the first one identified in Figure 1.
The second cluster (light blue) includes the nodes with the highest number of documents. They represent generic topics and issues ("news", "internet", "journalism"). Around those nodes, which are central to journalism studies, appear other keywords specific to social media ("social media") and the two most studied platforms ("Twitter" and "Facebook"). A third set of smaller nodes, appearing on the periphery, can be interpreted as more specific issues within the cluster's interests.
Last, three single nodes ("journalist", "news media" and "power") constitute a third, smaller cluster referring to the roles which individual journalists play in the news media environment and their (dis)empowerment.
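A co-word analysis such as the one summarized above starts from keyword occurrence and co-occurrence counts. The sketch below uses hypothetical keyword lists and adds a Jaccard index as one possible way to normalize link strength. The text does not state which normalization was applied, so that choice is an assumption.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author keywords for three articles.
keyword_lists = [
    ["Social Media", "Journalism", "Twitter"],
    ["journalism", "twitter", "participation"],
    ["social media", "journalism", "public sphere", "participation"],
]


def normalise(keywords):
    """Lower-case and de-duplicate keywords so spelling variants merge."""
    return sorted({keyword.strip().lower() for keyword in keywords})


occurrences = Counter()    # node size: articles per keyword
cooccurrences = Counter()  # link weight: articles per keyword pair
for keywords in keyword_lists:
    terms = normalise(keywords)
    occurrences.update(terms)
    cooccurrences.update(combinations(terms, 2))


def jaccard(first, second):
    """Share of articles using either keyword that use both (assumed measure)."""
    first, second = sorted((first, second))
    together = cooccurrences[(first, second)]
    return together / (occurrences[first] + occurrences[second] - together)


print(round(jaccard("twitter", "journalism"), 2))
```

Normalizing case before counting matters in practice, because the same keyword is often capitalized differently across articles.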
The clusters detected by co-citation and co-word analysis can also be linked to the three main topics identified by content analysis (Table 2). Thus, "news" is linked to the second cluster identified in Figure 2. Meanwhile, "professionalism" is directly related to the third cluster depicted in Figure 1, which focuses on journalistic practices. "Audiences" appears as a topic intertwined with the cluster organised around the "participation" keyword (Figure 2) and with the first two clusters identified in Figure 1.
Smaller nodes can be traced to less frequent variables identified by content analysis. For instance, the "organizations" keyword could be identified with the variable "organizational communication/public relations" in the content analysis (Table 2). "Organizations" refers to the use of social media by journalistic and media companies, for example as venues for content promotion, and to the institutional guidelines or instructions drawn up to steer the use of social media by individuals.
"Coverage" refers to the analysis of how certain issues are portrayed on social media, for example the coverage of political campaigns. This node can be identified with the variable "discourse", as it surveys texts or discussions spread through social media. In this sense, the presence of "campaign" indicates the relevance of political and electoral communication in the field, in agreement with the presence of the "political communication & public opinion" topic (Table 2).
Regarding the platforms considered, most of the papers have either dealt with a general concept of "social media", with no specific platform in mind, or focused on Twitter (Table 3). However, when the yearly evolution is taken into account, 2016 marked a turning point in both trends. In 2017 Facebook appears as the most studied social media service.
The most popular research method has been quantitative content analysis, followed by in-depth interviews. From the geographical point of view, studies are dominated by European scenarios, objects and subjects (Table 3).

Discussion
Journalism studies about social media have grown in the last decade, not only in inclusive databases such as Google Scholar (Lewis; Molyneux, 2018) but also in selective indexes such as the Journal Citation Reports. Previous research has concluded that social media studies lack intellectual maturity and suffer from the Matthew effect. The same happens in social media studies regarding journalism. Just a small set of papers receives the highest number of citations in and outside the field (Table 1). The conceptual foundations of the field are restricted to fewer than eight references. The field still needs to develop a wider and more diverse set of works to build its conceptual and theoretical frameworks.
Yet, research on social media and journalism shows some signs of maturity. Whereas social media studies frequently reference professional or industry documents, journalism studies on social media mostly reference research papers published in peer-reviewed journals. Its foundations, though limited, come from scholarly research, not from the business sphere.
Another sign of maturity can be seen in the shift of focus from the general concept of "social media" to particular platforms (Table 3). These studies started out by considering a generic, monolithic idea of "social media", but the field now considers the specific features of each social media service. In this connection, the field follows Journalism studies, which have recognized that specific social media platforms offer specific features and consequently cause different effects (Hasell; Weeks, 2016). This trend can also be seen in the rise of studies which consider platforms other than the most popular, consolidated, mainstream services (Table 3). Yet, even though new platforms such as Instagram or Snapchat are growing and attracting new audiences such as younger users (Larsson, 2018), journalism studies still largely ignore such platforms.
Data employed in the field show a Western bias. This might not be surprising, considering that the same bias is present in social media (Stoycheff et al., 2017) and in journalism (Hanitzsch, 2019) studies. While social media studies are mostly based on data from the United States (Valenzuela; Piña; Ramirez, 2017), studies on social media and journalism focus on European settings. In any case, research on journalism and social media lacks evidence from other scenarios such as Latin America, Africa or Asia.
Social media have changed the model of news distribution from a transmission line to a dissemination network where users act as nodes (Carlson, 2016). In this new environment users spread news by deciding which news articles they share with their contacts on their different social media profiles. Social media and journalism studies need to take some steps to address this new environment more accurately.
First, they could incorporate social network analysis into their toolkit. Social network analysis has been applied outside journalism studies to analyze similar gatekeeping processes in multi-directional interactions (Welbers; Opgenhaffen, 2018). Greater emphasis on this method could therefore benefit the field and open new research opportunities.
Similarly, previous research (Schweisberger; Billinson; Chock, 2014) has noted that individuals' news-sharing behavior might vary depending on the topics covered by the news. It is therefore necessary to investigate other kinds of journalistic information beyond political communication, which dominates the research (Table 2).
Last, the way users select and distribute news content through social media is also influenced by visual components (Harcup; O'Neill, 2017). Approaches from visual communication, which are still marginal in this field (Table 2), should therefore be incorporated.

Conclusion
Social media and journalism is defined as a field with three main topical and intellectual clusters: audience participation, user-generated content and the influence of social media on journalistic professional values and practices. These clusters are closely linked to one another through frequent common references and concepts.
Journalism studies on social media show signs of maturity. For example, they no longer consider social media as a monolithic reality. Instead they have realized that each platform needs specific approaches. In a similar sense, researchers are widening their scope beyond consolidated mainstream platforms such as Facebook or Twitter.
The field is dominated by political information. The relationship between social media and other branches of journalism, such as science, sport or health journalism, is still largely unexplored. Scholars need to address how journalists and citizens use social media to create, consume and distribute news about topics other than political information.
A mixed methodological approach characterizes the field. Messages and news shared on social media are analyzed from a quantitative perspective. When the focus is placed on users' or journalists' attitudes and behaviors, qualitative methods are applied instead. In this sense, in-depth interviews are the most commonly used tool. This qualitative approach could be expanded or complemented by quantitative methods (surveys), which would make it possible to measure the phenomena, trends, attitudes and other constructs identified by in-depth interviews. Further use of surveys could provide a better understanding of how users and journalists alike behave on different social media services.
The conclusions drawn here are limited to research published in journals indexed by the Journal Citation Reports. Future studies might compare the results of this study with data from other databases such as Scopus or even Google Scholar.