Seven Reasons Why: A User's Guide to Transparency and Reproducibility

Despite a widespread agreement on the importance of transparency in science, a growing body of evidence suggests that both the natural and the social sciences are facing a reproducibility crisis. In this paper, we present seven reasons why journals and authors should implement transparent guidelines. We argue that sharing replication materials, which include full disclosure of the methods used to collect and analyze data, the public availability of raw and manipulated data, in addition to computational scripts, may generate the following positive outcomes: 01. production of trustworthy empirical results, by preventing intentional frauds and avoiding honest mistakes; 02. making the writing and publishing of papers more efficient; 03. enhancing the reviewers' ability to provide better evaluations; 04. enabling the continuity of academic work; 05. developing scientific reputation; 06. helping to learn data analysis; and 07. increasing the impact of scholarly work. In addition, we review the most recent computational tools to work reproducibly. With this paper, we hope to foster transparency within the political science scholarly community.

To make the problem even worse, there is no agreement as to what transparency, reproducibility and replication mean exactly (JANZ, 2016). Scholars from different fields use these concepts without a common understanding of their definition (PATIL, PENG and LEEK, 2016, p. 01). For example, King (1995) argues that "the replication standard holds that sufficient information exists with which to understand, evaluate, and build upon a prior work if a third party could replicate the results without any additional information from the author" (KING, 1995, p. 444). Jasny et al. (2011) define replication as "the confirmation of results and conclusions from one study obtained independently in another" (JASNY et al., 2011, p. 1225).
According to Hamermesh (2007), replication has three perspectives: pure, statistical and scientific. Seawright and Collier (2004) argue that "two different research practices are both called replication: a narrow version, which involves reanalyzing the original data, and a broader version based on collecting and analyzing new data" (SEAWRIGHT and COLLIER, 2004, p. 303). The plurality of definitions goes on 1.
______________________________________________________________________________________________
1 For example, "all data and analyses should, insofar as possible, be replicable (…) only by reporting the study in sufficient detail so that it can be replicated is it possible to evaluate the procedures followed and methods used" (KING, KEOHANE and VERBA, 1994, p. 26). Herrnson (1995) argues that replication, verification and reanalysis have different meanings. In particular, he states that "replication repeats an empirical study in its entirety, including independent data collection (…) replication increases the amount of information for an empirical research question and increases the level of confidence for a set of empirical generalizations" (HERRNSON, 1995, p. 452).

Dalson Figueiredo Filho, Rodrigo Lins, Amanda Domingos, Nicole Janz & Lucas Silva (2019) 13 (2) e0001 - 3/37

The meaning of reproducibility is also mutable. Goodman, Fanelli and Ioannidis (2016) argue that the current use of reproducible research was initially employed in computational science not for corroboration but transparency.
They also propose a new lexicon for research reproducibility in three dimensions: methods, results and inferences. The U.S. National Science Foundation (NSF) defines reproducibility as "the ability of a researcher to duplicate the results of a prior study using the same materials as were used by the original investigator" (NSF, 2015, p. 06). Regarding transparency, the terminology is also ambiguous. For example, according to Janz (2016), "working transparently involves maintaining detailed logs of data collection and variable transformations as well as of the analysis itself" (JANZ, 2016, p. 02). Moravcsik (2014) defines transparency as "the principle that every political scientist should make the essential components of his or her work visible to fellow scholars" (MORAVCSIK, 2014, p. 48). He also identifies three dimensions of transparency: data, analytic and production. Although these conceptual disagreements seem to be mainly semantic, they may also have scientific and policy implications (PATIL, PENG and LEEK, 2016). Therefore, to avoid conceptual misunderstandings, we adopt the following definitions:

Reproducibility: a set of computational functions (scripts) that allows one to reproduce the observed results exactly from raw data (PATIL, PENG and LEEK, 2016).
Transparency: the full disclosure of the research design, which includes the methods used to collect and analyze data, the public availability of both raw and manipulated data, in addition to the computational scripts employed along the way.
Source: Elaborated by the authors.
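Replication research, following Gary King's (1995) definition, often involves, as a first step, using the same data with the same statistical tools (this first step, which aims at verifying the research, could also be called 'duplication'). However, a full replication also collects new data to test the same hypothesis, or includes a new variable to test the same model. More broadly, we may define replication as the process by which a published article's findings are re-analyzed to confirm, advance or challenge the robustness of the original results (JANZ, 2016). The American Political Science Association (APSA) guidelines highlight the following features meant to enhance reproducibility: 01. data access; 02. data gathering description and 03. details on data transformation and analysis. In this paper, the concept of reproducibility is closely related to computational functions that allow the researcher to reproduce the reported results exactly using the raw data (PATIL, PENG and LEEK, 2016).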
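To make the idea of computational functions that reproduce the reported results exactly from raw data more concrete, the following Python fragment offers a minimal sketch. It reruns a toy analysis on the same raw data and compares a fingerprint of the two outputs. The function names (analyze, fingerprint) and the data are hypothetical and serve only as an illustration; they are not taken from any of the works cited here.

```python
import hashlib
import json
import statistics


def analyze(raw):
    """Hypothetical analysis step: summary statistics computed from raw data."""
    return {"n": len(raw), "mean": round(statistics.fmean(raw), 6)}


def fingerprint(result):
    """Return a stable SHA-256 digest of a result dictionary."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Illustrative raw data (not from any real study).
raw_data = [2.0, 4.0, 4.0, 5.0]

# Running the same script twice on the same raw data should give the same digest.
first = fingerprint(analyze(raw_data))
second = fingerprint(analyze(list(raw_data)))
print(first == second)
```

In such a setup, an author could publish the digest alongside the paper, and a third party running the shared script on the shared raw data could check whether they obtain the same value.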
Lastly, we should define transparency. In physics, the concept of transparency refers to objects that allow the transmission of light without appreciable scattering so that bodies lying beyond are seen clearly. Unlike translucent and opaque media, transparent media produce regular and well-defined trajectories (SCHNACKENBERG, 2009). In scientific parlance, transparency means full disclosure of the process by which the data are generated and analyzed.
According to King, Keohane and Verba (1994), "without this information we cannot determine whether using standard procedures in analyzing the data will produce biased inferences. Only by knowing the process by which the data were generated will we be able to produce valid descriptive or causal inferences" (KING, KEOHANE and VERBA, 1994, p. 23). Miguel et al. (2014) argue that transparent social science covers three core practices: "disclosure, registration and pre-analysis plans, and open data and materials" (MIGUEL et al., 2014, p. 30). Following this reasoning, we define transparency as the full disclosure of the research design, which includes the methods used to collect and analyze data, the public availability of both raw and manipulated data, in addition to the computational scripts employed along the way 2.
From now on, we employ replication materials as an empirical indicator to represent the concept of transparency.
______________________________________________________________________________________________
2 According to King, Keohane and Verba (1994), "if the method and logic of a researcher's observations and inferences are left implicit, the scholarly community has no way of judging the validity of what was done. We cannot evaluate the principles of selection that were used to record observations, the ways in which observations were processed, and the logic by which conclusions were drawn. We cannot learn from their methods or replicate their results. Such research is not a 'public' act. Whether or not it makes good reading, it is not a contribution to social science" (KING, KEOHANE and VERBA, 1994, p. 08).

In this paper, we present seven reasons why journals and authors should adopt transparent guidelines. We argue that sharing replication materials, which include full disclosure of the methods used to collect and analyze data, the public availability of both raw and manipulated data, in addition to the computational scripts that were used, may generate the following positive outcomes: 01. production of trustworthy empirical results, by preventing intentional frauds and avoiding honest mistakes; 02. making the writing and publishing of papers more efficient; 03. enhancing the reviewers' ability to provide better evaluations; 04. enabling the continuity of academic work; 05. developing scientific reputation; 06. helping to learn data analysis; and 07. increasing the impact of scholarly work.
Also, we review the most recent computational tools to work reproducibly. We are aware that this research note is no substitute for a careful reading of primary sources and materials of a more technical nature. However, we believe that scholars with no background on the subject will benefit from a document having predominantly pedagogical goals 3.
To the best of our knowledge, this paper provides the first attempt to bring both transparency and reproducibility to the Brazilian political science research agenda 4. Generally speaking, scholars in the United States and Europe produce the majority of work on the topic. Now is the time to expand scientific openness beyond the mainstream of our discipline. Also, we believe that we are on the right track, after the program of the XI Meeting of the Brazilian Political Science Association (ABCP) 5 was made public. For the first time, the program had a session on Replication and Transparency led by professors Lorena Barberia (USP) and Marcelo Valença (UERJ).
______________________________________________________________________________________________
3 Although our recommendations are focused on quantitative, variance-based methods, there is a growing literature on transparency in qualitative research. For instance, Moravcsik (2014) examines the main challenges posed for qualitative scholars. Elman and Kapiszewski (2014) evaluate if and how qualitative pundits can disclose more about the processes through which they generate and analyze data. To this end, the American Political Science Association (APSA) has produced the Guidelines for Data Access and Research Transparency for Qualitative Research in Political Science. In addition, the Qualitative Transparency Deliberations (<https://www.qualtd.net/>) "was launched in 2016 to provide an inclusive process for deliberation over the meaning, costs, benefits, and practicalities of research transparency, openness in qualitative political science empirical research". We are thankful to the referee of the BPSR on this matter. See also <https://www.qualtd.net/page/resources>.
4 We reviewed all articles published in four top Brazilian national journals between 2010 and 2017 and found no publication that deals with replication, reproducibility and transparency. We reviewed articles published in the following journals: 01. DADOS; 02. Brazilian Political Science Review; 03. Revista de Sociologia e Política and 04. Opinião Pública, using 'replication', 'transparency' and 'reproducibility' as keywords in both Scielo and Google Scholar. As far as we know, there is only one Brazilian political science journal that has a strict policy of data sharing before publishing the papers, namely, the Brazilian Political Science Review. See <http://bpsr.org.br/files/archives/database.html>. According to our keyword search, it seems that Galvão, Silva and Garcia (2016) produced the only paper on the subject published in a Brazilian journal. We also found a blog post from Scielo which is available at <http://blog.scielo.org/en/2016/03/31/reproducibility-in-research-results-the-challenges-ofattributing-reliability/#.Wn7IOudG02w>.
6 We are indebted to the BPSR referee who pointed out that our focus should be on journals' policies rather than on author ethos. According to her review, "it is not a problem of researchers' goodwill, but a matter of replication policies adopted by the journals, which are the gatekeepers of publications".
7 See <https://www.dartstatement.org/>.
Reason 01. Replication materials help to avoid disaster

Scientific frauds are underestimated (FANG et al., 2012). From medicine to physics, from chemistry to biology, it is not hard to find cases of scientific misconduct (FANELLI, 2009). For example, Baggerly and Coombes (2009) demonstrate how the results reported by Potti et al. (2006) vanish after some spreadsheet problems are fixed. Also, they provide evidence of data fabrication in the original study. As Young and Janz (2015) argued, "a veritable firestorm hit political science" (YOUNG and JANZ, 2015) when the LaCour and Green paper was retracted by Science due to the alleged use of fictional data. At the time, LaCour was a PhD student at the University of California at Los Angeles (UCLA). Using a field experiment, they reported that conversations with gay canvassers produce significant changes in voters' opinions on gay marriage 8. The paper was very influential but failed to be reproduced by two doctoral students from the University of California at Berkeley because, as it turned out later, the data was fabricated 9. The LaCour case was a wake-up call for political scientists against scientific misconduct. According to Simonsohn (2013), "if journals, granting agencies, universities, or other entities overseeing research promoted or required data posting, it seems inevitable that fraud would be reduced" (SIMONSOHN, 2013, p. 1875) 10. Thus, mandatory policies on reproducibility can reduce the likelihood of scientific frauds.
______________________________________________________________________________________________
8 A more detailed description of the case can be found here: <http://www.chronicle.com/article/What-Social-Science-Can-Learn/230645/>.
9 See: <http://www.chronicle.com/article/We-Need-to-Take-a-Look-at/230313/>.
10 This is not to say that reproducibility can always avoid scientific misconduct. Researchers can fabricate data and use the correct codes, but it is also possible to create fake data with forged scripts (which hopefully is not a widespread practice). We are thankful to BPSR reviewers on this specific matter.

Besides preventing intentional deception, reproducible research also avoids honest mistakes. Again, there are many examples of how minor errors can jeopardize an entire research enterprise. One of the most well-known relates to Reinhart and Rogoff's paper (2010), where a small error in the Excel data spreadsheet compromised the empirical results. As King (1995) noted more than 20 years ago, "the only way to understand and evaluate an empirical analysis fully is to know the exact process by which the data were generated and the analysis produced" (KING, 1995, p. 444). In 2016, Imai, King and Rivera found discrepant data in the work of Ana de La O (2013; 2015). In both of her works, De La O found a partisan effect of a nonpartisan programmatic policy. However, Imai, King and Rivera (2016) showed that her dataset contained observations that are likely to be labeled as outliers. Figure 02 reproduces this information.
For example, the turnout variable only varies from 0 to 100. When Imai, King and Rivera (2016) corrected the data, the reported effects vanished. If the dataset had been publicly available from the start, any student with minimal statistical training would have been able to spot Ana de La O's mistakes. Similarly, if all journals adopted a strict reproducibility policy that required both data and code, the reviewers would probably be the first to spot the mistake. Therefore, the first reason why journal editors should require reproducible materials is to prevent frauds and reduce mistakes.
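The kind of check that would reveal impossible turnout values is simple once the data are public. The Python fragment below is a minimal sketch, assuming turnout is recorded as a percentage that should lie between 0 and 100. The function name (out_of_range) and the values are hypothetical; they are not drawn from the datasets discussed above.

```python
from typing import Iterable, List, Tuple


def out_of_range(values: Iterable[float],
                 low: float = 0.0,
                 high: float = 100.0) -> List[Tuple[int, float]]:
    """Return (position, value) pairs that fall outside the interval [low, high]."""
    return [(i, v) for i, v in enumerate(values) if not (low <= v <= high)]


# Hypothetical turnout percentages; the last two values are impossible.
turnout = [54.2, 61.0, 48.7, 130.5, -3.0]

for position, value in out_of_range(turnout):
    print(f"observation {position}: turnout = {value}")
```

A reviewer or student with access to the replication files could run a check of this sort on every bounded variable before estimating any model.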

Reason 02. Transparency makes it easier to write papers
Journals should adopt transparency guidelines because they facilitate paper writing for their authors. In political science, King (1995, 2006) initially emphasized this rationale. Creating replication materials will force the researcher to plan every step of their work. This will help in having a clearer idea of the research design, as well as knowing exactly where the data is stored and how it was managed.
This would help at the time of the research, but also for future inquiries. In the aggregate, better individual papers would lead to a more robust collective scientific endeavour (MARKOWETZ, 2015). Ball and Medeiros (2012) explain that creating replication materials will make the researcher readily aware of potential problems and, consequently, submit more robust research to journals. They argue that when young scholars know that they will need to make their data management public, "their data management and analysis tend to be much more organized and efficient, and their understanding of what they are doing tends to be much greater than when they use their statistical software to execute commands interactively" (BALL and MEDEIROS, 2012, p. 188).
We also adopted replication exercises recently in our graduate (PhD) courses and have observed exciting results. Students are more involved and produce higher-quality final papers. For example, in 2015, two of our students started a replication paper project that went on to publication in a Qualis A2 journal 12. The same effect was reported by King (2006).
Replication also enhances researchers' writing.

rigorous review is to "support the highest quality science, public accountability, and social responsibility in the conduct of scientific research" (NIH, 2018, p. 02). Moreover, the NIH (2018) expects the reviewers to pay attention to relevant variables in the area of research. Markowetz (2015) argues that reproducibility is vital to increase reviewers' contribution to our work. We agree with him. Also, in order to suggest a new control variable or a more appropriate model specification, reviewers can work with the data themselves. Reviewers can also contribute to our coding by suggesting better scripts. We are not so naïve as to expect a full commitment of all reviewers in reanalyzing our data, but if both data and code are available from the start, it is easier to get better peer evaluations.

Markowetz (2015) points out that we can trace personal reproducibility problems by "documenting data and code well and making them easily accessible" (MARKOWETZ, 2015, p. 03). According to King (1995), "without adequate documentation, scholars often have trouble replicating their own results months later" (KING, 1995, p. 444). Finifter (1975) developed a taxonomy of nine replication strategies "to clarify and codify how replications contribute to confidence and cumulative advances in substantive research" (FINIFTER, 1975, p. 120). Goodman et al. (2015) offer a short guide on how to increase the quality of scientific data. Specifically, they propose ten rules to facilitate data reuse 14. In 2016, Science published the following Editorial Expression of Concern: "In the 03 June issue, Science published the Report "Environmentally relevant concentrations of microplastic particles influence larval fish ecology" by Oona M. Lönnstedt and Peter Eklöv. The authors have notified Science of the theft of the computer on which the raw data for the paper was stored. These data were not backed up on any other device nor deposited in an appropriate repository. Science is publishing this Editorial Expression of Concern to alert our readers to the fact that no further data can be made available, beyond those already presented in the paper and its supplement, to enable readers to understand, assess, reproduce or extend the conclusions of the paper" (BERG, 2016, p. 1242).
An unfortunate example shows how reproducibility enables continuity of academic work. In September 2017, Noxolo Ntusi fought against two thieves to protect her master's thesis. She explained that all the relevant information for her research was stored on a hard drive in her bag 15. Likewise, in November 2017, a PhD student offered a $5,000 reward for his data stolen in Montreal. He said, "I have a backup of the raw data left on equipment in the lab, but I do not have a backup of the data analysis that I have been doing for the last few years" (MONTREAL GAZETTE, 2017). In our own personal experience, we have observed a similar case: a student who used to keep the only copy of her dissertation on a 1.44 MB floppy disk. At some point, the file got corrupted, and she lost all her work. We can easily avoid
______________________________________________________________________________________________
14 Rules: "01. Love your data and help others love it too; 02. Share your data online, with a permanent identifier; 03. Conduct science with a particular level of reuse in mind; 04. Publish workflow as context; 05. Link your data to your publications as often as possible; 06. Publish your code (even small bits); 07. State how you want to get credit; 08. Foster and use data repositories; 09. Reward colleagues who share their data properly and 10. Be a booster for data science" (GOODMAN et al., 2015).
15 <http://www.bbc.com/portuguese/internacional-41261885>. Video of her fighting is available at <https://www.youtube.com/watch?v=gj7bhJpYEqM>.
Also, following Markowetz's (2015) remarks, if you ever have a problem with your data, "you will be in a very good position to defend yourself and to show that you reported everything in good faith" (MARKOWETZ, 2015, p. 04). For

In this case, Levitt argued that Lott's results were not reproducible and the plaintiff understood this passage as an accusation of academic dishonesty 17. Time, energy and resources could have been saved if Lott's replication materials had been made publicly available.
______________________________________________________________________________________________
16 We should note that BPSR, AJPS and APSR have different replication policies. While the BPSR only requires data sharing, AJPS requires both data and code, which are then replicated by journal staff members. It is important to share both data and code. We are thankful to the BPSR referee for this warning.

In short, transparency adds two gains to the reputation of journals as well as authors: the readers of your work will know that you took the time to make all your documentation available, thus giving it more credibility, and you are shielded and more likely to be protected from any misconduct charges, given that you did everything in good faith (and honest errors are human).Again, we should not rely solely on the authors' goodwill.Journal editors play a crucial role in this process, since they can change the incentives and induce open research practices by adopting mandatory editorial replication policies.For journals, the reputational benefit is clear: they can avoid negative publicity when an article published by them fails to replicate, must be corrected or even retracted.
Reason 06. Replication materials help to learn data analysis

King (1995) initially stressed this reasoning: "reproducing and then extending high-quality existing research is also an extremely useful pedagogical tool" (KING, 1995, p. 445). According to Janz (2016), replication provides students with "a better way to learn statistics: Replication is essential to a deeper understanding of statistical tests and modeling. The advantage over textbook exercises is that students use real-life data with all bugs and complications included".

Suppose that you are learning about data transformation. The primary purpose of your statistics professor is to teach you how to improve the interpretability of graphs when dealing with skewed data (see Figure 03). He shares with you a dataset on animals that has only two variables (body weight and brain weight) and asks which mathematical function should be applied in order to get a normal distribution. After some discussion, you learn that the natural logarithmic transformation is the right answer. Now imagine that the same professor hands you data on the relationship between campaign spending and electoral outcomes. In which scenario are you more likely to engage in active learning? On average, we would expect political science students to be more interested in money that flows to the political system than in animals' biological features.

Following Janz's (2016) reasoning, we believe that professors can foster a transparent culture by incorporating replication exercises into their academic activities. How exactly are learning purposes related to journal editorial policies? We argue the following: first of all, students can learn from replicating published work much more easily when the authors provide their materials. Second, if journals required mandatory replication materials, both students and experienced scholars would have to change their behavior when thinking about publishing. Students, particularly, would develop higher data analysis skills that are likely to improve the overall quality of paper submissions. In the end, journal editors would receive higher quality submissions and scientific endeavor would thrive.
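Returning to the data transformation exercise, the effect of the natural logarithm on skewed data can be checked in a few lines. The Python fragment below is a minimal sketch: it computes a simple population skewness coefficient before and after the transformation. The body weights are hypothetical values chosen to span several orders of magnitude, and the helper name (skewness) is ours.

```python
import math
import statistics


def skewness(xs):
    """Population skewness: the mean of the cubed standardized deviations."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)


# Hypothetical body weights in kilograms (illustrative values only).
body_weight = [0.02, 0.1, 0.5, 3.0, 10.0, 60.0, 500.0, 6000.0]

# Natural logarithm of each weight.
log_weight = [math.log(w) for w in body_weight]

print("skewness of raw weights:", round(skewness(body_weight), 2))
print("skewness of log weights:", round(skewness(log_weight), 2))
```

With values like these, the raw variable is strongly right-skewed, while the logged variable is much closer to symmetric, which is the pattern the classroom example is meant to convey.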

Reason 07. Replication materials increase the impact of scholarly work
There are few events more frustrating in academic life than being ignored by the profession, and empirical evidence suggests that "the modal number of citations to articles in political science is zero: 90.1% of all articles are never cited" (KING, 1995, p. 445). In other words, we write for nobody. We attend academic conferences, we apply for both internal and international grants, and we revise and resubmit the same paper as much as necessary in order to get it published in a high impact journal. Why do we invest time, resources and pride to get our message out there? Right answer: to advance scientific knowledge.
We argue that transparency is a crucial resource to increase the impact of scholarly work. According to Gleditsch, Metelits and Strand (2003), papers that share data are cited twice as often as those that do not. Similarly, Piwowar, Day and Fridsma (2007) reported that public data availability is associated with a 69% increase in citations, controlling for other variables. According to King (1995), "an article that cannot be replicated will generally be read less often, cited less frequently, and researched less thoroughly by other scholars" (KING, 1995, p. 445).
Gherghina and Katsanidou (2013) found a positive association between impact factor and the likelihood of adopting a transparent editorial policy, after controlling for age of the journal, frequency, language and type of audience. More recently, Strand et al. (2016) estimated an unconditional fixed effects negative binomial regression model with a sample of 430 articles from the Journal of Peace Research. They found that sharing data increases citations, even controlling for scholars' name recognition.
According to a Nature editorial on transparency in science, "the benefits of sharing data, not only for scientific progress but also for the careers of individuals, are slowly being recognized" (NATURE GEOSCIENCE, 2014, p. 777). Table 02 summarizes the guidelines of the Transparency and Openness Promotion (TOP).
Following academic literature, it seems that Level 03 citation standards would lead to higher academic impact. Two interesting examples illustrate how data sharing can enhance the impact of scientific work. The first comes from Paasha Mahdavi. According to his Google Scholar profile, the most cited work of his career is not a paper but a dataset 18. Overall, 48.10% of all his citations are concentrated in Oil and Gas Data (1932-2014). Similarly, John M. Powell designed his most influential paper to disseminate a time series cross-section dataset on military

In addition, more citations for authors are naturally also good for the journals that publish their work.
After explaining why journal editors should adopt more open research policies, the next step is to present both tools and techniques.

Tools for computational reproducibility
This section presents some of the most recent tools to work transparently.
Given the limitations of an essay, we briefly review the main features of each tool.

TIER Protocol
At the lowest level, you should adopt some documentation protocol to organize your files into folders. We suggest the latest version of the TIER Protocol. For security reasons, upload a copy to an online environment such as Dropbox, Google Drive and so forth. The Open Science Framework (we will talk about it later) is an online tool that fits the protocol model perfectly.
For obvious reasons, "do not spread your data over different servers, laptops and hard drives" (MARKOWETZ, 2015, p. 05) 20.
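Organizing files into folders can itself be scripted, so that every project starts from the same layout. The Python fragment below is a minimal sketch of that idea. The folder names in FOLDERS are hypothetical placeholders chosen for illustration and should not be read as the official TIER Protocol layout; readers should consult the protocol itself for its recommended structure. The function name (create_project) is also ours.

```python
import tempfile
from pathlib import Path

# Hypothetical folder names for illustration only (not the official TIER layout).
FOLDERS = ["original_data", "command_files", "analysis_data", "documents"]


def create_project(root: Path) -> list:
    """Create one sub-folder per entry in FOLDERS under root and return their paths."""
    created = []
    for name in FOLDERS:
        folder = root / name
        # parents=True creates root if needed; exist_ok=True makes reruns harmless.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Demonstration inside a temporary directory, so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    for folder in create_project(Path(tmp) / "my_paper"):
        print(folder.name)
```

Keeping everything under a single project root also makes it straightforward to upload one complete copy to an online environment, in line with the advice quoted above.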

Git and Github
Git is a version control system (VCS) for tracking changes in computer files and coordinating work on those files among many people.Its primary purpose is software development, but it can keep track of changes in any files.Markowetz (2015) also suggests using Docker 21 , which allows self-contained analysis and is easily transportable to other systems.An alternative tool to Git and Github is Mercurial, which is free and "handles projects of any size and offers an easy and intuitive interface" (GOODMAN et al., 2015, p. 05).
______________________________________________________________________________________________
20 In our teaching experience we came across the case of a master's student who stored his 'analysis data' on his ex-girlfriend's laptop. Regardless of how they broke up, we strongly advised him to upload his data to an online environment as soon as possible.

Pre-analysis plan (PAP)
A pre-analysis plan is a detailed outline of the analyses that will be conducted (CHRISTENSEN and SODERBERG, 2015, p. 26). The purpose of PAP is to reduce the incidence of false positives (SIMMONS et al., 2011). For those who wish to have a step-by-step view of a PAP, we recommend the website 'AsPredicted' 35. There, it is necessary to answer nine questions by filling in boxes. The questions include hypothesis (including the direction of the causal effect and its strength); dependent variable; outliers and exclusions, among others.
Similarly, the American Economic Association sponsors the AEA RCT Registry 36 , which allows scholars to pre-register their Randomized Controlled Trials (RCT) freely.
We are not saying that these tools have to be learned all at once. Each tool requires time, and some of them have a steeper learning curve, such as R Markdown. An important step is to teach these tools in both undergraduate and graduate courses, so students can learn as they acquire skills in other subjects. We firmly believe that if these tools are included in the researchers' methodological

Where to learn more about it?

Many sources can help scholars to learn more about reproducibility and transparency. Here, we emphasize some of those with a specific focus on political science empirical research.

Data access and research transparency (DA-RT) 37

In 2012, the American Political Science Association (APSA) Council

Conclusion
This article presents seven reasons why journals should implement transparent guidelines, and what the benefits for authors are.
We have argued, in particular, that replication materials should include full disclosure of the methods used to collect and analyze data, the public availability of both raw and manipulated data, in addition to the computational scripts. Also, we have described some of the tools for computational reproducibility and reviewed different learning sources on transparency in science. We are aware that transparency does not guarantee that empirical results will be free from intentional frauds or honest mistakes. However, it makes both much less likely, since openness increases the chances of detecting such problems. After

replication also collects new data to test the same hypothesis, or includes a new variable to test the same model. More broadly, we may define replication as the process by which a published article's findings are re-analyzed to confirm, advance or challenge the robustness of the original results (JANZ, 2016). The American Political Science Association (APSA) guidelines highlight the following

5 <https://cienciapolitica.org.br/sites/default/files/documentos/2018/07/programacao-xiencontro-abcp-2018-1371.pdf>

on Replication and Transparency led by professors Lorena Barberia (USP) and Marcelo Valença (UERJ). George Avelino and Scott Desposato, in particular, presented a paper on the subject focusing on the Brazilian case. The remainder of the article consists of four sections: the one following this, which explains the benefits of creating replication materials; then we describe some of the tools for reproducible computational research; we proceed by examining different sources where one can learn about transparent investigation, and we conclude by suggesting what could be done to foster openness within the political science scholarly community.

Seven reasons to require replication materials 6

Figure 01 shows what journal editors and authors should avoid. Transparent research requires full disclosure of the procedures used to obtain and analyze data as pointed out by the Data Access and Research Transparency protocol 7. The higher the transparency, the easier it is to replicate and reproduce the results. In what follows, we review Markowetz's (2015) reasoning on the importance of transparency as a standard scientific practice and we add two more motives why scientific journals should require reproducible materials, and why authors should follow these guidelines.

Figure 01. The miracle of Science

Figure 02. Reproducibility and outlier issues

For example, the journals Political Science Research and Methods (PSRM) and the American Journal of Political Science (AJPS) have strict replication policies 13. To have a paper accepted, authors must share replication materials (data and code) in advance. The journal's staff then reruns the analysis and must obtain the same results (tables and figures). Only then is the paper allowed to be published. If, for any reason, the results are not reproducible, the author must update the data and code. In this process, reviewers also evaluate the quality of replication materials, which is likely to increase their overall contribution to enhancing the paper. Therefore, when journals require the submission of replication materials already at the peer-review stage, they can improve the quality of such reviews.

Reason 04. Replication materials enable the continuity of academic work

When journals adopt transparency guidelines, they support continuity of academic work for their authors. During the 2014 Berkeley Initiative for Transparency in the Social Sciences (BITSS) workshop, Professor Ted Miguel asked who had had problems when trying to replicate their own papers. People slowly started to raise their hands, and we found ourselves in an embarrassing situation: we were all attending a transparency meeting, but most of us did not have a reproducible research culture. Markowetz (2015) points out that we can trace personal reproducibility problems by "documenting data and code well and making

creating replication materials and storing data, codes and other relevant information in online repositories. Unfortunately, we should not expect a natural behavioral change from authors. We should push for a change in scientific editorial policies, which would be more likely to induce higher compliance. According to Stockemer, Koehler and Lenz (2018), "if we as a discipline want to abide by the principle of research and data transparency, then mandatory data sharing and replication are necessary because many authors are still unwilling to share their data voluntarily or make unusable replication materials available" (STOCKEMER, KOEHLER and LENZ, 2018, p. 04).

Reason 05. Replication materials help to build scientific reputation

Ebersole, Axt and Nosek (2016) conducted a survey of adults (N = 4,786), undergraduates (N = 428), and researchers (N = 313) and found that "respondents evaluated the scientist who produces boring but certain (or reproducible) results more favorably on almost every dimension compared to the scientist who produces exciting but uncertain (or not reproducible) results" (EBERSOLE, AXT and NOSEK, 2016, p. 07). Although transparency is key to confidence in science, most political science academic journals do not adopt strict replication policies. Gherghina and Katsanidou (2013) examined 120 international peer-reviewed political science journals and found that only 18 (15%) had a replication policy. In Brazil, the Brazilian Political Science Review (BPSR) is the only journal that has a mandatory data sharing policy. This institutional decision represents a significant advance for Brazilian editors, and other journals should emulate it shortly. Internationally, both the American Journal of Political Science (AJPS) and the American Political Science Review (APSR) have recently adopted transparent data sharing policies 16.
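The verification step that journals with strict replication policies perform, rerunning the submitted code and checking that it yields the same results, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names `run_analysis`, `fingerprint` and `verify`, the toy data and the seed value are all hypothetical, and the sketch does not describe the actual procedure of any journal.

```python
import hashlib
import json
import random
import statistics


def run_analysis(data, seed=2019):
    """Toy 'analysis' (hypothetical): a seeded bootstrap of the sample mean."""
    rng = random.Random(seed)  # a fixed seed makes the resampling repeatable
    boot_means = []
    for _ in range(200):
        resample = [rng.choice(data) for _ in data]
        boot_means.append(statistics.mean(resample))
    # Round so that the stored results do not depend on print formatting.
    return {
        "mean": round(statistics.mean(data), 6),
        "boot_se": round(statistics.stdev(boot_means), 6),
    }


def fingerprint(results):
    """Hash a results dictionary independently of key order."""
    payload = json.dumps(results, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def verify(data, published_fingerprint, seed=2019):
    """Rerun the analysis and compare it with the published fingerprint."""
    return fingerprint(run_analysis(data, seed)) == published_fingerprint


if __name__ == "__main__":
    sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]  # illustrative data only
    published = fingerprint(run_analysis(sample))
    print("reproduced:", verify(sample, published))
```

The design choice worth noting is the fixed random seed: without it, any analysis that involves resampling or simulation would produce slightly different numbers on each run, and a third party could not tell an honest rerun from a discrepancy.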
2009 the U.S. Court of Appeals upheld a district court decision to dismiss John R. Lott's claims against Steven Levitt. The book Freakonomics mentioned Lott's name, and he filed a defamation suit against Levitt and HarperCollins, the publisher.

(JANZ, 2016, p. 05). From our teaching experience, we have observed that students are more motivated when working with real data and arriving at the same figures than when working on some dull, repetitive homework assignment unrelated to what they are studying. Creating replication materials can foster data analysis learning by providing examples for the students. For instance, Gujarati (2011) published a book called 'Econometrics by Example', where he teaches each topic by showing applied cases. It is easier for students to learn from examples related to their fields of study than from areas that are alien to them.

Dalson Figueiredo Filho, Rodrigo Lins, Amanda Domingos, Nicole Janz & Lucas Silva (2019) 13 (2) e0001

Figure 03. Body weight and brain weight

(3.0), created by the Project Teaching Integrity in Empirical Research (TIER) and based at Haverford College. The protocol "gives a complete description of the replication documentation that should be preserved with your study when you have finished the project" (TIER, 2017).

Figure 04. TIER Protocol Documentation

21 See: <http://www.docker.com/>.

(2012), R is starting to become the most popular programming language used for data analysis 22. The R Project for Statistical Computing (usually named R) is an open-source and free software project widely used for data compilation, manipulation and analysis. The use of free software increases research transparency since the openness allows any observer with some basic knowledge to investigate how the author performed the analyses. In other words, using R reduces the costs of replicating work. Figure 05 illustrates its popularity over time.

Figure 05. Software hits

approved the changes to the Ethics Guide to facilitate data access and increase research transparency (LUPIA and ELMAN, 2014). Article 06, for example, states that "Researchers have an ethical obligation to facilitate the evaluation of their evidence-based knowledge claims through data access, production transparency, and analytic transparency so that their work can 'be tested' or replicated" (DA-RT, 2012) 38. In addition to 'guiding' scholars, the DA-RT initiative sponsored the election research pre-acceptance competition to foster pre-registration practices among social science scholars 39.

Berkeley Initiative for Transparency in the Social Sciences (BITSS) 40

The BITSS initiative represents a significant effort to diffuse transparency in Social Science research. It provides high-quality training, prizes, research grants and educational materials to "strengthen the quality of social science research and evidence used for policy-making" (BITSS, 2017).
Most notably, BITSS has a catalyst program that formalizes a network of professionals to advance the teaching, practice, funding, and publishing of transparent social science research. According to the latest update, Brazil has only three official catalysts 41.

Project Teaching Integrity in Empirical Research (TIER) 42

As we have already mentioned, the TIER Project also seeks to promote the integration of principles related to transparency and reproducibility in the research training of social scientists. It offers a development workshop at Haverford College and supports several faculty members with previous experience in incorporating the principles and methods of transparency in courses on quantitative research methods, or in the supervision of independent student research. The current version of the TIER protocol is 3.0, and it is freely available 43.

venture-backed, for-profit, educational technology company that offers massive open online courses (MOOCs). Coursera works with universities and other organizations to make some of their classes available online, offering courses in subjects such as physics, engineering, humanities, medicine, biology, social sciences, mathematics, business, computer science, digital marketing, and data science, among others. Coursera offers a data science specialization, which includes training on reproducible research 57.

Scholars' blogs

Many scholars have been pushing reproducibility forward. An important effort to advance replication is the Political Science Replication blog edited by Nicole Janz (Nottingham University) 58. Regarding research, Janz (2016) also discusses how reproducibility can be included in current undergraduate and graduate courses. Other similar blog initiatives include Retraction Watch 59 and Simply Statistics 60.

Table 01. Replication, reproducibility and transparency

(KING, 2006)g goes as follows: a published paper has already gone through peer evaluation, which means that the reviewers have already considered it publishable once (KING, 2006). A Mansfield, Milner, and Rosendorff (2000)ing the dataset (cross-section, time-series or both); 02. applying new methods to the same data; 03. controlling for new variables; and 04. investigating an entirely new research question with the same data, to name a few possibilities. At first, one can argue that replication papers are harder to publish because they lack the sense of novelty that is attractive to publications. However, there are many examples of successful replications that were published in major journals. For instance: Fraga and Hersh (2010) published in the Quarterly Journal of Political Science by replicating data from Gomez et al. (2007); Bell and Miller (2013) published in the Journal of Conflict Resolution by replicating data from Rauchhaus (2009); and Dai (2002) published in the American Political Science Review by reexamining data from Mansfield, Milner, and Rosendorff (2000).

Reason 03. Replication materials can lead to better paper reviews

After months (sometimes years) of work, it is very disappointing to get a poor paper review. Some referees reject the paper or provide assessments that are too general and thus unlikely to improve our draft. According to the Center for Scientific Review of the National Institutes of Health (NIH), one of the significant roles of the reviewers is to ensure that the scientific foundation of the project, including reviewing the data, is sound. The document states that the goal of a

11 Information on the TIER Protocol can be found at <http://www.projecttier.org/tier-protocol/>.
12 See Montenegro and Mesquita (2017) at: <http://www.bpsr.org.br/index.php/bpsr/article/view/304>.

There are other management platforms, such as Taverna 32, Wings 33, and Knime 34.
toolkit, they will have better training to answer relevant scientific questions.
describing the benefits of transparency, it is essential to explain what can be done to foster a more open research culture among political science scholars. Here we emphasize the role of the following incentives: 01. mandatory replication policies for academic journals such as the Brazilian Political Science Review; 02. specific funding for replication studies such as the SSMART grants from BITSS; 03. replication as a required component of the curriculum in undergraduate and graduate courses alike, such as Professor Adriano Codato's syllabus at Parana Federal University; 04. institutional policies to count datasets as (citable) academic outputs, such as Dataverse; 05. conferences and workshops to disseminate transparency, such as BITSS and TIER workshops; 06. development of online platforms such as the Open Science Framework, Harvard Dataverse, the ReplicationWiki (HÖFFLER, 2017) and the Political Science Replication Initiative; 07. creation of journals designed to publish successful and unsuccessful replications; and 08. publication of methodological papers related to replication, reproducibility and transparency.

Science is becoming more professional, transparent and reproducible. There is no turning back. Sooner or later, scholars who do not follow the transparency movement are likely to be left behind. Theories that are not falsifiable are destined to die, as well as opaque research practices. With this paper, we hope to foster transparency in the political science scholarly community and thus help it to survive.