When is statistical significance not significant?

Figueiredo Filho, Dalson Britto; Paranhos, Ranulfo; Rocha, Enivaldo C. da; Batista, Mariana; Silva Jr., José Alexandre da; Santos, Manoel L. Wanderley D.; Marino, Jacira Guiro

Abstract

This article provides a non-technical introduction to the p value statistic. Its main purpose is to help researchers understand the appropriate role of the p value in empirical political science research. Methodologically, we use replication, simulations, and observational data to show when statistical significance is not significant. We argue that: (1) scholars must always analyze their data graphically before interpreting the p value; (2) it is pointless to estimate the p value for non-random samples; (3) the p value is highly sensitive to sample size; and (4) it is pointless to estimate the p value when working with population data.
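Point (3) can be illustrated with a minimal sketch. It assumes a one-sample z test with known unit variance and a hypothetical standardized effect of 0.1; neither the test nor the effect size is taken from the article, and the function name is illustrative only. The effect is held fixed while the sample size grows, so any change in the p value comes from the sample size alone.

```python
import math


def two_sided_p_value(effect_size: float, n: int) -> float:
    """Two-sided p value for a one-sample z test with known unit variance.

    effect_size is the standardized mean difference (hypothetical input),
    n is the number of observations.
    """
    z = effect_size * math.sqrt(n)           # test statistic grows with sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


EFFECT = 0.1  # assumed small effect, identical for every sample size

for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5}  p = {two_sided_p_value(EFFECT, n):.3g}")
```

Under these assumptions the same small effect is far from conventional significance thresholds in the smallest sample and falls well below them in the largest, which is why a p value should be read alongside the effect size and the sample size.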

Keywords: p value statistics; statistical significance; significance tests


Publication Dates

Publication in this collection: 20 Aug 2013
Date of issue: 2013

History

Received: Aug 2012
Accepted: Apr 2013

This work is licensed under a Creative Commons Attribution 4.0 International License.
