
A Mapping Study of Scientific Merit of Papers, which Subject are Web Applications Test Techniques, Considering their Validity Threats

Abstract

Progress in software engineering requires (1) more empirical studies of quality, (2) increased focus on synthesizing evidence, and (3) more theories to be built and tested; in addition, the validity of an experiment is directly related to the level of confidence in the process of experimental investigation. This paper presents the results of a qualitative and quantitative classification of the threats to the validity of software engineering experiments, covering a total of 92 articles published in the period 2001-2015 that deal with software testing of Web applications. Our results show that 29.4% of the analyzed articles do not mention any threats to validity, 44.2% do so briefly, and 14% do so judiciously. This leaves a question: do these studies have scientific value?

Keywords
Applications; Conclusion; Construction; External; Internal; Literature; Mapping; Qualitative; Quantitative; Reviews; Study; Systematic; Tests; Threats; Validity; Web

INTRODUCTION

According to Moran (1995), the fear of a nuclear catastrophe, and of being unable to survive it, prompted scientists to create a non-hierarchical access structure that was implemented in universities: a network designed for military use, the Internet. Since then, the network has been used for all kinds of research and business.

Mendes (2006) reports that the importance of the Internet for businesses is incontestable. Companies wishing to expand their business must be on the Web, and the Web site is the digital interface of the organization.

According to McGraw (2006), vulnerabilities found in Web applications place those applications, and consequently the companies that use them, at risk. They can lead to leakage of inside information, which may impair the business model and cause great economic impact.

A 2003 study conducted by the Business Internet Group San Francisco (BIG-SF, 2003) reported that about 70% of Web sites and Web applications contain defects. In addition to financial costs, defects in Web applications result in loss of credibility.

As a solution to the problem of flaws in Web applications, studies indicate that we should increase the amount of testing in order to improve their quality. This is a difficult task, as pointed out by Doğan et al. (2014), since the heterogeneity of Web applications makes them prone to errors and difficult to test. This heterogeneity comes from the different languages used to develop applications that must work together in harmony, and from their client/server architecture with (asynchronous) HTTP request/response calls to synchronize the application state.

Li et al. (2014) show that software testing has been widely used to ensure the technical quality of the various artifacts of a software project. In fact, software testing is a critical part of every high-quality software production process, and manufacturers devote approximately 40% of their time to testing in order to ensure the quality of the application produced.

However, contrary to the preconceived notion that software testing is used to demonstrate the absence of errors, testing is the process of finding as many errors as possible in order to improve software quality. This is because demonstrating the absence of errors in software requires testing all possible combinations of a given set of inputs, which is technologically and economically infeasible for any non-trivial system.
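To give a sense of scale for the infeasibility argument, the following sketch counts the input combinations of a hypothetical function that takes two independent 32-bit integers and estimates how long exhaustive testing would take. The function shape and the rate of one billion test executions per second are assumptions made only for illustration.

```python
# Hypothetical scenario: a function with two independent 32-bit integer inputs.
BITS_PER_INPUT = 32
NUM_INPUTS = 2
TESTS_PER_SECOND = 1_000_000_000  # assumed, optimistic execution rate

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_test_count(bits_per_input: int, num_inputs: int) -> int:
    """Number of distinct input combinations an exhaustive test must cover."""
    return 2 ** (bits_per_input * num_inputs)


def years_to_run(test_count: int, tests_per_second: int) -> float:
    """Wall-clock years needed to execute every test exactly once."""
    return test_count / tests_per_second / SECONDS_PER_YEAR


total = exhaustive_test_count(BITS_PER_INPUT, NUM_INPUTS)
years = years_to_run(total, TESTS_PER_SECOND)
print(f"{total} combinations, roughly {years:.0f} years of execution")
```

Even under these generous assumptions the run would take several centuries, and real Web applications have far larger input spaces.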

The quality assurance of Web applications, based on software testing, demands a large research effort in order to find cost-efficient methods for different aspects of testing (Engström, 2013). Examples include the selection of test cases based on code and specification changes, the evaluation of selection techniques, change impact analysis, and regression testing for different applications. However, the application of effective methods and techniques in software testing requires that empirical studies be carried out and that these studies report valid results.

The focus of this article is not on software testing itself, but on the threats to the validity of the empirical studies used to establish the processes and methods created to implement such tests. The goal of this research is to map, quantitatively and qualitatively, the threats to the validity of empirical studies on the relevant types of testing in the Web environment over the period 2001-2015, and to find out whether these papers have the empirical rigor necessary to have scientific value.

Section 1 describes and classifies threats to the validity of empirical studies and presents the theoretical basis behind systematic literature reviews. Section 2 describes our research method. Section 3 presents the research questions. Section 4 is devoted to the threats to validity to which our study is subject. Finally, the conclusions present the qualitative and quantitative results of this systematic literature mapping.

CONCEPTUAL FRAMEWORK

Man prefers to believe what he prefers to be true. (Francis Bacon)

Empirical Studies in Software Engineering

According to Zelkowitz and Wallace (1998), software engineering experiments do not yet have the maturity observed in other fields of science, because many papers do not follow a specific methodology and do not use appropriate validation techniques.

Experimental Software Engineering (ESE) aims to leverage the scientific method in this area. According to Boehm et al. (2005), Victor R. Basili is credited with the first experiments designed to increase the rigor of research in software engineering and with the creation of tools to enhance the quality of work, generating reliable evidence.

According to Miller (2000), the reliability of evidence depends on the type of study carried out and on the quality with which its findings are replicated, because the results of a single experiment are rarely reliable. To reach a reliable conclusion, it is necessary to combine the results of several experiments that share common variables and are linked to the same hypothesis (replication). Among the reasons that call the reliability of a study's evidence into question are flaws in the formulation of hypotheses, errors in study design, and errors in the execution of the study.

An experimental study is an action carried out with the purpose of revealing something unknown or of testing a hypothesis. The researcher defines the data collection procedure, the data analysis technique, and the meaning of the results (Basili et al., 1999). Every experimental study should describe its goals, method, results, and conclusions. The research method establishes the accuracy of the techniques and procedures adopted and the validity of the evidence obtained from the study. The results section, besides the results themselves, discusses threats to validity, limitations of the method used, and the study variables. The conclusion states whether the hypothesis described in the objective is confirmed or rejected and points out future implications.

The main characteristics of an experimental study are: hypothesis, population, treatment, applicator, and variables. Hypothesis: the proposition, theory, or assumption that can explain a certain behavior, regardless of whether it is confirmed or rejected, and that is free of human intention. Population: according to Juristo and Moreno (2001), in software engineering the population of an experimental study is divided into people, products, problems, and processes.

Empirical studies are usually carried out on a sample of the population (units). A single experiment consists of applying a combination of operations or treatments to a unit of study. Treatment: the action to be applied to an object of the experimental study. It is specified as a set of controlled variables, called factors, according to the experimental design. Each factor is observed at a number of levels, which are the possible values of that factor during the study. A treatment therefore defines the factors and levels to be applied to an element of the sample population. Variables: the controlled features and the measures in an experimental study.
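The relationship between factors, levels, and treatments can be sketched as follows. The example assumes a full factorial design, in which every combination of levels is a treatment; the factor names and levels are hypothetical and are not taken from any of the reviewed studies.

```python
from itertools import product

# Hypothetical factors and their levels (illustrative only).
factors = {
    "testing_technique": ["model_based", "capture_replay"],
    "application_size": ["small", "large"],
}


def treatments(factors: dict[str, list[str]]) -> list[dict[str, str]]:
    """Enumerate every treatment of a full factorial design.

    A treatment assigns exactly one level to each factor.
    """
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


all_treatments = treatments(factors)
for treatment in all_treatments:
    print(treatment)
```

With two factors of two levels each, the design yields four treatments, each of which would be applied to one or more units of the sample.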

Threats to the validity of empirical studies

Perry et al. (2000) define threats to the validity of an empirical study as the influences that may limit our ability to interpret and draw conclusions from the study data.

According to Wohlin et al. (2012), a fundamental question regarding the results of an experiment is how valid they are. Validity should be taken into account in the study design phase, because design decisions can greatly influence the validity of the study's results. The samples used in the study are drawn from a population of interest, and for the outcome to be adequate the results must be valid for this population. Subsequently, the study results can be generalized to a broader population.

According to Campbell and Stanley (1963) and Cook and Campbell (1979), there are four types of threats to the validity of an experiment:

  1. Conclusion Validity: Threats that may lead to inaccurate conclusions being drawn from observations.

  2. Internal Validity: Threats that may have affected the results and have not been properly taken into account.

  3. Construction Validity: Threats on the relationship between theory and observation.

  4. External validity: Threats that affect the generalizability of the results.

Threats to conclusion validity concern the ability to draw the correct conclusion about the relationship between the treatment and the outcome of an experiment. These threats are subdivided into low statistical power, violated assumptions of statistical tests, fishing and the error rate, reliability of measures, reliability of treatment implementation, random irrelevancies in the experimental setting, and random heterogeneity of subjects.

Threats to internal validity are causal influences that may affect the independent variable without the researcher's knowledge and thereby threaten the conclusion drawn from the experiment. They are subdivided into single-group threats, multiple-group threats, and social threats. Single-group threats: history, maturation, testing, instrumentation, statistical regression, selection, mortality, and ambiguity about the direction of causal influence. Multiple-group threats: interactions with selection. Social threats: diffusion or imitation of treatments, compensatory equalization of treatments, compensatory rivalry, and resentful demoralization.

Construct validity concerns the generalization of the results of the experiment to the concept or theory behind the experiment. Some threats relate to the experimental design, others to social factors. Design threats to construct validity cover issues related to the design of the experiment and its ability to reflect the construct being studied. Social threats arise because subjects and experimenters may change their behavior simply because they are part of an experiment, which produces false results.

The last of the forms of threat defined by Campbell and Stanley (1963) and Cook and Campbell (1979) is the threat to external validity. These threats are conditions that limit our ability to generalize the results of our experiment to industrial practice. There are three types of interaction with the treatment: people, place, and time.

Systematic literature reviews

Kitchenham et al. (2010) argue that empirical researchers in software engineering should conduct their studies on an evidence-based footing, as researchers in medicine and sociology do, and that the most reliable evidence comes from the aggregation of empirical studies on a given topic.

Systematic literature reviews are a means of aggregating knowledge about a software engineering topic or research question, and they aim to be as impartial as possible so as to be auditable and repeatable.

Systematic literature reviews are classified into conventional studies and mapping studies. Conventional studies aggregate results related to a particular research question, for example: is testing technique "A" more effective at defect detection than technique "B"? If there are enough comparable primary studies with quantitative estimates of the difference between the methods, meta-analysis can be used to perform an aggregation based on formal statistics. However, meta-analysis has been found to be rarely usable in systematic literature reviews of software engineering, because there are often too few primary studies.
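As an illustration of what "aggregation based on formal statistics" can mean, the sketch below pools effect estimates with the standard fixed-effect, inverse-variance method. This technique is chosen here only as a common textbook example; it is not claimed to be the method used by any study cited in this paper, and the effect sizes and standard errors are invented.

```python
import math


def fixed_effect_pool(effects: list[float], std_errors: list[float]) -> tuple[float, float]:
    """Pool study effects with inverse-variance weights (fixed-effect model).

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    Returns the pooled effect and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Invented estimates of the difference in defect detection between
# technique "A" and technique "B" from three hypothetical primary studies.
effects = [0.30, 0.10, 0.25]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = fixed_effect_pool(effects, std_errors)
print(f"pooled effect = {pooled:.3f}, standard error = {pooled_se:.3f}")
```

The pooled standard error is smaller than that of any single study, which is the statistical counterpart of the argument that a single experiment is rarely reliable on its own.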

Mapping studies are designed to find and classify primary studies in a specific subject area. They have coarse-grained research questions such as "What do we know about topic x?" and can be used to identify the available literature before performing conventional systematic literature reviews. They use the same methods for searching and extracting data as conventional systematic literature reviews, but rely mostly on the tabulation of primary studies in specific categories. In addition, some mapping studies are more concerned with how academics conduct research in software engineering than with what is known about a given software engineering topic. The study reported in this paper is a mapping study.

RESEARCH METHOD

The articles reviewed were based on the primary evidence of three empirical studies (Engström et al., 2010; Engström and Runeson, 2011; and Doğan et al., 2014). To enlarge our research universe, an automated search covering the period from 2001 to 2015 was conducted to find papers that contained empirical studies, referred to types of tests applicable to the Web environment, and discussed the threats to validity related to these studies.

Search strategy

3.1.1. Search strings

The articles (references) of the base studies (Engström et al., 2010; Engström and Runeson, 2011; and Doğan et al., 2014) were retrieved for this study. For the automated search, the following string was used: "web application testing" OR "web applications tests" OR "test of web applications" OR "testing web applications" OR "tests for web applications" AND "validity threats" OR "threats to validity".
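The search string above mixes OR and AND without parentheses, and search engines differ in how they resolve such expressions. The sketch below builds the query with each group of synonyms enclosed in parentheses. This grouping is an assumption about the intended query, not something stated in the source, and the function names are hypothetical.

```python
# Terms copied from the search string; the grouping into two sets is assumed.
TOPIC_TERMS = [
    "web application testing",
    "web applications tests",
    "test of web applications",
    "testing web applications",
    "tests for web applications",
]
VALIDITY_TERMS = ["validity threats", "threats to validity"]


def or_group(terms: list[str]) -> str:
    """Join quoted phrases with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups: list[str]) -> str:
    """Require one match from every group by joining the groups with AND."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(TOPIC_TERMS, VALIDITY_TERMS)
print(query)
```

Keeping the terms in lists also makes the search easier to audit and repeat, since the exact string submitted to each engine can be regenerated.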

3.1.2. Search engines

http://ieeexplore.ieee.org

http://dl.acm.org

http://scholar.google.com

http://academic.research.microsoft.com

http://citeseerx.ist.psu.edu

http://www.sciencedirect.com

Selection strategy

In the first step, the studies by Engström et al. (2010), Engström and Runeson (2011), and Doğan et al. (2014) formed the basis for the development of this study. All the references of these articles were looked up in the search engines listed in item 3.1.2, and we downloaded only the files that were publicly available. Then, based on title and abstract, we selected articles on a variety of topics linked to Web application testing (from the creation of algorithms to test methodologies), so as to have a larger and more varied universe of studies. This resulted in 42 articles.

In a second step, the search strings mentioned in item 3.1.1 were applied to the same search engines, and the procedure described above was used to select 50 new articles. Adding the articles selected from the base papers to those selected through the search strings, we reached a total of 92 papers for this study.

Figure 1
Strategy of search and selection of papers.

RESEARCH QUESTIONS

How many empirical studies related to Web application testing make reference to their own threats to validity?

The results showed that, of the 92 articles studied, 65 (E1 to E29 and E60 to E95) make reference to threats to validity (70.7%). Of these, 16 articles (E3, E60, E62, E66, E69, E70, E71, E73, E74, E77, E78, E79, E85, E87, E89 and E93), or 17.4% of all articles, cite threats but do not classify them. The remaining 27 articles (E30, E32, E33, E34, E35, E36, E37, E38, E39, E40, E41, E42, E43, E44, E46, E48, E49, E50, E51, E52, E53, E54, E55, E56, E57, E58, E59) do not mention their threats to validity (29.3%).
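The counts and relative frequencies reported for this question can be re-derived directly from the paper identifiers listed above. The sketch below does so; the helper names are hypothetical, and percentages are rounded to one decimal place.

```python
TOTAL_PAPERS = 92


def ids(numbers) -> set[str]:
    """Turn paper numbers into identifiers such as 'E12'."""
    return {f"E{n}" for n in numbers}


# Identifier ranges and lists as given in the text.
mention = ids(list(range(1, 30)) + list(range(60, 96)))
no_mention = ids([30, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44,
                  46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59])
unclassified = ids([3, 60, 62, 66, 69, 70, 71, 73, 74, 77, 78, 79,
                    85, 87, 89, 93])


def percent(count: int, total: int = TOTAL_PAPERS) -> float:
    """Relative frequency as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Consistency checks on the groups.
assert mention.isdisjoint(no_mention)
assert unclassified <= mention
assert len(mention) + len(no_mention) == TOTAL_PAPERS

for label, group in [("mention threats", mention),
                     ("cite but do not classify", unclassified),
                     ("do not mention threats", no_mention)]:
    print(f"{label}: {len(group)} papers ({percent(len(group))}%)")
```

The identifiers run up to E95 although there are 92 papers, because E31, E45 and E47 do not appear in either group.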

For studies that mention the threats to validity, which are the most frequently cited threats?

According to the survey, threats to validity are cited in the following descending order of frequency: external, internal, construction, and conclusion. Table 1 presents the threats as named in the articles. The complete tables listing the articles are available in Annex 1.

Table 1
Frequency of threats cited in the articles by type.

How many studies mention threats to validity but do not classify them (external, internal, construction and conclusion)?

A total of 16 papers were found: E3, E60, E62, E66, E69, E70, E71, E73, E74, E77, E78, E79, E85, E87, E89 and E93, all having an item on threats to validity, but not classified into external, internal, construction or conclusion.

How many studies mention threats to validity in summary form, and how many do so thoroughly?

In the survey, the number of studies that discuss threats to validity judiciously is significantly smaller than the number that do so in summary form. Table 2 shows the number of threats described briefly and judiciously.

Table 2
Number of threats described briefly and judiciously.

What is the yearly relative frequency of articles making reference to threats to validity?

According to the study, 2014 was the year with the most articles mentioning their threats to validity, a total of 29 articles. Table 3 presents the articles that refer to threats to validity, organized by year of publication.

According to the study, 2010 was the year with the most articles that do not cite their threats to validity, a total of 5 articles. Table 3 also presents the articles that make no reference to threats to validity, organized by year of publication.

Table 3
Papers that do and do not reference threats to validity, organized by year of publication.

Which types of research are used?

According to our study, five types of research are found in the group studied, listed here in descending order of quantity: Empirical Study, Systematic Literature Review, Mapping Study, Survey Study and Qualitative Research. Table 4 displays the number of papers by type of research.

Table 4
Number of papers by type of research.

Internal validity

Research questions

The set of defined questions may not have covered the whole area of threats to validity in Web application testing. As we consider this a plausible threat, we discussed the issue in depth in order to calibrate the questions. Thus, even if we have not selected the best set of questions, we sought to address in depth the questions most frequently asked and considered in the field.

How many studies of each research type take into account each type of threat to validity?

Table 5 summarizes the relation between threat types and types of research.

Table 5
Threats to validity by type of research.
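A cross-tabulation of this kind can be computed from one record per paper. The sketch below shows one way to do it; the records are invented placeholders, not data extracted from the 92 papers, and the function name is hypothetical.

```python
from collections import Counter

# Placeholder records: (type of research, threat types the paper discusses).
# These values are invented for illustration only.
records = [
    ("Empirical Study", {"external", "internal"}),
    ("Empirical Study", {"external"}),
    ("Mapping Study", {"external", "construction", "conclusion"}),
    ("Survey Study", set()),
]


def cross_tabulate(records) -> Counter:
    """Count papers for every (research type, threat type) pair."""
    table = Counter()
    for research_type, threats in records:
        for threat in threats:
            table[(research_type, threat)] += 1
    return table


table = cross_tabulate(records)
for (research_type, threat), count in sorted(table.items()):
    print(f"{research_type:15} {threat:12} {count}")
```

Because a `Counter` returns zero for missing keys, combinations that never occur, such as a survey study discussing external validity in this placeholder data, need no special handling.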

THREATS TO THE VALIDITY OF THIS STUDY

Conclusion validity

In this article we focus on threats to the validity of empirical studies in Web application testing. We found that 32% of the studies make no reference to any threats to validity. Of the studies that do (the other 68%), only 14% do so judiciously. The large disparity between the two sets leaves no room for unrealistic inferences with regard to threats to the conclusion validity of our study.

Selected articles

It is possible that some relevant studies were not chosen during the research process. We mitigated this threat, as far as possible, by following the references of the primary studies and by using the search string to obtain a larger universe of research.

Validity of Construction

One aspect of construct validity is the threat that experimenters can influence the results of a study, consciously or unconsciously, based on what they expect from the experiment.

In this study we do not see construct validity as a real threat because, when we started the study, we had no preconceived ideas about the results. We simply chose the articles in a comprehensive manner and took the numbers found in their contents. In fact, we were surprised by the result.

External validity

Wohlin et al. (2012) state that if an experiment is carried out in a limited scope and its design addresses issues related to that scope, its results are valid in that context; in that case, adequate validity does not mean general validity. If general conclusions are to be drawn from the experiment, its validity must cover a broader context. On this basis, we conclude that, within the 92 studies used, our results are valid. However, if we consider the whole universe of articles related to the testing of Web applications, the number of selected articles is small compared with the size of that universe. Had we chosen a larger number of studies, the conclusions drawn might have been different. We therefore think there is a real threat to the external validity of this study, because we cannot generalize it to the whole universe of articles.

FINAL REMARKS

According to Sjøberg et al. (2007), progress in software engineering requires (1) more empirical studies of quality, (2) increased focus on synthesizing evidence, and (3) more theories to be built and tested. Therefore, we need more studies that we can trust, especially regarding their validity. To obtain this confidence, the information collected should rely more on observation and experimentation than on deductive logic or mathematics. For this, we must use the same empirical methods used by the sciences that study real-world phenomena, that is, the empirical sciences.

According to Wainer (2007) and Travassos et al. (2002), for an experiment to be considered valid, it must provide a high level of confidence in the process of experimental investigation. This level of trust must permeate all elements involved in the process, from the theoretical basis adopted up to the final results. It is also important that the experiment take into account all the threats to its validity: external, internal, construction and conclusion.

In order to have a wide range of studies that enable us to analyze more accurately the validity threats of software engineering experiments in Web application testing, we selected a number of articles with subjects ranging from algorithms to methodologies for web testing developed for such purposes.

The analysis of this wide range of papers shows that approximately 32% of the articles do not mention any threats to validity. Of the remaining 68%, only 14% do so judiciously. Within the set of papers that mention validity threats (68%), one quarter do not classify them. In the last five years the number of articles mentioning their threats to validity has increased, and 2014 was the leading year with 29 articles. The most frequently cited validity threats are, in descending order: external, internal, construction and conclusion.

Our research is also subject to validity threats. The first, conclusion validity, is not critical since, as a mapping study, we restricted ourselves to finding the relative frequency of the papers that mention threats to validity (68%) against those that do not mention any threats (32%). We consider that there is a small risk that our study suffers from internal validity threats, because we addressed the most frequently asked questions in this field. The selection of articles started with the papers referenced in the primary studies used, followed by a search string to enlarge our universe of research. We do not think that the construct validity of our systematic mapping study is under real threat, because we had no preconceived ideas about the findings; the study consisted simply of collecting and counting information. Finally, we believe that external validity is the greatest threat to our study. Although we trust the results found in the 92 papers analyzed, we think that there is a non-negligible chance of having missed some major studies in which all validity threats were properly discussed.

Reiterating our previous observations, although this study is not totally immune to validity threats, the calculated figures lead us to the conclusion that the studies published in the time frame analyzed lack scientific rigor, on the part of both authors and readers, and that 73.6% of the studies have no scientific value.

Bibliographic references

  • BASILI, V. R., SHULL, F., and LANUBILE, F. (1999). Building knowledge through families of experiments. IEEE Transactions on Software Engineering, 25.4: 456-474.
  • BOEHM, BARRY; ROMBACH, HANS DIETER; ZELKOWITZ, MARVIN V. (2005). Foundations of Empirical Software Engineering: The Legacy of Victor R. Basili 83. Germany, Springer-Verlag.
  • BUSINESS INTERNET GROUP SAN FRANCISCO (BIG-SF) (2003): Black Friday Report on Web Application Integrity, The Business Internet Group of San Francisco, http://www.tealeaf.com/news/press_releases/2003/0203.asp. Accessed 17-Oct-2014.
  • CAMPBELL, D.T., STANLEY, J.C. (1963). Experimental and Quasi-experimental Designs for Research. Boston, Houghton Mifflin Company.
  • COOK, T.D., CAMPBELL, D.T. (1979). Quasi-experimentation - Design and Analysis Issues for Field Settings. Boston, Houghton Mifflin Company.
  • DOĞAN, SERDAR, AYSU BETIN-CAN, and VAHID GAROUSI (2014). Web application testing: A systematic literature review. Journal of Systems and Software 91: 174-201.
  • ENGSTRÖM, EMELIE, PER RUNESON, and MATS SKOGLUND (2010). A systematic review on regression test selection techniques. Information and Software Technology 52.1: 14-30.
  • ENGSTRÖM, EMELIE, and PER RUNESON (2011). Software product line testing-a systematic mapping study. Information and Software Technology 53.1: 2-13.
  • ENGSTRÖM, EMELIE (2013). Supporting Decisions on Regression Test Scoping in a Software Product Line Context-from Evidence to Practice. Dissertation. Lund University.
  • JURISTO, NATALIA; MORENO, ANA M. (2001). Basics of Software Engineering Experimentation. Kluwer Academic Publishers.
  • KITCHENHAM, BARBARA, et al. (2010). Systematic literature reviews in software engineering-a tertiary study. Information and Software Technology 52.8: 792-805.
  • KITCHENHAM, BARBARA A.; DYBA, TORE; JORGENSEN, MAGNE (2004). Evidence-based software engineering. International Conference on Software Engineering (ICSE '04). Edinburgh, UK: [s.n.], 2004. p. 273-281.
  • LI, YUAN-FANG, PARAMJIT K. DAS, and DAVID L. DOWE (2014). Two decades of Web application testing - A survey of recent advances. Information Systems 43: 20-54.
  • McGRAW, G. (2006). Software Security - Building Security In. São Paulo, Addison-Wesley.
  • MENDES, RENATO (2006): Quando a empresa não sabe o que fazer do seu site. http://www.milajuns.com.br/quando-a-empresa-nao-sabe-o-que-fazer-de-seu-site/. Accessed 25-Oct-2014.
  • MILLER, JAMES (2000). Applying meta-analytical procedures to software engineering experiments. Journal of Systems and Software, 54.1: 29-39.
  • MORAN, JOSÉ MANUEL (1995). Novas tecnologias e o reencantamento do mundo. Tecnologia educacional 23.126: 24-26.
  • PERRY, DEWAYNE E., ADAM A. PORTER, and LAWRENCE G. VOTTA (2000). Empirical studies of software engineering: a roadmap. Proceedings of the Conference on The Future of Software Engineering. ACM.
  • PRESSMAN, R. (1992). Software Engineering - A Practitioner's Approach. New York, Third Edition, McGraw Hill International Edition.
  • SCALET, D. (1995). Avaliação da Qualidade do Produto de Software. Workshop da Qualidade e Produtividade em Software e IX SBES/SBC.
  • SHULL, F., SINGER, J., and SJOBERG, D. I. K. (2008). Guide to Advanced Empirical Software Engineering. London, Springer-Verlag.
  • SJØBERG, D. I., DYBÅ, T., and JØRGENSEN, M. (2007). The Future of Empirical Methods in Software Engineering Research. In 2007 Future of Software Engineering (May 23 - 25, 2007). International Conference on Software Engineering. IEEE Computer Society, Washington, DC, 358-378.
  • TRAVASSOS, G. H., GUROV, D.; AMARAL, E. A. G. (2002): Introdução à Engenharia de Software Experimental. COPPE/UFRJ, Rio de Janeiro, Relatório técnico: RT-ES-590/02, http://cultura.ufpa.br/cdesouza/teaching/methods/6-ES-Experimental.pdf. Accessed 12-Dec-2014.
  • WAINER, J. (2007): Métodos de pesquisa quantitativa e qualitativa para a Ciência da Computação. http://www.pucrs.br/famat/viali/mestrado/mqp/material/textos/Pesquisa.pdf. Accessed 01-Nov-2014.
  • WOHLIN, C., RUNESON, P., HÖST, M., OHLSSON, M. C., REGNELL, B., and WESSLÉN, A. (2012). Experimentation in software engineering. Berlin, Springer-Verlag Berlin Heidelberg.
  • ZELKOWITZ, MARVIN V., WALLACE, DOLORES R. (1998). Experimental models for validating computer technology. IEEE Computer 31.5: 23-31.

ANNEX


References of Primary Studies

Publication Dates

  • Publication in this collection
    18 Oct 2018
  • Date of issue
    2018

History

  • Received
    04 July 2016
  • Accepted
    03 May 2018
TECSI Laboratório de Tecnologia e Sistemas de Informação - FEA/USP Av. Prof. Luciano Gualberto, 908 FEA 3, 05508-900 - São Paulo/SP Brasil, Tel.: +55 11 2648 6389, +55 11 2648 6364 - São Paulo - SP - Brazil
E-mail: jistemusp@gmail.com