Using Software R in research in occupational therapy

Ramos, Maysa Marinho Antunes; Ramos, Pedro Luiz; Louzada, Francisco; Barba, Patrícia Carla de Souza Della

doi:10.4322/2526-8910.ctoCB1625

Abstract

This paper presents a simple guide for occupational therapy researchers who want to perform basic statistical analyses flexibly and independently using R, a free, open-source software whose popularity has grown considerably in many fields. We give step-by-step instructions for installing the software and describe how to load a data set and carry out basic analyses: sample size calculation, descriptive statistics, graphical presentation, hypothesis tests, and the linear correlation test. The data set comes from a study in occupational therapy, and the topics covered reflect the statistical procedures most often needed during a course of the Graduate Program in Occupational Therapy at the Federal University of São Carlos (UFSCar), which made it possible to identify the main procedures researchers use in their applications.

Keywords:
Statistical Analysis; Software R; Occupational Therapy

Resumo

O presente artigo tem como objetivo fornecer subsídios para que pesquisadores em terapia ocupacional possam realizar procedimentos de estatística básica de maneira mais flexível e independente, a partir do sistema R, um software livre e gratuito, cuja popularidade vem aumentando consideravelmente no âmbito acadêmico. Apresentar-se-á, assim, o passo a passo de como instalar e utilizar o programa na realização de leitura e sumarização de dados, bem como no cálculo de estatísticas básicas, nas representações gráficas de dados quantitativos e qualitativos, no cálculo do tamanho amostral e na efetuação de testes de hipóteses e de teste de correlação linear. O banco de dados utilizado nas demonstrações apresentadas é oriundo de uma pesquisa em terapia ocupacional e as análises aqui realizadas resultam de uma demanda compreendida no decorrer de uma disciplina obrigatória do Programa de Pós-Graduação em Terapia Ocupacional da Universidade Federal de São Carlos (UFSCar), por meio da qual foi possível levantar os principais procedimentos estatísticos utilizados pelos pesquisadores em suas pesquisas.

Palavras-chave:
Análise Estatística; Software R; Terapia Ocupacional

1 Introduction

The responsibility of professionals to base their practice on scientific evidence is central to research in occupational therapy, since the choice of intervention tools and strategies depends on reliable precedents that give greater validity to the work to be performed (KIELHOFNER, 2006).

However, the systematic nature of scientific research calls for a range of knowledge that goes beyond what is common to each area, requiring researchers to draw on external resources in order to carry out and understand sufficiently consistent studies (SAMPAIO; MANCINI; FONSECA, 2002).

Statistical resources are among these: they are in strong demand yet often avoided by researchers from other areas.

An article by Ottenbacher and Petersen (1985), published in The American Journal of Occupational Therapy, discussed the implications of the increasing use of quantitative procedures in the occupational therapy literature and revealed that

the expansion of a research literature in the profession has been accompanied by an emerging sophistication in the use of research models and statistical analyses (OTTENBACHER; PETERSEN, 1985, p. 240).

According to Sampaio, Mancini and Fonseca (2002), occupational therapists need to be critical producers and consumers of information. However, statistical topics often discourage them, and many report skipping the statistical analysis section when reading scientific articles (KIELHOFNER, 2006). It is therefore essential, whenever possible, to learn how to handle these resources, since they are available to help researchers and professionals in the arduous but essential journey of consolidating the profession.

Software that generates statistics quickly and easily is now available (KIELHOFNER, 2006). Beyond practicality, such procedures can also be carried out with software that is both free to modify and free of charge, since the technological barrier raised by proprietary software also limits contact with a step that is essential to much research.

The use of free software has recently intensified, partly because of continued government incentives. Besides aiming to reduce costs, increase competition and generate jobs, the government sees, above all, greater independence and collaboration in producing and disseminating the knowledge needed for the country's technological development. Data from the Federal Data Processing Service (Serpro) point to savings for the Federal Government of approximately R$ 370 million from the use of free software in recent years, a significant figure given the many other demands to be met (COSTA, 2009).

However, despite this movement in favor of free software, a great deal of money is still spent on licenses that are not only expensive but also expire. In the academic field, researchers depend heavily on paid software for statistical analyses. Besides its high cost, such software is restricted to a few computers and laboratory spaces that often cannot be accessed routinely by everyone.

Among the options spreading through the scientific community as replacements for paid software, R (R CORE TEAM, 2018) stands out as a free, multiplatform and expandable software that is gaining popularity in the academic field and may, in the coming years, surpass paid software such as SAS, SPSS, Statistica and Minitab in use. Nothing prevents an institution or researcher from using paid software if they wish; the point is to have other possibilities with clear advantages at their disposal and to choose the one that best suits their needs.

Since there is no available literature that addresses the use of R in the scope of occupational therapy, this article aims to instruct researchers in the area to use this software to obtain basic statistics, giving them greater independence and scientific flexibility.

The data set used in the demonstrations presented here comes from a study in occupational therapy titled “The Ages and Stages Questionnaires Brazil (ASQ-BR) as an instrument for screening development in the context of early childhood education” by Della Barba (2014). The main objective of that study was to analyze how children attending early childhood education in a municipality in the interior of the state of São Paulo performed on an American developmental screening instrument. The analyses performed here (reading and summarizing data, sample size calculation, hypothesis tests, and linear correlation test) respond to a demand identified during a course of the Graduate Program in Occupational Therapy of the Federal University of São Carlos (UFSCar).

Therefore, the article is organized as follows:

Section 1 presents the R software, the steps for its installation and its layout. Section 2 demonstrates how to read data in R and add other information. Section 3 deals with data summarization, from the calculation of descriptive statistics (variance and standard deviation) to graphical representations of both quantitative and qualitative data. Section 4 explains how to calculate the sample size for simple random samples. Section 5 shows the commands for performing a hypothesis test in two populations. Finally, Section 6 demonstrates how to perform the calculations that verify correlations between variables.

1.1 Software R

R was initially developed by Ross Ihaka and Robert Gentleman and later extended by collaborators from around the world. It is a computational program for statistical and graphical operations that are widely needed for the treatment, systematization, and dissemination of data (R CORE TEAM, 2018).

Since other programs serve the same purpose, it is worth listing the advantages that set R apart from its competitors.

First, it is free software, so researchers can propose new subroutines and implement new methods of analysis as needed. Second, it is free of charge, so it has no expiration date and can be used with more flexibility. Third, it is multiplatform: it runs on Windows, Macintosh, and Unix/Linux. Fourth, it is expandable, offering everything from the most basic to the most complex procedures; for example, new statistical techniques published in journals are usually accompanied by packages with functions implemented in R, which makes these methodologies easy to access and apply.

This explains R's popularity relative to other programs. Figure 1 shows that the decline of SPSS in the academic field has been accompanied by the rise of R.

Figure 1
Number of accesses to different statistical software in Google Scholar.

1.1.1 Installation

Step 1 - Access the link available in FIOCRUZ (2019).
Step 2 - Choose the platform on which to run R.
Step 3 - Click on “install R for the first time”.
Step 4 - Click on “Download R 3.3.3 for”.

Once the installation procedure is complete, the software is ready for use (Figure 2). Note that a more recent version may be available at step 4. The software is updated constantly to accommodate new technologies; however, the procedures discussed here are the same for any version.

Figure 2
Installation of R.

1.1.2 Layout

The R layout has a window entitled “console”. It is a space in which the user will insert, change or save the data and the analysis codes to be performed (Figure 3).

Figure 3
R Console.

It is advisable to open a new script window to make the work easier: data and commands can be organized there and sent directly to the console, without typing errors, using the key combination Ctrl+R. To open this companion window, go to “File” and click on “New script”.

2 Data Reading

This section shows how to read data into R, using the scores in Table 1.

Gender	Category	Scores
M	X1	25	60	55	55	60	60	55	60	55	60
	X2	60	60	55	55	60	60	60	15	60	55
	X3	55	40	30	30	55	60	45	60	60	50
	X4	50	35	40	55	55	50	50	50	60	60
	X5	50	50	50	60	60	55	50	60	60	45
F	Y1	25	60	50	50	40	45	55	60	50	40	60
	Y2	50	55	50	60	50	35	60	60	60	50	55
	Y3	20	30	50	40	30	40	55	50	55	45	50
	Y4	35	40	45	60	50	45	60	50	60	30	60
	Y5	30	60	40	45	50	25	40	40	45	60	60
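As an illustration, the scores in Table 1 could be typed directly into a script and combined into a single data frame. This is only a minimal sketch: the object names `male`, `female`, `scores` and the helper `to_long` are hypothetical choices made for this example, not commands taken from the original study, and only base R functions are used.

```r
# Scores copied from Table 1, one numeric vector per category.
# The names male, female, scores and to_long are illustrative assumptions.
male <- list(
  X1 = c(25, 60, 55, 55, 60, 60, 55, 60, 55, 60),
  X2 = c(60, 60, 55, 55, 60, 60, 60, 15, 60, 55),
  X3 = c(55, 40, 30, 30, 55, 60, 45, 60, 60, 50),
  X4 = c(50, 35, 40, 55, 55, 50, 50, 50, 60, 60),
  X5 = c(50, 50, 50, 60, 60, 55, 50, 60, 60, 45)
)
female <- list(
  Y1 = c(25, 60, 50, 50, 40, 45, 55, 60, 50, 40, 60),
  Y2 = c(50, 55, 50, 60, 50, 35, 60, 60, 60, 50, 55),
  Y3 = c(20, 30, 50, 40, 30, 40, 55, 50, 55, 45, 50),
  Y4 = c(35, 40, 45, 60, 50, 45, 60, 50, 60, 30, 60),
  Y5 = c(30, 60, 40, 45, 50, 25, 40, 40, 45, 60, 60)
)

# Stack a list of score vectors into "long" format: one row per score,
# with columns identifying the gender and the category.
to_long <- function(groups, gender) {
  data.frame(
    gender   = gender,
    category = rep(names(groups), lengths(groups)),
    score    = unlist(groups, use.names = FALSE),
    stringsAsFactors = FALSE
  )
}

scores <- rbind(to_long(male, "M"), to_long(female, "F"))

str(scores)           # structure of the resulting data frame
table(scores$gender)  # number of scores recorded for each gender
```

Keeping the data in long format, with one score per row, is a design choice that tends to make later steps such as summaries by group or plots by category simpler; a single vector such as `male$X1` can still be used on its own whenever only one category is needed.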
