AgroR: An R package and a Shiny interface for agricultural experiment analysis

ABSTRACT.

Statistical analysis is central to agricultural research, but the complexity of statistical methodologies and programming languages, such as R, often poses challenges for researchers. To address these difficulties, we present AgroR, a comprehensive R package and Shiny web application (https://uel.br/fisher.uel.br/AgroR_shiny) designed to streamline the analysis of agricultural experiments. AgroR supports a wide range of experimental designs, offering tools for analysis of variance, multiple comparison tests, and assumption validation, as well as functions for exploratory data analysis and graphical representation. The package is built for accessibility, allowing users with limited programming skills to perform advanced statistical analyses using an intuitive interface. The Shiny application enhances usability by providing a graphical interface that simplifies the running of statistical tests and visualization of results. AgroR includes functions for analyzing complex experimental designs, such as factorial and split-plot designs, and offers additional tools for graphical outputs and dataset management. Available through the CRAN repository and accessible via a web browser, AgroR has been widely adopted, with thousands of downloads and citations across the scientific literature. AgroR significantly lowers the barriers to statistical analysis in agricultural research by providing a user-friendly interface and robust statistical capabilities, thereby enabling more accurate and reliable conclusions.

Keywords:
experimental statistics; software R; agriculture

Introduction

Experimental statistics play a critical role in agricultural science, encompassing every stage of experimentation, from planning and execution to data analysis and interpretation. Although the foundations of experimental statistics date back nearly a century to Ronald A. Fisher’s pioneering work in 1926, a persistent gap remains between theoretical statistical knowledge and its practical applications in agricultural research. This gap can hinder researchers from fully utilizing the potential of statistical tools in their studies.

Analysis of variance (ANOVA) is among the most common statistical methods applied in agricultural experiments, as it efficiently partitions variances across known or unknown sources of variation. However, for ANOVA to produce valid results, four key assumptions must be satisfied: (i) normality of errors, (ii) homogeneity of variances, (iii) independence of errors, and (iv) additivity of factors (Knief & Forstmeier, 2021). Despite the importance of these assumptions, researchers often neglect them because of limited statistical knowledge or misconceptions regarding their relevance (Bailar, 1986). Violating these assumptions can compromise the reliability of parametric tests, which in turn may result in erroneous conclusions regarding treatment effects (Lúcio et al., 2012; Martin & Storck, 2008; Melo et al., 2020; Xu et al., 2013).

The growing complexity of agricultural experiments has driven the development of numerous software tools to assist researchers in conducting statistical analyses (Nunes et al., 2015). The R language (R Core Team, 2022) has gained prominence, particularly among high-impact journals, owing to its open-source nature, extensive user community, and flexibility. However, many users, particularly those who engage with R only sporadically, find it difficult to master both statistical theory and the R programming language, which can be a barrier to broader adoption within the scientific community.

While R offers a wide array of add-ons, known as packages, to enhance its functionality (Arnhold, 2014; Kormann et al., 2019), many available tools for conducting essential analyses such as ANOVA, residual analysis, and contrast tests are often too complex for users without a strong statistical background. This challenge is further magnified in more intricate experimental designs (Arnhold, 2013). In response to these challenges, Shiny applications were introduced in 2012, enabling the development of graphical user interfaces that simplify user experience (Chang et al., 2022). These Shiny applications can be hosted on servers, allowing users to access them via a web browser without requiring local installation, provided they have Internet access.

Given the current challenges in making statistical methods more usable, this study aims to present and summarize the AgroR package and its accompanying Shiny application. AgroR was developed to streamline the analysis of agricultural experiment data, offering an intuitive and user-friendly interface with a wide range of statistical functions. These tools have the potential to significantly simplify the data analysis process, making robust statistical methods more accessible to agricultural researchers and thereby enhancing the quality of agricultural research.

Material and methods

Package development

The package was developed using the R software (version 4.2.1; R Core Team, 2022) and the RStudio interface (version 1.2.5001). The development process incorporated roxygen2 (Wickham et al., 2017) for documenting functions and devtools (version 1.3.2; Wickham & Chang, 2017) to facilitate package development. The package compiles functions from other packages and R's base system, streamlining them to facilitate user analysis and comprehension. For the ANOVA with fixed effects, the aov() function from base R (stats package) was used, whereas data transformation was performed using the Box and Cox (1964) method through the MASS package (Ripley et al., 2013). Multiple comparison tests, such as Tukey, Duncan, and least significant difference, and non-parametric tests, such as Kruskal-Wallis and Friedman, were implemented based on the agricolae package (Mendiburu & Simon, 2015). The Scott-Knott clustering method was also included. The lmtest (Zeileis & Hothorn, 2002), nortest (Gross & Ligges, 2015), and car (Fox et al., 2012) packages were used for assumption testing in ANOVA. All graphics were generated using the ggplot2 package (Wickham & Chang, 2017).
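To make the building blocks named above concrete, the following minimal sketch fits a fixed-effects ANOVA with aov() and estimates a Box-Cox transformation with MASS::boxcox(). It is not AgroR's internal code: the data are simulated, and the object names (dat, trat, resp, lambda_hat) are assumptions chosen for illustration.

```r
# Illustrative sketch with base R and MASS; simulated data, not an AgroR dataset.
library(MASS)

set.seed(1)
dat <- data.frame(
  trat = factor(rep(c("T1", "T2", "T3", "T4"), each = 5)),
  resp = c(rlnorm(5, 2.0, 0.3), rlnorm(5, 2.3, 0.3),
           rlnorm(5, 2.6, 0.3), rlnorm(5, 2.9, 0.3))
)

# Fixed-effects one-way ANOVA (completely randomized design)
fit <- aov(resp ~ trat, data = dat)
anova_table <- summary(fit)[[1]]

# Box-Cox profile log-likelihood over a grid of lambda values
bc <- boxcox(resp ~ trat, data = dat,
             lambda = seq(-2, 2, by = 0.1), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]

# Apply the transformation (log when lambda is zero) and refit
dat$resp_bc <- if (abs(lambda_hat) < 1e-8) {
  log(dat$resp)
} else {
  (dat$resp^lambda_hat - 1) / lambda_hat
}
fit_bc <- aov(resp_bc ~ trat, data = dat)
```

In practice, the lambda value is usually rounded to a convenient power (for example 0 for a logarithm or 0.5 for a square root) before the model is refitted.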

Shiny application development

The AgroR application is a graphical interface built in R (version 4.2.1; R Core Team, 2022), utilizing the shiny package (Chang et al., 2022) to create a web application that facilitates the planning and analysis of various experimental designs. A user interface and server environment were developed using the shiny package. The application can be accessed via the server of the State University of Londrina (https://fisher.uel.br/AgroR_shiny) or in its Portuguese version (https://fisher.uel.br/AgroR_shiny.pt), with intellectual property registration number BR512023003032-5.

Results and discussion

AgroR package

Functions for experimental design

The sketch function is designed to develop an experimental layout based on the selected experimental design (Figure 1). This function allows for the creation of sketches for experiments in a completely randomized design (CRD), randomized block design (RBD), Latin square design (LSD), two- and three-factor factorial experiments, strip plots, split-plots in CRD and RBD, and split-split plots in RBD. The trat and r arguments are mandatory, and depending on the experimental design, trat1 and trat2 must also be specified. The design argument is used to change the experimental design, with CRD being the default. The user can add alleys using the add.streets.x and add.streets.y arguments, specifying the vectors that define each street (Figure 1). It is also possible to provide the sketch in ID format and generate a .csv file based on the experimental layout. For more details, refer to ?sketch.
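The randomization behind such layouts can be sketched in a few lines of base R. The example below is only an illustration of the idea for CRD and RBD, not the implementation of the sketch function; the treatment labels, the number of replications, and the object names crd and rbd are assumptions.

```r
# Illustrative randomization for two designs; not the AgroR sketch() code.
set.seed(42)
trat <- c("A", "B", "C", "D")  # treatment labels (hypothetical)
r <- 3                          # number of replications

# CRD: all treatment-by-replicate units are randomized over the whole area
crd <- matrix(sample(rep(trat, times = r)), nrow = r, byrow = TRUE)

# RBD: each block (row) receives every treatment once, in its own random order
rbd <- t(replicate(r, sample(trat)))
rownames(rbd) <- paste0("Block", seq_len(r))

print(crd)
print(rbd)
```

The difference between the two designs is visible in the output: a row of the CRD layout may repeat a treatment, whereas every row of the RBD layout contains each treatment exactly once.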

Functions for experimental analysis

The package implements analyses for experiments in CRD, RBD, and LSD, based on single-factor or factorial analyses with two or three factors, with or without additional treatments, split-plots, split-split plots, factorial split-plots, and joint analysis of experiments (Figure 2). The package also offers hypothesis testing for one or two population means. Factors can be quantitative or qualitative, as defined by the user. The tests used to check the ANOVA assumptions can be chosen by the user: normality of errors (Shapiro-Wilk [default], Lilliefors, Cramér-von Mises, Shapiro-Francia, and Anderson-Darling), homogeneity of variances (Bartlett [default], Hartley, and Levene), and independence of errors (Durbin-Watson).
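As a rough illustration of what these checks involve, the sketch below applies the two default tests (Shapiro-Wilk and Bartlett) with base R and computes the Durbin-Watson statistic directly from its definition. It does not reproduce AgroR's output; the simulated data, the 5% threshold, and the object names are assumptions.

```r
# Illustrative assumption checks with base R only; simulated CRD data.
set.seed(10)
dat <- data.frame(
  trat = factor(rep(paste0("T", 1:4), each = 6)),
  resp = rnorm(24, mean = rep(c(10, 12, 14, 16), each = 6), sd = 1.5)
)

fit <- aov(resp ~ trat, data = dat)
res <- residuals(fit)

# Normality of errors: Shapiro-Wilk test on the model residuals
norm_test <- shapiro.test(res)

# Homogeneity of variances: Bartlett test across treatments
homog_test <- bartlett.test(resp ~ trat, data = dat)

# Independence of errors: Durbin-Watson statistic, d = sum(diff(e)^2) / sum(e^2)
# (values near 2 suggest no first-order autocorrelation)
dw_stat <- sum(diff(res)^2) / sum(res^2)

alpha <- 0.05  # assumed significance level
assumptions_ok <- c(
  normality   = norm_test$p.value > alpha,
  homogeneity = homog_test$p.value > alpha
)
print(assumptions_ok)
print(dw_stat)
```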

Figure 1
Example of experimental layouts using the AgroR package and/or application.

Figure 2
Diagram of experimental designs and their respective functions in the AgroR package.

If an assumption is violated, the package provides a warning about the invalidity of the model and suggests potential solutions. Within the function, the user can define non-parametric tests for CRD and RBD or perform data transformations (Box & Cox, 1964) for other experimental designs. Notably, when the data are transformed, the final plots use the means of the original data. The implemented non-parametric statistical methods include the Kruskal-Wallis test for CRD with corrections using various methods (Holm, Hommel, Hochberg, Bonferroni, BH, BY, and FDR) or the Friedman test for RBD using the least significant difference method. Dunn and Dunnett tests are also available. Alternatively, for CRD and RBD, when the nature of the variable is known, it is possible to use a generalized linear model with binomial, quasibinomial, Poisson, or quasipoisson families, with contrasts performed using the multcomp (Hothorn et al., 2016) and emmeans (Lenth, 2023) packages.
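The non-parametric route can be illustrated with functions from the stats package. This is a sketch under stated assumptions rather than AgroR's procedure: the package relies on agricolae for its comparisons, whereas the example below substitutes pairwise Wilcoxon tests with a Holm correction as the post-hoc step, and all data are simulated.

```r
# Illustrative non-parametric analyses with base R; simulated data.
set.seed(7)

# CRD: Kruskal-Wallis test followed by Holm-adjusted pairwise comparisons
crd <- data.frame(
  trat = factor(rep(c("A", "B", "C"), each = 8)),
  resp = c(rexp(8, 1), rexp(8, 0.5), rexp(8, 0.2))
)
kw <- kruskal.test(resp ~ trat, data = crd)
pw <- pairwise.wilcox.test(crd$resp, crd$trat, p.adjust.method = "holm")

# Correction methods accepted by p.adjust (includes holm, hommel, hochberg,
# bonferroni, BH, BY and fdr)
print(p.adjust.methods)

# RBD: Friedman test on a blocks-by-treatments matrix
rbd <- matrix(rexp(15), nrow = 5,
              dimnames = list(block = paste0("B", 1:5),
                              trat = c("A", "B", "C")))
fr <- friedman.test(rbd)

print(kw$p.value)
print(pw$p.value)
print(fr$p.value)
```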

Graphical functions

Most analysis functions return graphical outputs. However, the package also provides additional graphical representation options. The user must define the output of the respective analysis as an object and allocate it to the desired graphical function (Figure 3). Most graphical parameters are imported from the original output of the respective function. The implemented graphical functions range from simple plots, such as bar graphs similar to those returned by analysis functions, to more complex ones, such as bar_graph2, sk_graph, barfacet, bargraph_onefactor, bargraph_twofactor, barplot_positive, and others. The package also includes functions for analyzing correlations between variables, such as plot_cor, corgraph, and cor_ic.

Figure 3
Example of graphs generated by analysis functions belonging to the "graphics" class: bar_graph (A); sk_graph (B); bar_graph2 (C); barfacet (D).

Datasets

The AgroR package is dedicated to agricultural science, especially agronomy. The 22 datasets originate from experiments conducted at the State University of Londrina, literature examples (Barbin, 2013), or fictional data. These datasets demonstrate the use of all the functions implemented in the package. The data can be called using the data() function.

Additional functions

Finally, several useful functions exist, each with a specific purpose. Noteworthy are the sketch function mentioned earlier, a function for calculating the area under the disease progress curve, which is primarily used in plant pathology (Shaner & Finney, 1977), and a function for calculating confidence intervals. Additionally, to facilitate the construction of tables with results, the summarise_anova (Figure 4) and summarise_dunnett functions were created, allowing users to simplify the results of multiple analyses into a single output for qualitative factors and experiments involving up to two factors of interest.
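The area under the disease progress curve is commonly computed with the trapezoidal rule, summing (y_i + y_{i+1}) / 2 × (t_{i+1} − t_i) over consecutive assessments. The helper below is a minimal sketch of that calculation; the function name audpc and the example values are hypothetical and do not correspond to the AgroR function or to data from the package.

```r
# Illustrative trapezoidal AUDPC; 'audpc' is a hypothetical helper name.
audpc <- function(time, severity) {
  stopifnot(length(time) == length(severity), length(time) >= 2)
  # mean severity of each pair of consecutive assessments times the interval
  sum((head(severity, -1) + tail(severity, -1)) / 2 * diff(time))
}

days <- c(7, 14, 21, 28)   # assessment days (made-up values)
sev  <- c(5, 10, 30, 50)   # disease severity in % (made-up values)

audpc(days, sev)
# 52.5 + 140 + 280 = 472.5
```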

Figure 4
Summarization analysis process using the summarise_anova function and its integration with the knitr and rmarkdown packages for generating automatic tables.

Shiny application

The analysis functions were implemented using the AgroR package, with the main experimental designs included in the application subdivided into four sections: descriptive analysis, experimental statistics, joint analysis, and hypothesis testing (Figure 5). Under "experimental statistics," there are options for "experimental sketch," "analysis of variance," "Dunnett," "tabulation examples," "analysis of variance (+1 variable simultaneously)," and "non-parametric analysis." For "joint analysis," there are "experiment similarity" and "joint analysis," and finally, for "hypothesis testing," options for "one mean" and "two means" are available. All analyses are designed to cover various situations that researchers may encounter in agricultural sciences, following a step-by-step approach to guide data interpretation.

In the "descriptive analysis" tab, exploratory measures of position and dispersion, such as mean, median, maximum, minimum, sample variance, sample standard deviation, and coefficient of variation, can be calculated for the dataset as a whole or separated by treatment in CRD, RBD, or LSD. The same measures can also be calculated for factorial arrangements, such as factorial schemes and split-plot designs. An interactive box plot is also provided to help identify possible outliers.
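The measures listed above are straightforward to reproduce in base R, which may help when checking the application's output. The sketch below is an illustration only: the simulated data and the helper name describe are assumptions, not part of AgroR.

```r
# Illustrative descriptive statistics, overall and by treatment; simulated data.
set.seed(3)
dat <- data.frame(
  trat = rep(c("T1", "T2", "T3"), each = 5),
  resp = rnorm(15, mean = rep(c(20, 25, 30), each = 5), sd = 2)
)

# 'describe' is a hypothetical helper returning position and dispersion measures
describe <- function(x) {
  c(mean = mean(x), median = median(x), min = min(x), max = max(x),
    var = var(x), sd = sd(x), cv = 100 * sd(x) / mean(x))
}

overall <- describe(dat$resp)                              # whole dataset
by_trat <- t(sapply(split(dat$resp, dat$trat), describe))  # one row per treatment

print(round(overall, 2))
print(round(by_trat, 2))

# Box plot by treatment to screen for possible outliers
boxplot(resp ~ trat, data = dat)
```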

The "experimental sketch" tab is designed to assist in planning experimental layouts. This tab is based on the sketch function of the AgroR package, where various experimental designs can be drawn, such as single-factor experiments in CRD, RBD, and LSD, as well as factorial arrangements, such as two- and three-factor factorial schemes, split-plots, and split-split plots. Users can insert streets to divide the plots, generate plot identification, specify the positions of the replications, and adjust figure dimensions, label sizes, and other parameters.

Figure 5
Overview of the structure of the AgroR Shiny app and its main implemented functions.

In the "analysis of variance" subsection, the user must define the experimental design (layout), number of factors, presence or absence of a blocking factor, significance level for the ANOVA and complementary tests, and need for data transformation. In the "assumptions" sub-tab, users can select the desired normality and homogeneity tests. In the "complementary tests" sub-tab, the type of treatment, multiple comparison test for qualitative factors, and polynomial degree for quantitative factors can be specified. The last sub-tab, "graphics," allows users to define the graphical parameters. Usage questions can be addressed in the "help" section. Moreover, within the "experimental statistics" tab, users can analyze data using non-parametric tests or perform Dunnett's test to compare a control with other treatments.
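The two kinds of complementary test mentioned here can be sketched with base R: a multiple comparison test when the factor is qualitative and a polynomial regression when it is quantitative. This is an illustration under assumptions (simulated data, Tukey's test, and a comparison between first- and second-degree polynomials), not the code run by the application.

```r
# Illustrative complementary tests after ANOVA; simulated data.
set.seed(11)

# Qualitative factor: Tukey's test at the 5% significance level
qual <- data.frame(
  trat = factor(rep(c("A", "B", "C", "D"), each = 5)),
  resp = rnorm(20, mean = rep(c(10, 10.5, 14, 15), each = 5), sd = 1)
)
fit_qual <- aov(resp ~ trat, data = qual)
tukey <- TukeyHSD(fit_qual, "trat", conf.level = 0.95)
print(tukey)

# Quantitative factor: compare linear and quadratic polynomial fits
quant <- data.frame(dose = rep(c(0, 50, 100, 150), each = 5))
quant$resp <- 10 + 0.08 * quant$dose - 0.0003 * quant$dose^2 + rnorm(20, 0, 1)
fit_lin  <- lm(resp ~ dose, data = quant)
fit_quad <- lm(resp ~ dose + I(dose^2), data = quant)
degree_test <- anova(fit_lin, fit_quad)  # F test for the added quadratic term
print(degree_test)
```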

In the "joint analysis" tab, users can perform the joint analysis of experiments following the procedure described by Barbin (2013) and Ferreira (2018). Provided that the experiments show homogeneity of variances and no significant treatment-by-experiment interactions are detected, the application returns a single analysis; otherwise, it provides individual analyses. The "experiment similarity" tab can be used to identify experiments with similar variability, where users should input the mean square error values for each experiment. In the "hypothesis testing" tab, users can conduct hypothesis tests for one or two populations using parametric or non-parametric and one-tailed or two-tailed tests. These tests are widely used to assess whether treatment responses fall within a predefined range from the literature.
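The hypothesis tests for one or two means correspond to standard procedures available in the stats package. The sketch below shows one-tailed and two-tailed, parametric and non-parametric variants; the simulated samples, the reference value of 48, and the choice of equal variances are assumptions made for illustration.

```r
# Illustrative hypothesis tests for one and two means; simulated data.
set.seed(5)
a <- rnorm(12, mean = 50, sd = 5)
b <- rnorm(12, mean = 55, sd = 5)

# One mean, one-tailed: is the mean of 'a' greater than a reference value of 48?
one_mean <- t.test(a, mu = 48, alternative = "greater")

# Two means, two-tailed, assuming equal variances
two_means <- t.test(a, b, alternative = "two.sided", var.equal = TRUE)

# Non-parametric counterpart for two independent samples
two_means_np <- wilcox.test(a, b, alternative = "two.sided")

print(one_mean$p.value)
print(two_means$p.value)
print(two_means_np$p.value)
```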

AgroR package and AgroR Shiny application in numbers

As of September 23, 2024, the AgroR package has had 30,084 downloads through the CRAN repository, with an average of 1,247 monthly downloads. According to Google Scholar, the package has been cited seven times (https://scholar.google.com/). However, web searches revealed citations in fourteen scientific articles, four doctoral theses, one master's dissertation, six undergraduate theses, three conference publications, and one technical publication. The English version of the AgroR Shiny application has 1,869 users in 52 countries, with 2,457 accesses and an average usage time of 9 minutes and 8 seconds. The AgroR Shiny application in its Portuguese version has 751 users in 28 countries, with 800 accesses and an average usage time of 12 minutes.

Conclusion

The AgroR package and AgroR Shiny application were developed to meet the need for an accessible and comprehensive tool for analyzing agricultural experiments. With an intuitive interface and complete functionality, these tools offer users a simplified solution for performing complex statistical analyses and obtaining accurate results. AgroR and its application are expected to continue to facilitate researchers' work, promoting the use of robust statistical methods in a practical and efficient manner, regardless of technical expertise.

Acknowledgements

The authors thank CNPq (National Council for Scientific and Technological Development) for granting a doctoral scholarship to the first author.

References

  • Arnhold, E. (2013). Pacote em ambiente R para análise de variância e análises complementares. Brazilian Journal of Veterinary Research and Animal Science, 50(6), 488-492. https://doi.org/10.11606/issn.1678-4456.v50i6p488-492
  • Arnhold, E. (2014). Pacote em ambiente R para automatizar estatísticas descritivas. Sigmae, 3(1), 36-42.
  • Bailar, J. (1986). Science, statistics, deception. Annals of Internal Medicine, 104(2), 259-260. https://doi.org/10.7326/0003-4819-104-2-259
  • Barbin, D. (2013). Planejamento e análise estatística de experimentos agronômicos. Editora Mecenas.
  • Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society: Series B (Methodological), 26(2), 211-243. https://doi.org/10.1111/j.2517-6161.1964.tb00553.x
  • Chang, W., Cheng, J., Allaire, J., Xie, Y., & McPherson, J. (2022). Shiny: Web application framework for R. https://cran.r-project.org/web/packages/shiny/index.html
  • Ferreira, P. V. (2018). Estatística experimental aplicada às Ciências Agrárias. UFV.
  • Fox, J. S., Weisberg, D., Adler, D., Bates, G., Baud-Bovy, S., Ellison, S., & Heiberger, R. (2012). Package 'car'. R Foundation for Statistical Computing.
  • Gross, J., & Ligges, U. (2015). Nortest: tests for normality (R package - version 1.0-4). https://CRAN.R-project.org/package=nortest
  • Hothorn, T., Bretz, F., Westfall, P., Heiberger, R. M., Schuetzenmeister, A., & Scheibe, S. (2016). Package 'multcomp'. Simultaneous inference in general parametric models. Project for Statistical Computing.
  • Knief, U., & Forstmeier, W. (2021). Violating the normality assumption may be the lesser of two evils. Behavior Research Methods, 53, 2576-2590. https://doi.org/10.3758/s13428-021-01587-5
  • Kormann, R., Rosa, E. N., Paixão, C. A., Ferreira, E. B., & Nogueira, D. A. (2019). GExpDes: Interface gráfica para o ExpDes. Sigmae, 8(2), 170-179.
  • Lenth, R. (2023). Emmeans: Estimated marginal means, aka least-squares means (R package - version 1.8.5). https://CRAN.R-project.org/package=emmeans
  • Lúcio, A. D., Schwertner, D. V., Haesbaert, F. M., Santos, D., Brunes, R. R., Ribeiro, A. L., & Lopes, S. J. (2012). Violação dos pressupostos do modelo matemático e transformação de dados. Horticultura Brasileira, 30(3), 415-423. https://doi.org/10.1590/S0102-05362012000300010
  • Martin, T. N., & Storck, L. (2008). Análise das pressuposições do modelo matemático em experimentos agrícolas no delineamento blocos ao acaso. In T. N. Martin, & M. F. Ziech (Eds.), Sistemas de produção agropecuária (pp. 177-196). UTFPR.
  • Melo, R. C., Trevisani, N., Santos, M., Guidolin, A. F., & Coimbra, J. L. M. (2020). Statistical model assumptions achieved by linear models: classics and generalized mixed. Revista Ciência Agronômica, 51(1), 1-9. https://doi.org/10.5935/1806-6690.20200015
  • Mendiburu, F., & Simon, R. (2015). Agricolae - Ten years of an open source statistical tool for experiments in breeding, agriculture and biology. PeerJ, 1, 1-18. https://doi.org/10.7287/peerj.preprints.1404v1
  • Nunes, C. A., Alvarenga, V. O., Souza Sant'ana, A., Santos, J. S., & Granato, D. (2015). The use of statistical software in food science and technology: Advantages, limitations and misuses. Food Research International, 75, 270-280. https://doi.org/10.1016/j.foodres.2015.06.011
  • R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing. http://www.R-project.org/
  • Ripley, B., Venables, B., Bates, D. M., Hornik, K., Gebhardt, A., Firth, D., & Ripley, M. B. (2013). Package 'MASS'. Cran R, 538, 113-120.
  • Shaner, G., & Finney, R. (1977). The effect of nitrogen fertilization on the expression of slow-mildewing resistance in Knox wheat. Phytopathology, 67(8), 1051-1056. https://doi.org/10.1094/Phyto-67-1051
  • Wickham, H., & Chang, W. (2017). ggplot2: Create elegant data visualizations using the grammar of graphics (R package - version 2.1). https://CRAN.R-project.org/package=ggplot2
  • Wickham, H., Chang, W., Danenberg, P., & Eugster, M. (2017). roxygen2: In-line documentation for R (R package - version 6.1). https://CRAN.R-project.org/package=roxygen2
  • Xu, W., Li, W., & Song, D. (2013). Testing normality in mixed models using a transformation method. Statistical Papers, 54, 71-84. https://doi.org/10.1007/s00362-011-0411-4
  • Zeileis, A., & Hothorn, T. (2002). Diagnostic checking in regression relationships. R News, 2(3), 7-10.

Publication Dates

  • Publication in this collection
    02 June 2025
  • Date of issue
    2025

History

  • Received
    25 Sept 2024
  • Accepted
    04 Oct 2024
Editora da Universidade Estadual de Maringá - EDUEM Av. Colombo, 5790, bloco 40, 87020-900 - Maringá PR/ Brasil, Tel.: (55 44) 3011-4253, Fax: (55 44) 3011-1392 - Maringá - PR - Brazil
E-mail: actaagron@uem.br