Acessibilidade / Reportar erro

Bivariate Copula-based Linear Mixed-effects Models: An Application to Longitudinal Child Growth Data

ABSTRACT

Multiple longitudinal outcomes are common in public health research and adequate methods are required when there is interest in the joint evolution of response variables over time. However, the main drawback of joint modeling procedures is the requirement to specify the joint density of all outcomes and their correlation structure, as well as numerical difficulties in statistical inference, when the dimension of these outcomes increases. To overcome such difficulty, we present two procedures to deal with multivariate longitudinal data. We first present an univariate approach, for which linear mixed-effects models are considered for each response variable separately. Then, a novel copula-based modeling is presented, in order to characterize relationships among the response variables. Both methodologies are applied to a real Brazilian data set on child growth.

Keywords:
bivariate copula; linear mixed-effects model; longitudinal growth data; time-varying dependence

RESUMO

Estudos longitudinais com múltiplas variáveis respostas são comuns na área de saúde pública e, consequentemente, métodos estatísticos adequados são requeridos quando há interesse em analisar a evolução temporal de uma ou mais variáveis resposta. Contudo, especificar a função de densidade conjunta de todas as variáveis respostas e a estrutura de correlação entre elas, bem como as dificuldades numéricas encontradas na inferência estatística quando a dimensão do problema aumenta, são os principais obstáculos dos procedimentos de modelagem multivariada. Como alternativas, neste artigo apresentamos duas propostas para lidar com dados longitudinais multivariados. Primeiramente, mostramos uma abordagem univariada, com modelos lineares mistos ajustados a cada uma das variáveis respostas separadamente. Em seguida, apresentamos uma modelagem conjunta dessas variáveis, por meio do uso de funções cópula. Ambas as metodologias são aplicadas a um conjunto de dados reais bivariados referentes ao crescimento infantil de crianças brasileiras.

Palavras-chave:
cópula bivariada; modelos lineares mistos; dados longitudinais de crescimento; dependência variante no tempo

1 INTRODUCTION

Longitudinal data or repeated measures data arise when multiple observations are made on the same subject or unit of analysis over time 3333. B.T. West, K.B. Welch & A.T. Galecki. "Linear Mixed Models: A Practical Guide Using Statistical Software". Chapman & Hall, Boca Raton, 2 ed. (2014)..

In practice, it is quite common in clinical trials and social science settings for multiple outcomes to be measured repeatedly within a set of study participants. Some examples are hearing thresholds measured on both ears of a set of participants, HIV studies with CD4 T-cell counts and viral RNA copy numbers are collected longitudinally on each participant, toxicological studies where doses of a toxic agent and some information on its deleterious effect are measured jointly (see 3232. G. Verbeke , S. Fieuws, G. Molenberghs &M. Davidian. The analysis of multivariate longitudinal data: a review. Statistical Methods in Medical Research, 23(1) (2014), 42-59.). Understanding relationships among multivariate outcomes is challenging due to the complex correlated nature of the problem, whilst providing a unique opportunity in studying the joint evolution of multiple response variables over time.

A number of approaches to joint modeling of multivariate longitudinal data have been proposed in the statistical literature (see, e.g., (4, (66. D.R. Cox & N. Wermuth. Response models for mixed binary and quantitative variables. Biometrika, 79(3) (1992), 441-461.), (1212. G. Fitzmaurice, M. Davidian, G. Verbeke & M. Geert. "Longitudinal Data Analysis. Handbooks of Modern Statistical Methods". Chapman & Hall/CRC, New York (2009).), (1717. R.V. Gueorguieva & A. Agresti. A correlated probit model for joint modeling of clustered binary and continuous reponses. Journal of the American Statistical Association, 96(455) (2001), 1102-1112.), (2424. X. Liu, M.J. Daniels & B. Marcus. Joint models for the association of longitudinal binary and continuous processes with application to a smoking cessation trial. Journal of the American Statistical Association , 104(486) (2009), 429-438.). When the joint distribution of the two or more outcomes is known, the statistical inference is relatively straightforward using either marginal or joint approaches.

Another possibility is the use of a conditional model, where the joint likelihood of the two or more responses is factorized. The main drawback of these joint modeling procedures is the requirement to specify the joint density of all outcomes or, at least, the correlation structure of the data, which can lead to parsimony and/or computation (optimization) problems, as well as to numerical difficulties in statistical inference, when the dimension of these outcomes increases.

In a multivariate longitudinal study, the relationship between a response and predictors, or the association between a pair of responses, may change over time. In addition to this, the degree of correlation can be of greater interest to the analyst than relation between covariates and marginal mean of outcomes. When the former is the main objective in the multivariate longitudinal analysis, a third alternative for the analysis is to join a set of marginal distributions using a copula function.

This method separates the multivariate joint distribution into two parts: one describing the interdependency of the probabilities, the other describing the marginal distributions alone. Dependence modeling using copula has become very popular the last years (see, e.g., 2626. R.B. Nelsen. "An introduction to copulas". Springer Series in Statistics. Springer, New York, 2 ed. (2006), 269 pp.), (2020. H. Joe. "Dependence modeling with copulas", volume 134 of Monographs on Statistics and Applied Probability. Chapman & Hall, Boca Raton, FL (2014), 462 pp.). According to 2222. P. Lambert & F. Vandenhende. A copula-based model for multivariate non-normal longitudinal data: Analysis of a dose titration safety study on a new antidepressant. Statistics in Medicine , 21 (2002), 3197-3217.), (1414. E.W. Frees & P. Wang. Copula credibility for aggregate loss models. Insurance: Mathematics & Economics, 38(2) (2006), 360-373.), (3131. J. Sun, E.W. Frees & M.A. Rosenberg. Heavy-tailed longitudinal data modeling using copulas. Insurance Mathematics & Economics, 42 (2008), 817-830. and 88. F. Domma, S. Giordano & P.F. Perri. Statistical modeling of temporal dependence in financial data via a copula function. Comm. Statist. Simulation Comput., 38(3-5) (2009), 703-728., applications to serial dependence in longitudinal data have a huge potential once the marginal distribution of the process at each point in time can be modelled arbitrarily, while dependence over time is captured by a multivariate copula. This kind of approach is not the primary focus of this paper and will be studied further in future work.

In this paper, we present two procedures, in terms of strengths and weaknesses, to deal with multivariate longitudinal data. In particular, there is an interest in revealing time-varying dependence relationships in the child growth patterns using data from a longitudinal study conducted with 150 children measured monthly, from birth to six months of life. To achieve this goal, we exploit the modularity of copula-based modeling. Considering the applied nature of this work, we make sacrifices to balance the interpretability, complexity and computation of the model. Here, weight and height are outcome variables (i.e. the variables of interest) once were measured monthly, from birth to six months of life.

The paper is organized as follows. In Section 2, we present the univariate and bivariate longitudinal models used in this study. Section 3 concerns statistical inference for them. Section 4 provides an application to a real data set (longitudinal child growth data). Finally, Section 5 ends the paper with some final remarks and directions for future work.

2 METHODS

In this section, we present some statistical models that can be used for analyzing multivariate (bivariate, here) longitudinal data. The idea is to show first an univariate approach, where a linear mixed-effects model is considered for each of the two response variables (weight and height) separately, thus ignoring the (possible) dependence between them. After that, the joint modeling of the two outcomes are based on copulas to relate their marginal distributions at each observation time, and also to describe the dependence structure between them. Therefore, the association between the repeated measurements in a given occasion and the temporal evolution of the anthropometric measures of the individuals are two important aspects that a flexible model should be able to describe.

2.1 Univariate Approach

Linear mixed-effects models (also known as multilevel models, hierarchical linear models, random-effects models, among other names), which contain a mixture of fixed effects and random effects, provide a way to deal with longitudinal responses within a subject. The basic idea is to model additional sources of variability in the response by introducing additional terms into the relationship between mean response and explanatory variables, which are random quantities rather than fixed parameters. According to 77. P.J. Diggle, P.J. Heagerty, K.Y. Liang & S.L. Zeger. "Analysis of longitudinal data", volume 25 of Oxford Statistical Science Series. Oxford University Press, Oxford, 2 ed. (2002), 379 pp., these models consist of two parts: one represents intra-individual (or within-subject) change; and the other represents individual differences in these changes (between-subject or inter-individual variation). The inclusion of subject-specific random effect allows for heterogeneity between subjects and also induces within-subject correlation into the model 3232. G. Verbeke , S. Fieuws, G. Molenberghs &M. Davidian. The analysis of multivariate longitudinal data: a review. Statistical Methods in Medical Research, 23(1) (2014), 42-59..

Let Y it be the response of the i-th child at time t, with i=1, 2,..., m and t=0,1,..., ni. In this work, m=150, ni, varies between two and six, and it is considered two response variables: the height (Y 1it ; in centimeters, cm) and the weight (Y 2it ; in grams, g) of the studied children, which were collected monthly from birth (t=0) to six months of life (t=6).

A special case of the mixed-effects model is the random intercept model, which can be viewed as the deviation of the i-th subject-specific of the outcome from the population mean of the response. Although the simplicity of the mixed model with only random intercept is appealing, it poses the restriction that the correlation between the repeated measurements remains constant over time. An extension of this model is a mixed model with random intercept and slope, where an additional random-effects term is included, describing that the rate of change in covariates differs between subjects. In general, the linear mixed-effects model can be expressed as follows:

Y i t = X i t ' β + Z i t ' ω i + ε i t , (2.1)

where Xit=Xit0, Xit1,...,Xitp-1' is a known set of p covariates for the i-th child, associated with the unknown fixed-effects parameters β=β0, β1,..., βp-1'. In our case, the time-fixed (or time-invariant) characteristics that make up the ni×p design matrix Xi=Xi0 Xi1 ... Xip-1 are: vector of ones (Xi0=1, due to the fixed intercept), child’s gender (Xi1; female or male) and total duration of breastfeeding (TDB) (Xi2; in months), in addition to the time-dependent variables: child’s age (Xi3; corresponding month) and child’s hemoglobin level (Xi4; in grams per deciliter, g/dl). Thus, p=5, here (four covariate coefficients - β 1 , β 2 , β 3 , β 4 - plus the fixed intercept - β 0). Moreover, Zit=Zit0, Zit1,...,Zitq-1' is a set of q covariates that characterizes random variation in the response attributable to among-unit sources, and it is associated with the random-effects ωi=ωi0, ωi1,..., ωiq-1', which characterize among-unit variation (specific to child i). Finally, (it represents the random error, which describes variation due to sources, like within-unit fluctuations and measurement error. The usual assumptions are: ω i has a multivariate (q-variate) normal distribution with zero mean vector and covariance matrix Σ, i.e. ωi~Nq0, Σ, and εi=εi1, εi2,..., εini' has a multivariate (n i -variate) normal distribution with zero mean and covariance matrix (diagonal) Ri=σ2Vi, i.e. εi~Nni0, Ri. It is also assumed that random effects and random error are both independent across subjects given the covariates. Furthermore, the most common choice for R i is the model that says variance is the same at all time points for all units, i.e. Ri=σ2 Ini, where Ini stands for the identity matrix of order n i .

It is important to highlight that additional correlation among the errors can be accommodated by allowing for a more general covariance structure in the model. 77. P.J. Diggle, P.J. Heagerty, K.Y. Liang & S.L. Zeger. "Analysis of longitudinal data", volume 25 of Oxford Statistical Science Series. Oxford University Press, Oxford, 2 ed. (2002), 379 pp. suggest an additive decomposition of (i into serially correlated variation and measurement error. It is assumed that εi=ε1i+ε2i, where ε1i is a component of serial correlation, which is usually a decreasing function of the time separation, and ε2i is an extra component of measurement error reflecting variation added by the measurement process itself. According to the same authors, parsimonious choice of the covariance structure can improve the efficiency of inferences made about mean structure and obtain better estimates of standard errors of estimated mean parameters. An additional property of the linear mixed model is related to the covariance of the response profile, which can be described in terms of a set of covariance parameters in both matrix Σ and R i1212. G. Fitzmaurice, M. Davidian, G. Verbeke & M. Geert. "Longitudinal Data Analysis. Handbooks of Modern Statistical Methods". Chapman & Hall/CRC, New York (2009).. The most commonly used covariance structures are shown below:

  • Compound symmetry structure, i.e. the (k, l)-th entry of Ri is σ2ρ1kl , for some ρ1-, 1 , where 1(.) is an indicator function, which in this case takes value 1 when kl and value 0 when k=l ;

  • Autoregressive structure of order 1 (AR(1)), i.e. the (k, l)-th entry of R i is given by σ2ρk-l , for some ρ-1, 1 . This structure is widely used for fitting models to data sets with equally spaced longitudinal observations on the same units of analysis 3333. B.T. West, K.B. Welch & A.T. Galecki. "Linear Mixed Models: A Practical Guide Using Statistical Software". Chapman & Hall, Boca Raton, 2 ed. (2014).;

  • Toeplitz structure, which specifies that covariance depends only on lag, i.e. the (k, l)-th entry of Ri is given by σkl1kl or Ri=σ21k=l otherwise;

  • Exponential decay structure, i.e. the (k, l)-th entry of Ri is given by σ2 exp-k-l/r , where r>0 is the constant range parameter.

The number of parameters of the covariance matrix depends on its structure. This generality, however, brings the obvious disadvantage of having a very large number of parameters 2323. R.C. Littell, J. Pendergast & R. Natarajan. Modelling covariance structure in the analysis of repeated measures data. Statistics in Medicine , 19 (2000), 1793-1819.. In this sense, it is necessary to determine the best structure of the covariance matrix in the data modeling. However, there are no general simple techniques available to compare all these models. 77. P.J. Diggle, P.J. Heagerty, K.Y. Liang & S.L. Zeger. "Analysis of longitudinal data", volume 25 of Oxford Statistical Science Series. Oxford University Press, Oxford, 2 ed. (2002), 379 pp. suggest to use the empirical variagram of the residuals as a tool for the selection of the covariance structure, especially when the data are irregularly spaced and unbalanced. According to 1212. G. Fitzmaurice, M. Davidian, G. Verbeke & M. Geert. "Longitudinal Data Analysis. Handbooks of Modern Statistical Methods". Chapman & Hall/CRC, New York (2009)., when data are highly unbalanced with many repeated measurements per subject, a reasonable strategy is to use a simple covariance structure for (i , once random effects can account for most of the variation in the data.

The simplest cases of linear mixed models are random intercept models, where there is a single normally distributed random effect. Random intercept models are sometimes referred to as variance components, error components, or random-effects models. Thus, in our context, a random intercept model for the response variable height can be expressed as follows:

Y 1 i t = β 10 + β 11 X i t 1 + β 12 X i t 2 + β 13 X i t 3 + β 14 X i t 4 + ω 10 i + ε 1 i t , (2.2)

where Y 1it is the height of the child i at time t, β 10 is the fixed intercept, β 11 is the gender effect, β 12 is the TDB effect, β 13 is the age effect, β 14 is the hemoglobin level effect, ω10i~N 0, σω102 is the random intercept and (1it is the error.

The mixed-effects model with random intercept and slope, for the response variable weight, can be expressed as:

Y 2 i t = β 20 + β 21 X i t 1 + β 22 X i t 2 + β 23 X i t 3 + β 24 X i t 4 + ω 20 i + ω 23 i X i t 3 + ε 2 i t , (2.3)

where Y 2it is the weight of the child i at time t, β 20 is the fixed intercept, β 21 is the gender effect, β 22 is the TDB effect, β 23 is the age effect, β 24 is the hemoglobin level effect, ω20i~N 0, σω202 and ω23i~N 0, σω232 are the (uncorrelated) random effects of the intercept and slope, respectively, and (2it is the error.

In this work, the AR(1) and compound symmetry covariance structures were assumed for the errors ε1i=ε1i1, ε1i2,..., ε1ini' and ε2i=ε2i1, ε2i2,..., ε2ini' of the linear mixed-effects models (2.2) and (2.3), respectively. These choices were made based on the values of the Akaike Information Criterion (AIC) 11. H. Akaike. On entropy maximization principle. In P.R. Krishnaiah (editor), "Applications of Statistics". North-Holland, Amsterdam (1993), pp. 27-41. and through the construction of the semi-variogram 77. P.J. Diggle, P.J. Heagerty, K.Y. Liang & S.L. Zeger. "Analysis of longitudinal data", volume 25 of Oxford Statistical Science Series. Oxford University Press, Oxford, 2 ed. (2002), 379 pp..

2.2 Multivariate Approach

A number of approaches to joint modeling of multivariate longitudinal data have been proposed in the statistical literature (see, e.g., 44. P. Catalano & L. Ryan. Bivariate latent variable models for clustered discrete and continuous outcomes. Journal of the American Statistical Association, 87 (1992), 651-658.), (66. D.R. Cox & N. Wermuth. Response models for mixed binary and quantitative variables. Biometrika, 79(3) (1992), 441-461.), (1212. G. Fitzmaurice, M. Davidian, G. Verbeke & M. Geert. "Longitudinal Data Analysis. Handbooks of Modern Statistical Methods". Chapman & Hall/CRC, New York (2009).), (1717. R.V. Gueorguieva & A. Agresti. A correlated probit model for joint modeling of clustered binary and continuous reponses. Journal of the American Statistical Association, 96(455) (2001), 1102-1112.), (2424. X. Liu, M.J. Daniels & B. Marcus. Joint models for the association of longitudinal binary and continuous processes with application to a smoking cessation trial. Journal of the American Statistical Association , 104(486) (2009), 429-438.. The analysis can be demanding because of the existence of correlations between multiple time-dependent responses repeated over time.

As an alternative approach, copulas have been widely used when the interest resides in modeling the dependence structure among variables. In practice, copulas have shown to be a good option when the assumption of multivariate normality of the data is doubtful. Moreover, they can capture nonlinear dependence and tail dependence, and have no constraints on the marginal distributions of random variables (great flexibility). Thus, copulas have been successfully applied in the areas of finance, actuarial science and biomedical studies 2020. H. Joe. "Dependence modeling with copulas", volume 134 of Monographs on Statistics and Applied Probability. Chapman & Hall, Boca Raton, FL (2014), 462 pp., as well as in engineering, in multivariate process control and hydrological modeling 3434. J. Yan. Multivariate modeling with copulas and engineering applications. In H. Pham (editor), "Handbook in Engineering Statistics". Springer-Verlag, New York (2006), pp. 973-990.), (1616. C. Genest & A.C. Favre. Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering, 12 (2007), 347-368.. In our work, the copula-based approach allows us to model dependence between the height and weight of 150 children in the study.

In probability theory and statistics, a copula C is a multivariate distribution function whose marginal distribution functions have their domains belonging to the real range [0, 1]. In the bivariate context, we have Cu, v=PUu, Vv, where U and V are both random variables uniformly distributed over the (0, 1) interval (i.e. U, V~U0, 1)) and can be originated from transformations (probability integral transforms) of any continuous random variables, say X and Y. In our paper, we have considered that X represents the heights of the children, while Y represents their weights, where X and Y have distribution functions F(x) and G(y), respectively.

Copulas can be formally defined as follows.

Definition 1 (Copula). A bivariate copula is a function C : 0 , 1 2 0 , 1 satisfying the following requirements:

  • Grounded: C u , 0 = C 0 , v = 0 ;

  • Uniform marginals: C u , 1 = u and C 1 , v = v ;

  • 2-increasing: C u 2 , v 2 - C u 2 , v 1 - C u 1 , v 2 + C u 1 , v 1 0 for all u 2 > u 1 and v 2 > v 1 .

Sklar’s theorem is the most important result about copulas. The bivariate version of this theorem is presented below.

Theorem 1.3030. A. Sklar. Fonctions de répartition à n dimensions et leurs marges. Publications de l'Institut de Statistique de l'Université de Paris, 8 (1959), 229-231.Let X and Y be two random variables with distribution functions F(x) and G(y), respectively, and joint distribution function H(x, y). Then, there exists a copula C such that, for allx, y, satisfiesHx, y=CFx, Gy. If F and G are continuous, then C is unique. The theorem above leads us to conclude that any bivariate distribution function can be constructed through the combination of its marginals by means of a copula 2626. R.B. Nelsen. "An introduction to copulas". Springer Series in Statistics. Springer, New York, 2 ed. (2006), 269 pp..

In Table 1, we can see the different copula functions, C(u, v), and their densities, c(u, v), which are used in this paper, as well as the copula association parameter (and its possible values), denoted by θ. For instance, Table 1 shows the bivariate Gaussian copula and Student’s t v copula, as well as their density functions, where Φ-1· and Tv-1· denote the quantile functions of the standard normal and Student’s t distribution with ν degrees of freedom, respectively, and Γ(.) is the gamma function.

Table 1:
Some copula functions, their densities and parameter space.

It is important to mention that the copula captures all the dependence information between the variables. The copula-based dependence measures have been studied for decades. E.g., 2929. B. Schweizer & E.F. Wolff. On nonparametric measures of dependence for random variables. Ann. Statist. , 9(4) (1981), 879-885. proved many mathematical properties for some copula-based dependence measures.

Another important definition is given below. It shows us how the copula densities presented in Table 1 (third column) were obtained.

Definition 2 (Copula density). If C u , v = 0 u 0 v 2 C s , t s t d t d s , for all u , v 0 , 1 2 , then C is said to be absolutely continuous and the copula density c can be defined by c u , v = 2 C u , v u v .

Linear correlation (or Pearson’s product-moment correlation) coefficient is most frequently used in practice as a measure of dependence. However, in general, it is not possible to construct a joint distribution of the margins with arbitrary linear correlation coefficient, once this measure does not completely determine the joint distribution. According to 1919. A. Ida, N. Ishimura & M. Nakamura. Note on the measures of dependence in terms of copulas. Procedia Economics and Finance, 14 (2014), 273-279., the population versions of Kendall’s tau (τ k ) and Spearman’s rho (ρ s ) can be represented in terms of copulas. Thus, we have chosen for this work measures of dependence that are copula-based. To illustrate measures of a form of dependence known as concordance, that are invariant under strictly monotone transformations of the random variables, let us consider the following definition.

Definition 3 (Concordance).(i) Two observations (x 1 , y 1) and (x 2 , y 2) are concordant ifx1x2andy1y2or ifx1x2andy1y2. An equivalent characterization isx1-x2y1-y2>0. The observations(x 1 , y 1) and (x 2 , y 2) are said to be discordant ifx1-x2y1-y2<0.

(ii) If C 1 and C 2 are copulas, we say that C 1 is less concordant than C 2 (or C 2 is more concordant than C 1 ) if C 1 u , v C 2 u , v for all u , v 0 , 1 2 .

Measures of concordance assume the maximum (minimum) value +1 (-1) if the support of the joint distribution function of X and Y contains only concordant (discordant) pairs. The measures Kendall’s tau and Spearman’s rho have been chosen in this work, and they can be expressed in terms of copulas as τK=τKX, Y=4 0,12Cu, vdCu, v-1 and ρS=ρSX, Y=12 0,12Cu, vdudv-3, respectively. Their expressions, corresponding to each copula model chosen here, can be observed in Table 2.

Table 2:
Some copula models and their measures of dependence.

Another important class of dependence measures in nonlinear context is tail dependence. According to 1010. P. Embrechts, A.J. McNeil & D. Straumann. Correlation and dependence in risk management: properties and pitfalls. In "Risk management: value at risk and beyond". Cambridge Univ. Press, Cambridge (2002), pp. 176-223., the concept of tail dependence relates to the amount of dependence in the right upper quadrant or left lower quadrant of a bivariate distribution, and it is relevant for the study of dependence between extreme values. The same authors emphasize that the tail dependence between two continuous random variables is a copula property and the amount of tail dependence is invariant under strictly increasing transformations. The definition of tail dependence is shown below.

Definition 4 (Tail dependence). Let C be the copula of X and Y.

The lower tail dependence parameter is λ L = λ L X , Y = lim t 0 + P Y G - 1 t | X F - 1 t = lim t 0 + C t , t t .

The upper tail dependence parameter is λ U = λ U X , Y = lim t 1 - P Y > G - 1 t | X > F - 1 t = lim t 1 - 1 - 2 t + C t , t 1 - t .

The expressions for the tail dependence parameters (or coefficients), corresponding to each copula model, can be seen in Table 2.

One class of multivariate distributions that enable modeling of extremes and other forms of non-normal dependence is the elliptical distributions. Further details on elliptical distributions can be found in 1111. K.T. Fang, S. Kotz & K.W. Ng. "Symmetric multivariate and related distributions", volume 36 of Monographs on Statistics and Applied Probability. Chapman and Hall, Ltd., London (1990), 220 pp.. Elliptical copulas, which are of great interest here, are simply the copulas of elliptical distributions and provide a rich source of multivariate distributions that share many of the tractable properties of the multivariate normal (or Student’s t) distribution. Among the elliptical copulas, we highlight the bivariate Gaussian and Student’s t copulas, whose main characteristics are presented in Tables 1 and 2. Notice from Table 2 that the Gaussian copula can not accommodate tail dependence, while the Student’s t copula has lower and upper tail dependence of the same magnitude.

Another important and interesting parametric family of copulas is Archimedean. This class allows a great variety of dependence structures. Furthermore, in contrast to elliptical copulas, all commonly encountered Archimedean copulas have closed-form expressions 1010. P. Embrechts, A.J. McNeil & D. Straumann. Correlation and dependence in risk management: properties and pitfalls. In "Risk management: value at risk and beyond". Cambridge Univ. Press, Cambridge (2002), pp. 176-223.. The main bivariate Archimedean copulas (Clayton copula 55. D.G. Clayton. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika, 65(1) (1978), 141-151., Frank copula 1313. M.J. Frank. On the simultaneous associativity of F(x; y) and x + y - F(x; y). Aequationes Mathematicae, 19(2-3) (1979), 194-226. and Gumbel copula 1818. E.J. Gumbel. Bivariate exponential distributions. Journal of the American Statistical Association , 55 (1960), 698-707.) are shown in Table 1. The Kendall’s tau and Spearman’s rho measures for these Archimedean copulas are presented in Table 2, except for the Clayton and Gumbel copulas, which have no closed-form expressions for the Spearman’s rho. Note from Table 2 that the Gumbel copula is able to model upper tail dependence, whereas the Clayton copula can model lower tail dependence and the Frank copula does not accommodate tail dependence.

The parametric families of copula that were presented here, are important when selecting a good candidate copula model that summarizes the dependence structure between the height and weight of 150 studied children, according to the chosen measures of concordance.

3 INFERENCE

In this section, we discuss inference (point and interval estimation) for the parameters of the proposed univariate and bivariate longitudinal data models.

3.1 Point Estimation

The likelihood function for the linear mixed-effects model (2.1) is given by

L δ | y = i = 1 m f y i | ω i , β , R i , Σ f * ω i | Σ d ω i , (3.1)

where δ=β, R, Σ is the set of parameters to be estimated, with R being the i=1m ni×i=1m ni block diagonal matrix whose elements are R i , for i=1, 2,..., m; y=y11,..., ymnm is the i=1m ni×1 vector of all observed responses, with yi=yi1,..., yini' being the ni×1 vector of observed responses for child i; fyi|ωi, β, Ri, Σ represents the conditional density of response vector Y i at point y i given ω i ; and f*ωi|Σ is the joint density of ω i . It follows from Section 2.1 that Yi|ωi~NniXiβ+Ziωi, Ri and ωi~Nq0, Σ, where Z i is the ni×q matrix defined as Zi=Zi0 Zi1 ... Ziq-1. Notice that (3.1) is the marginal density function of y, which is obtained by integrating the joint density of y and ω over ω, where ω=ω1',..., ωm''.

In order to find the maximum likelihood estimates (MLEs) for the parameters of model (2.1), we set the score function to zero (this function is defined as the first order partial derivative of the logarithm of likelihood function (3.1) with respect to δ). However, it is not always possible to find closed-form expressions for these estimators. Therefore, the use of iterative methods is often needed.

Nevertheless, the usual method of maximum likelihood estimation provides biased estimates for the variance components, in the presence of random effects. An alternative is to use the restricted maximum likelihood (RML) estimation method (sometimes called residual maximum likelihood estimation). According to 77. P.J. Diggle, P.J. Heagerty, K.Y. Liang & S.L. Zeger. "Analysis of longitudinal data", volume 25 of Oxford Statistical Science Series. Oxford University Press, Oxford, 2 ed. (2002), 379 pp., the RML method is used for unbiased estimation of the variance components in a generalized linear mixed model. This approach consists of dividing the observations into two independent parts: one refers to the fixed effects and the other to the random effects, such that the observed probability density function is obtained by summing those parts. In this work, we perform the RML estimation method for the proposed linear mixed-effects models (2.2) and (2.3), by using the lme function from the nlme package of R software 2727. R Core Team. R: A language and environment for statistical computing (version 3.3.2). R Foundation for Statistical Computing, Vienna, Austria (2016). URL https://www.R-project.org/.
https://www.R-project.org/...
.

The log-likelihood function for the bivariate copula-based linear mixed-effects models, on the other hand, is given by the composition of Eq. (3.1) and the third column of Table 1, as follows:

l η = i = 1 m t = 1 n i log c u i t , v i t | θ + j = 1 2 log L j δ j | y j , (3.2)

where η=δ1, δ2, θ is the set of all model parameters, δj=βj, Rj, Σj is the margin j’s parameter set (here, j=1 refers to the linear mixed-effects model (2.2), while j=2 corresponds to the linear mixed-effects model (2.3)), yj=yj11,..., yjmnm is the margin j’s i=1m ni×1 vector of all observed responses, uit=Fy1it|δ1 and vit=Gy2it|δ2 represent the cdf for the linear mixed-effects models (2.2) and (2.3), respectively. Here, R 1 has the AR(1) structure and R 2 exhibits the compound symmetry pattern (see Section 2.1). Moreover, Lj=δj|yj denotes the likelihood function (3.1) for margin j’s model.

Regarding estimation of parameters in copula-based models, we can use a fully parametric estimation method called Inference Function for Margins (IFM) by 2121. H. Joe & J. Xu. The estimation method of inference functions for margins for multivariate models. Technical Report 166, Department of Statistics, University of British Columbia (1996).. In their work, the authors highlight some advantages of this two-stage maximum likelihood estimation approach: (i) the IFM method is very useful for many multivariate models computationally unfeasible (at first glance); (ii) it allows one to make inference and modeling, starting with univariate and lower-dimensional margins; (iii) there is some robustness against misspecification of the dependence structure, as well as more robustness against outliers or perturbations of the data, compared with the maximum likelihood method; (iv) sparse multivariate data can create problems for the maximum likelihood method, but the IFM method avoids the sparseness problem to a certain degree, especially if all parameters can be estimated from univariate and bivariate likelihoods (this can be a major advantage in a smaller sample situation).

In the first step, the IFM method estimates the marginal parameters δj , j=1, 2, through

δ ^ j = arg max δ j log L j δ j | y j .

Then, we obtain u^it=Fy1it|δ^1 and v^it=Gy2it|δ^2, in order to estimate the association parameter θ by using the pseudo log-likelihood as follows:

θ ^ = arg max θ i = 1 m t = 1 n i log c u ^ i t , v ^ i t | θ .

Moreover, we consider time-varying bivariate copula-based linear models for fitting the child growth data in the sense that the copula association parameter is not constant, i.e. it varies with time. The reason for this is that the association (or dependence) between the two variables of interest (height and weight) may vary across time, as well as the covariate effects on them and variance components. For further details on time-varying copulas, we refer the reader e.g. to the survey paper by 2525. H. Manner & O. Reznikova. A survey on time-varying copulas: specification, simulations, and application. Econometric Reviews, 31(6) (2012), 654-687..

The log-likelihood function for the time-varying bivariate copula-based linear models can be written as follows:

l ξ = i = 1 m t = 1 n i log c u i t , v i t | θ t + i = 1 m t = 1 n i j = 1 2 log f j t y j i t | δ j t , (3.3)

where ξ=δ1', δ2', θ'' is the set (vector) of all model parameters; δj=δj1',..., δjni'', with δjt=βjt', σjt2', is the margin j’s parameter vector; β jt and σjt2 are, respectively, the vector of regression coefficients and the variance for margin j at time point t (time-varying coefficients and variance); θ=θ1,..., θni' is the vector of copula parameters (time-varying association parameter); fjtyjit|δjt represents the density of response Y jit at point y jit ; finally, uit=Fty1it|δ1t and vit=Gty2it|δ2t represent, respectively, the cdf for the linear models in margins 1 and 2 at time point t. Note that, in this case, we fit a linear model to each margin at each time point, i.e. we consider the (classical) multiple linear regression model: Yjit=Xjit'βjt+εjit, where εjit~N0, σjt2, for j=1, 2, i=1,..., m and t=1,..., ni. This implies that Yjit~NXjit'βjt, σjt2.

For model estimation, the log-likelihood function form given by (3.3) also enables the use of the IFM method, which estimates the marginal parameters δjt , for j=1, 2 and t=1,..., ni, at a first step through

δ ^ j t = arg max δ j t i = 1 m log f j t y j i t | δ j t

and then estimates the association parameters θt , t=1,..., ni, given δ^jt by

θ ^ t = arg max θ t i = 1 m log c F t y 1 i t | δ ^ 1 t , G t y 2 i t | δ ^ 2 t | θ t .

We perform the IFM estimation method for the proposed bivariate copula-based linear mixed-effects models and time-varying bivariate copula-based linear models, by using the optim function (method “L-BFGS-B” by 22. R.H. Byrd, P. Lu, J. Nocedal & C.Y. Zhu. A limited memory lgorithm for bound constrained optimization. SIAM J. Sci. Comput., 16(5) (1995), 1190-1208.) in R.

3.2 Interval Estimation

There are no analytical results readily available for obtaining the standard errors of the IFM estimates for the parameters of the bivariate copula-based linear mixed-effects models and time-varying bivariate copula-based linear models proposed in this paper. We then resorted to using a bootstrap resampling procedure in order to obtain confidence intervals for the models’ parameters. However, the basic nonparametric bootstrap would not perform well here (mainly for the case of bivariate copula-based linear mixed-effects models) because of the increased dependence due to the presence of repeated observations in each of the resampled datasets. We chose, therefore, to run a parametric bootstrap procedure, which samples from the model using the (point) IFM estimates as the true values of the parameters. We performed a total of 1000 bootstrap samples for each proposed model. Then, we built 95% bootstrap confidence intervals by using the percentile method by 99. B. Efron & R.J. Tibshirani. "An introduction to the bootstrap", volume 57 of Monographs on Statistics and Applied Probability. Chapman and Hall, New York (1993), 436 pp.. Besides its simplicity, which is the main attraction of this bootstrap method according to 33. J. Carpenter & J. Bithell. Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Statistics in Medicine, 19 (2000), 1141-1164., no invalid parameter values can be included in the obtained intervals.

4 APPLICATION TO LONGITUDINAL CHILD GROWTH DATA

In this section, we show first the main results of an exploratory analysis of the child growth data (Section 4.1), which were drawn from a longitudinal study that was conducted with 150 children born in a maternity hospital in the municipality of Mutuípe, Bahia, from June 2005 to May 2006. Then, we present and discuss the main estimation results obtained via the univariate approach, i.e. with the separate linear mixed-effects models’ fittings (Section 4.2), and also by the bivariate approach, i.e. through the proposed copula-based modelings (Section 4.3).

4.1 Descriptive Analysis

The children were measured monthly, from birth to six months of life. Information on weight (in g), height (in cm), hemoglobin level in the blood (in g/dl) and TDB (in months) were collected repeatedly. At first, an exploratory data analysis was performed to verify whether the children’s height and weight are changing in a similar or different fashion. Thus, we observe, among others, that the average birth height was 48.3 cm, ranging from 42.1 cm to 52.5 cm, and the average birth weight was 3181.0 g, ranging from 2230.0 g to 4500.0 g. Moreover, in the last month of the study, about 25% of the children measured up to 64.1 cm and weighed up to 6909.0 g. The average follow-up time was 3.8 months, ranging from 2 to 6 months.

In studies with two longitudinal outcomes, it is useful to understand the strength of the relationship between them and the pattern of correlation across time.

Figure 1 shows the correlation between the children’s height and weight over the 6 months period (corrgram). Among others, it suggests a joint analysis of both longitudinal outcomes (which are performed and discussed in Section 4.3), in order to understand growth and weight gain patterns over time.

Figure 1:
Corrgram for child growth data, where w t and h t denote, respectively, the weight and height at time t, t=0, 1,..., 6.

As mentioned in Section 2.1, the choice of covariance matrices of the models was made on the basis of AIC values and by the construction of semi-variograms. Figure 2, left and right panels, show the associations between repeated observations, at time intervals, for height and weight, respectively. The horizontal line represents the estimation of the process variance. As can be seen from the right panel of Figure 2, the correlation decreases throughout the period considered, whereas from the left panel of this figure, we observe that the correlation first decreases, but it increases again in the last time points.

Figure 2:
Semi-variogram for height and weight data.

Furthermore, graphical methods can be used to explore the magnitude of between-person variability in outcomes over time. For instance, individual line plots for each study participant allow inspection of the individual response patterns and whether there is strong heterogeneity in the trajectories. Figure 3 shows the individual profile of the children’s height (left panel) and weight (right panel). The black solid lines represent the mean response profile for height and weight, against time/age (trend lines). Notice that, for both longitudinal outcomes, there exist a lower variability at the beginning and a greater variability at the end of the study (after some months of life), which is most notable for the weight variable. In addition to this, there is a greater within-person variability in the weight profile than in the height one, since the change in each individual line (slope) is more evident for the former than the latter. Thus, this descriptive analysis was essential for guiding our modeling strategy, indicating different trajectories with possibly different slopes for the weight outcome.

Figure 3:
Longitudinal distribution of children’s height and weight. Mutuípe, Bahia, 2005-2006.

4.2 Separate (Univariate) Linear Mixed-Effects Models

Table 3 shows the main estimation results for the linear mixed-effects models (2.2) and (2.3) when fitted separately (univariate analysis). It can be seen that the baby girls’ growth is lower than that of baby boys (on average, 1.439 cm smaller). There is also a significant decrease of the hemoglobin level in the mean height corresponding to an increase in age (on average, 0.396 cm smaller). Finally, TDB does not contribute significantly to the growth pattern across time (p-value >0.05). The estimation results for model (2.3) are similar to those for model (2.2), in terms of the contribution and direction of the effect of each covariate. The only exception is the positive effect of TDB (at 5% level of significance), revealing a significant increase of the TDB in the mean weight corresponding to an increase in age (on average, 59.228 g heavier). A random intercept for the growth pattern means that there is some average height in the population. Thus, the covariance structure considered in this analysis was constant and positive over time (compound symmetry or exchangeable correlation structure). For the weight model, the random intercept and age mean that the individual’s weight development varies not only in its initial stage, but its rate of change also varies because there are both intra- and inter-individual differences. The AIC and BIC (Bayesian Information Criterion; see 2828. G. Schwarz. Estimating the dimension of a model. Ann. Statist., 6(2) (1978), 461-464.) values for each fitted model are also presented in Table 3. The overall AIC and BIC values, which are used for comparing the results from the univariate approach with the ones obtained from the bivariate approach (see Section 4.3), are calculated by simply summing their individual values, i.e. AICTotal=12140.400+2924.868=15065.268 and BICTotal=12182.610+2962.385=15144.995.

Table 3:
RML estimation results for the (univariate) linear mixed-effects models (2.2) and (2.3), fitted to the longitudinal child growth data. SE - standard error.

We assessed the models’ adequacy by analyzing the conditional residuals of each fitted model. The conditional residual is given by the difference between the observed data and the predicted value of the observation, i.e. eit=Yit-Xit'β^-Zit'ω^i, for i=1,... m and t=1,... ni, where β^ and ω^i are, respectively, the RML estimates of β and ω i . Figures 4 and 5 show diagnostic plots (normal quantile plots of the standardized conditional residuals with simulated 95% confidence envelopes, and scatter-plots of the standardized conditional residuals versus fitted values) for models (2.2) and (2.3), respectively. The standardized conditional residual is defined as eit*=eit / r^it, where r^it is the (t, t)-th entry of R^i, which is the RML estimate of R i . These figures indicate the suitability of the proposed linear mixed-effects models for the height and weight outcomes, since we observe that the normality assumption of standardized conditional residuals is valid (see Figures 4 and 5’s left panels) and, in general, there are no standardized conditional residuals with high values (see Figures 4 and 5’s right panels).

Figure 4:
Diagnostic plots for the height model.

Figure 5:
Diagnostic plots for the weight model.

4.3 Bivariate Copula-based Longitudinal Data Models

1515. C. Genest & J.C. Boies. Detecting dependence with Kendall plots. The American Statistician, 57(4) (2003), 275-284. proposed a graphic tool, called Kendall’s plot (or K-plot), for detecting dependence in multivariate data. Figure 6 shows the Kendall’s plot, per month, built from the longitudinal child growth data, which indicates a considerable positive (and of approximately the same magnitude) association between the height and weight outcomes.

Figure 6:
The Kendall’s plot, per month, corresponding to the bivariate data at hand.

Among the bivariate copula models that allow for positive association between the margins, we highlight the Gaussian, Student’s t with arbitrary degrees of freedom (e.g. ν=2), Clayton, Frank and Gumbel copulas. Table 4 exhibits the estimation results for these bivariate copula-based linear mixed-effects models fitted to the longitudinal Brazilian child growth data. The results include the 95% percentile bootstrap confidence intervals for the copula association parameter (θ), the corresponding measures of dependence (Kendall’s tau, Spearman’s rho, lower and upper tail dependence coefficients), and the AIC and BIC values. Here, we focus on the copula association parameter estimate (θ^), since the marginal parameter estimates are the same presented in Table 3 (we employed the IFM estimation method described in Section 3.1). It can be clearly seen from Tables 3 and 4 that the bivariate copula-based approach overcomes the univariate one (all the copula-based models have smaller AIC and BIC values than the univariate models), justifying joint estimation of the height and weight’s regression models through copulas to improve statistical efficiency. Table 4 also shows us that the bivariate linear mixed-effects model based on Gumbel copula was selected as the best one, according to both considered criteria (it has the smallest values on these criteria: AIC=14429.37 and BIC=14513.89). For this best model, the estimated dependence measures (τ^K=0.789 and ρ^S=0.937) reveal a strong positive relationship between the margins (height and weight’s regression models), and the estimated coefficient of tail dependence (λ^U=0.842) shows the strongest positive relationship at the upper tail of the joint distribution, i.e. for high heights and weights.

Table 4:
IFM estimation results for the bivariate copula-based linear mixed-effects models fitted to the data at hand, with a focus on the copula association parameter. LL - loglikelihood.

Table 5 shows the IFM estimation results for the time-varying bivariate copula-based linear models’ association parameters (θ0, θ1,..., θ6), as well as their 95% percentile bootstrap confidence intervals (the results for the time-varying coefficients and standard deviation are not shown here). The estimated dependence measures for each of these models are presented in Table 6. It can be seen that the best bivariate model is the one based on Gaussian copula, which obtained the lowest AIC and BIC values (also presented in Table 5). Note, however, that it does not surpass the bivariate Gumbel copula-based linear mixed-effects model (with constant association parameter θ) selected before (see Table 4), according to both considered criteria. The reason for this may also be due to the fact that there is an intersection among the 95% percentile intervals for the time-varying bivariate Gaussian copula-based linear model’s association parameters, indicating no gain in considering/assuming non-constant dependence between the children’s height and weight over time. Furthermore, notice that, according to these criteria, none of the time-varying bivariate copula-based linear models overcome the linear mixed-effects models based on bivariate copulas with constant association parameter θ (see Tables 4 and 5).

Table 5:
IFM estimates and 95% percentile bootstrap confidence intervals for the time-varying bivariate copula-based linear models’ association parameters.

Table 6:
Estimated dependence measures for the time-varying bivariate copula-based linear models.

5 FINAL REMARKS AND FURTHER RESEARCH

Linear mixed-effects models are very popular in practice, since they are easy to handle and interpret. However, in this work such models have shown to be of limited use because of the impossibility of jointly modeling children’s height and weight over time, also taking into account the correlation between the response variables.

There are also, on the other hand, some limitations to the use of copula models. The added complexity may render the estimation procedure more cumbersome, resulting in increased difficulty in finding convergent estimates. Moreover, although there are specific uses and guidelines for choosing between different types of copula models, an extra step of selecting between candidate models is made necessary when using the proposed approach.

Future research may focus on developing copula models even further. Novel strategies that incorporate cross-timed dependencies can be devised, in an attempt to model the dependence all past values of one variable has on the other variable, for instance. A regression model on the copula parameter may also provide further insight into the relationship within the data, providing information on how the association between variables is influenced by the covariates. In addition to this, other challenge to model dependence structure would be how to handle missing values without being forced to discard data.

ACKNOWLEDGMENTS

The authors would like to thank the Editorial Board and the reviewers for their valuable comments and suggestions which helped to improve the manuscript.

REFERENCES

  • 1
    H. Akaike. On entropy maximization principle. In P.R. Krishnaiah (editor), "Applications of Statistics". North-Holland, Amsterdam (1993), pp. 27-41.
  • 2
    R.H. Byrd, P. Lu, J. Nocedal & C.Y. Zhu. A limited memory lgorithm for bound constrained optimization. SIAM J. Sci. Comput., 16(5) (1995), 1190-1208.
  • 3
    J. Carpenter & J. Bithell. Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Statistics in Medicine, 19 (2000), 1141-1164.
  • 4
    P. Catalano & L. Ryan. Bivariate latent variable models for clustered discrete and continuous outcomes. Journal of the American Statistical Association, 87 (1992), 651-658.
  • 5
    D.G. Clayton. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika, 65(1) (1978), 141-151.
  • 6
    D.R. Cox & N. Wermuth. Response models for mixed binary and quantitative variables. Biometrika, 79(3) (1992), 441-461.
  • 7
    P.J. Diggle, P.J. Heagerty, K.Y. Liang & S.L. Zeger. "Analysis of longitudinal data", volume 25 of Oxford Statistical Science Series. Oxford University Press, Oxford, 2 ed. (2002), 379 pp.
  • 8
    F. Domma, S. Giordano & P.F. Perri. Statistical modeling of temporal dependence in financial data via a copula function. Comm. Statist. Simulation Comput., 38(3-5) (2009), 703-728.
  • 9
    B. Efron & R.J. Tibshirani. "An introduction to the bootstrap", volume 57 of Monographs on Statistics and Applied Probability. Chapman and Hall, New York (1993), 436 pp.
  • 10
    P. Embrechts, A.J. McNeil & D. Straumann. Correlation and dependence in risk management: properties and pitfalls. In "Risk management: value at risk and beyond". Cambridge Univ. Press, Cambridge (2002), pp. 176-223.
  • 11
    K.T. Fang, S. Kotz & K.W. Ng. "Symmetric multivariate and related distributions", volume 36 of Monographs on Statistics and Applied Probability. Chapman and Hall, Ltd., London (1990), 220 pp.
  • 12
    G. Fitzmaurice, M. Davidian, G. Verbeke & M. Geert. "Longitudinal Data Analysis. Handbooks of Modern Statistical Methods". Chapman & Hall/CRC, New York (2009).
  • 13
    M.J. Frank. On the simultaneous associativity of F(x; y) and x + y - F(x; y). Aequationes Mathematicae, 19(2-3) (1979), 194-226.
  • 14
    E.W. Frees & P. Wang. Copula credibility for aggregate loss models. Insurance: Mathematics & Economics, 38(2) (2006), 360-373.
  • 15
    C. Genest & J.C. Boies. Detecting dependence with Kendall plots. The American Statistician, 57(4) (2003), 275-284.
  • 16
    C. Genest & A.C. Favre. Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering, 12 (2007), 347-368.
  • 17
    R.V. Gueorguieva & A. Agresti. A correlated probit model for joint modeling of clustered binary and continuous reponses. Journal of the American Statistical Association, 96(455) (2001), 1102-1112.
  • 18
    E.J. Gumbel. Bivariate exponential distributions. Journal of the American Statistical Association , 55 (1960), 698-707.
  • 19
    A. Ida, N. Ishimura & M. Nakamura. Note on the measures of dependence in terms of copulas. Procedia Economics and Finance, 14 (2014), 273-279.
  • 20
    H. Joe. "Dependence modeling with copulas", volume 134 of Monographs on Statistics and Applied Probability. Chapman & Hall, Boca Raton, FL (2014), 462 pp.
  • 21
    H. Joe & J. Xu. The estimation method of inference functions for margins for multivariate models. Technical Report 166, Department of Statistics, University of British Columbia (1996).
  • 22
    P. Lambert & F. Vandenhende. A copula-based model for multivariate non-normal longitudinal data: Analysis of a dose titration safety study on a new antidepressant. Statistics in Medicine , 21 (2002), 3197-3217.
  • 23
    R.C. Littell, J. Pendergast & R. Natarajan. Modelling covariance structure in the analysis of repeated measures data. Statistics in Medicine , 19 (2000), 1793-1819.
  • 24
    X. Liu, M.J. Daniels & B. Marcus. Joint models for the association of longitudinal binary and continuous processes with application to a smoking cessation trial. Journal of the American Statistical Association , 104(486) (2009), 429-438.
  • 25
    H. Manner & O. Reznikova. A survey on time-varying copulas: specification, simulations, and application. Econometric Reviews, 31(6) (2012), 654-687.
  • 26
    R.B. Nelsen. "An introduction to copulas". Springer Series in Statistics. Springer, New York, 2 ed. (2006), 269 pp.
  • 27
    R Core Team. R: A language and environment for statistical computing (version 3.3.2). R Foundation for Statistical Computing, Vienna, Austria (2016). URL https://www.R-project.org/
    » https://www.R-project.org/
  • 28
    G. Schwarz. Estimating the dimension of a model. Ann. Statist., 6(2) (1978), 461-464.
  • 29
    B. Schweizer & E.F. Wolff. On nonparametric measures of dependence for random variables. Ann. Statist. , 9(4) (1981), 879-885.
  • 30
    A. Sklar. Fonctions de répartition à n dimensions et leurs marges. Publications de l'Institut de Statistique de l'Université de Paris, 8 (1959), 229-231.
  • 31
    J. Sun, E.W. Frees & M.A. Rosenberg. Heavy-tailed longitudinal data modeling using copulas. Insurance Mathematics & Economics, 42 (2008), 817-830.
  • 32
    G. Verbeke , S. Fieuws, G. Molenberghs &M. Davidian. The analysis of multivariate longitudinal data: a review. Statistical Methods in Medical Research, 23(1) (2014), 42-59.
  • 33
    B.T. West, K.B. Welch & A.T. Galecki. "Linear Mixed Models: A Practical Guide Using Statistical Software". Chapman & Hall, Boca Raton, 2 ed. (2014).
  • 34
    J. Yan. Multivariate modeling with copulas and engineering applications. In H. Pham (editor), "Handbook in Engineering Statistics". Springer-Verlag, New York (2006), pp. 973-990.

Publication Dates

  • Publication in this collection
    30 May 2019
  • Date of issue
    Jan-Apr 2019

History

  • Received
    01 Jan 2018
  • Accepted
    25 Sept 2018
Sociedade Brasileira de Matemática Aplicada e Computacional Rua Maestro João Seppe, nº. 900, 16º. andar - Sala 163 , 13561-120 São Carlos - SP, Tel. / Fax: (55 16) 3412-9752 - São Carlos - SP - Brazil
E-mail: sbmac@sbmac.org.br