Acessibilidade / Reportar erro

The Asymmetric Power-Student-t Model for Censored and Truncated Data

Abstract

In this paper, we propose the power Student-t regression model for censored (limited) observations which extends the Student-t censored regression model. This extension is based on the asymmetric and heavy-tailed power Student-t distribution. The score functions and expected information matrix are given as well as the process for estimating the parameters in the model is discussed by using the likelihood approach. Two simulation studies are conducted to evaluate parameter recovery and properties of the model and finally, two applications to a real data set are reported to demonstrate the usefulness of this new methodology.

Key words
Censored regression model; Fisher information matrix; maximum likelihood estimation; power Student-$t$ distribution

INTRODUCTION

Regression models where the response variable is censored or limited are common in different fields: clinical essays, econometric analysis, social phenomena, engineering studies, among others. In clinical essays for example, in the first phases of development of the new vaccines, the determination of antibody concentration values often are left-censored due to detection limit by lack of sensitivity of the essay when the concentrations are near zero, see Moulton & Halsey 1995MOULTON LH & HALSEY NA. 1995. A Mixture Model With Detection Limits for Regression Analyses of Antibody Response to Vaccine. Biometrics 51: 1570–1578. . In social phenomena, the study on extramarital behavior where the variable of interest is the number of extramarital affairs in the previous year, for example, it can result in a left-censored variable (Fair 1978FAIR RC. 1978.A theory of extramarital affairs. J Polit Econ 86(1): 45–61. ). In econometrics analysis, the ordinary Tobit model (Tobin 1958TOBIN J. 1958.Estimation of relationship for limited dependent variables. Econometrica 26(1): 24–36. ) is commonly used to conduct studies of the labor force participation of married women. In this case, the observed response is the wage rate, which is typically considered as censored below zero, i.e., for working women, positive values for the wage rates are registered, whereas for the non-working women the observed wage rates are zero; see Mroz 1987MROZ TA. 1987. The Sensitivity of an Empirical Model of Married Women’s Hours of Work to Economic and Statistical Assumptions. Econometrica 55(4): 765–799. doi:10.2307/1911029. .

In situations such as previously discussed, where censored regression (CR) models are proposed, it is common to assume a normal distribution for the error term, however, this assumption can not be suitable and it can be unrealistic due to the presence of atypical observations or high (or low) degree of skewness and kurtosis of the response variable, which the normal model is unable to capture, so considerable interest has centered on relaxing the assumption of normality of the errors in CR models. In this context, some authors have proposed a wide range of alternatives to the normal censored regression (NCR) model which is widely known in the literature as the Tobit model. Arellano-Valle et al. 2012ARELLANO-VALLE RB, CASTRO LM, GONZÁLEZ-FARÍAS G & MUNÕZ-GAJARDO KA. 2012. Student-t Censored Regression Model: Properties and Inference. Stat Methods Appt 21(4): 453–473. doi:10.1007/s10260-012-0199-y.
10.1007/s10260-012-0199-y...
for example, extend the classical Tobit model by introducing the Student-t censored regression (TCR) model that can be suitable when the response variable has heavy-tails and the kurtosis is greater than the usual normal model. Another extension of the Tobit model was proposed by Martínez-Flórez et al. 2013MARTÍNEZ-FLÓREZ G, BOLFARINE H & GÓMEZ HW. 2013. The alpha–power tobit model. Commun Stat Theory Methods 42(4): 633–643. by considering that random errors follow a power-normal (PN) distribution (Gupta & Gupta 2008GUPTA RD & GUPTA RC. 2008. Analyzing skewed data by power–normal model. Test 17: 197–210. ). The novelty of this proposal is the incorporation of a shape parameter which gives flexibility to the assumption of the symmetric errors (normality assumption) and it allows to accommodating skewed forms to the left and the right for the error term in CR models. Recently Garay et al. 2017GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9. proposed a family of censored regression models based on the family of symmetric distributions commonly known as the scale mixture of normal (SMN) distributions, which includes the TCR model proposed by Arellano-Valle et al. 2012ARELLANO-VALLE RB, CASTRO LM, GONZÁLEZ-FARÍAS G & MUNÕZ-GAJARDO KA. 2012. Student-t Censored Regression Model: Properties and Inference. Stat Methods Appt 21(4): 453–473. doi:10.1007/s10260-012-0199-y.
10.1007/s10260-012-0199-y...
. This family also includes Pearson type VII (PVII), slash (SL), power exponential (PE), contaminated normal (CN) and normal (N) distributions. In addition to being robust, these models have shown to be useful in detecting atypical observations in CR models.

Although some proposals that take into account the problem of atypical observations in censored regression models, most of them are based on the assumption of symmetry of the error and few studies that capture departure from symmetry in the distribution of errors as in Martínez-Flórez et al. 2013MARTÍNEZ-FLÓREZ G, BOLFARINE H & GÓMEZ HW. 2013. The alpha–power tobit model. Commun Stat Theory Methods 42(4): 633–643. , for example, who support their work in the great virtues of the alpha-power models to fit data where distribution presents high or low asymmetry and/or kurtosis.

Within this class of alpha-power models Zhao & Kim 2016ZHAO J & KIM HM. 2016.Power t distribution. Ommun Stat Appl Methods 23(4): 321–334. proposed an extension of the Student-t model by defining the power-Student-t (PT) distribution as an alternative to the skew-t model by Azzalini & Capitanio 2003AZZALINI A & CAPITANIO A. 2003. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J R Stat Soc Series B Stat Methodol 65(2): 367–389. for fitting skewed and heavy-tailed data. The PT model, which extends the power-normal model by Gupta & Gupta 2008GUPTA RD & GUPTA RC. 2008. Analyzing skewed data by power–normal model. Test 17: 197–210. seems to be useful in situations where the data present higher degree of skewness and kurtosis than PN model in presence of atypical observations.

In this paper, we propose a censored regression model under the assumption that errors follow a PT distribution (hereafter we will call it the PTCR model). The assumption of PT distribution gives flexibility for accommodating skew forms to the left and the right, and kurtosis greater or smaller than the Student t-distribution can be also accommodated, hence, PTCR model extends the TCR model. The process of inference in the model is conducted by using the maximum likelihood (ML) approach and its large sample properties. Application is implemented to real data set where it is demonstrated that the proposed model can be very useful in fitting real data sets.

The rest of this paper is organized as follows: Section “The Power Student-t Distribution“ presents a brief review of the main properties of the PT distribution. In Section “Power-Student-t Model for Censored and Truncated Data“, we introduce the censored and truncated PT models. Section “Censored Power-Student-t Regression Model“ introduces the PTCR model. Here, ML equation and the observed and expected information matrices are given. Section “Simulation Study“ presents the results of a simulation study which reveals the good performance of the estimation approach. The PTCR model is fitted to a data set of housewives wages in Section “Real Data Application“, revealing that the data set in question can be fitted by PTCR as well as by a CR model where the observational errors have a SMN distribution (SMNCR model).

THE POWER STUDENT-T DISTRIBUTION

In this section, we present the PT distribution and review some of its main characteristics and properties. The PT model was introduced by Zhao & Kim 2016ZHAO J & KIM HM. 2016.Power t distribution. Ommun Stat Appl Methods 23(4): 321–334. and it is an alternative to skew-t model for fitting data with high indices of asymmetry and kurtosis in addition to heavy tails.

Definition 1. The random variable X is said to have a PT distribution with parameter α and degree of freedom ν, if X has probability density function (PDF) given by

f P T ( x ; α , ν ) = α f T ( x ; ν ) [ F T ( x ; ν ) ] α 1 , (1)

for x and α>0. Functions fT(;ν) and FT(;ν) are the PDF and cumulative distribution function (CDF) of the standard Student-t distribution.

Random variable having fPT(x;α,ν) distribution is denoted shortly by XPT(α ,ν ). Figure 1 displays some forms of the PDF of the PN distribution for selected values of α. Note from figure that parameter α controls the skewness and kurtosis of the distribution. The CDF of the PT model is given by

F P T ( x ; α , ν ) = [ F T ( x ; ν ) ] α , (2)

for x. Some properties of the PT distribution can be proven as result of Definition 1.

Figure 1
Density function of PT(α ,10) for α=5 (solid line), α=2 (dashed line), α=1 (dotted line), α=0.5 (dotted-dashed line).

Proposition 1. Let XPT(α ,ν ), then

  • (i) if α=1, X follows Student-t distribution and we write XT(ν ),

  • (ii) if ν, X converges to power-normal (PN) model with parameter α. The PDF is given by

f P N ( x ; α ) = α ϕ ( x ) [ Φ ( x ) ] α 1 , x . (3)
  • More details of PN distribution can be found in Gupta & Gupta 2008GUPTA RD & GUPTA RC. 2008. Analyzing skewed data by power–normal model. Test 17: 197–210. and Pewsey et al. 2012PEWSEY A, GÓMEZ HW & BOLFARINE H. 2012. Likelihood–based inference for power distributions. Test 21(4): 775–789. .

  • (iii) if α=1 and ν, X converges to standard normal distribution.

Proof. Proof of (i)-(iii) are directly obtained from definition of PT distribution ◻

Proposition 2. Let XPT(α ,ν ), then for k

E [ X k ] = E [ ( F T 1 ( Y ; ν ) ) k ] , (4)

where Y has a beta distribution, and FT1(;ν) denotes the inverse of the function FT(;ν) .

Proof. We have by definition that

E [ X k ] = x k α f T ( x ; ν ) [ F T ( x ; ν ) ] α 1 d x

thus, letting y=FT(x;ν), then x=FT1(y;ν), it follows that

E [ X k ] = 0 1 ( F T 1 ( y ; ν ) ) k α y α 1 d y

which is the expected value of the function (FT1(Y;ν))k, where Y follows a beta distribution with parameter α and 1. ◻

The expected value, variance, indices of asymmetry and kurtosis of the PT model can be found by using the expressions

  • (i) E[X]=μ1

  • (ii) V[X]=μ2μ12

  • (iii) γ1=μ33μ1μ2+2μ13(μ2μ12)3/2

  • (iv) γ2=μ4(μ2)2

where μr=E[Xr] and μr=E[XE(X)]r. Table I presents the values of the asymmetry coefficient of the PT model for some values of the ν parameter and for values of α in the range of 0.1 to 100000.

Table I
Skewness of the PT(α ,ν ) model. Values of α ranging of 0.1 to 100000.

Definition 2. Let XPT(α ,ν ). The PT density of location and scale is defined as the distribution of Z=ξ+ηX, for ξ and η>0. The corresponding PDF is given by

f P T ( z ; ξ , η , α , ν ) = α f T ( z ξ η ; ν ) [ F T ( z ξ η ; ν ) ] α 1 , (5)

for z. We denote this extension as ZPT(ξ ,η ,α ,ν ), and we have that PT(0,1,α ,ν )PT(α ,ν ).

The kth moment of the random variable Z is given by

E [ Z k ] = j = 0 k ( k j ) ξ j η k j μ k j , (6)

where μj is the jth moment of a random variable XPT(α ,ν ). Zhao & Kim 2016ZHAO J & KIM HM. 2016.Power t distribution. Ommun Stat Appl Methods 23(4): 321–334. derived the information matrix for the location-scale version and showed that it is non-singular when α=1 for small values of the parameter ν (i.e., ν30). When ν tends to +, then PT distribution converges to PN model and here, we recall Pewsey et al. 2012PEWSEY A, GÓMEZ HW & BOLFARINE H. 2012. Likelihood–based inference for power distributions. Test 21(4): 775–789. showed that PN model has non-singular information matrix. This result guarantees that regularity conditions are satisfied for the likelihood approach, hence, with PT model, symmetry can be tested by using ordinary large sample properties of the likelihood ratio statistics.

POWER-STUDENT-T MODEL FOR CENSORED AND TRUNCATED DATA

Based on the goodness of the PT distribution to fit data with high indices of asymmetry and kurtosis, in this section, we introduce the censored PT and the truncated PT models which we will be denoted by CPT and TPT, respectively.

Definition 3 (Censored PT Model). Suppose that random variable Y follows a PT(ξ ,η ,α ,ν ) distribution. Let Y1,,Yn a random sample of size n of Y, where only those values of Yi greater than constant ki are recorded; and for values Yiki only the value ki is registered. The observed values Yio can be written as

Y i o = { k i , if Y i k i , Y i , if Y i > k i ,

for i=1,,n. The resulting sample is said to be a censored power-Student-t (CPT).

From Definition 3 it follows that P(Yio=ki)=P(Yiki)={FT(kiξη;ν)}α, and for the observations Yio=Yi the distribution of Yio is the same of Yi, i.e., YioPT(ξ ,η ,α ,ν ). For convenience, we choose to work with the case of left-censored data, however, the followings results can be extended to other types of censorship.

Maximum Likelihood Estimation for CPT Model

Let Y1o,,Yno be a random sample of the censored PT(ξ ,η ,α ,ν ) distribution (censored in ki). To perform statistical inference for parameter vector 𝛉=(ξ,η,α,ν) by using the ML method, we use the reparameterization in Olsen 1978OLSEN RJ. 1978. Note on the Uniqueness of the Maximum Likelihood Estimator for the Tobit Model. Econometrica 46(5): 1211–1215. . Thus, let γ=σξ and σ=1/η, the log-likelihood function for the new vector 𝛗=(γ,σ,α,ν), given the observed sample Yo can be written as

( φ ; Y o ) a m p ; = α i = 1 n ( 1 d i ) log { F T ( c i ; ν ) } a m p ; + i = 1 n d i { log α + log σ + log f T ( z i ; ν ) + ( α 1 ) log F T ( z i ; ν ) } (7)

where ci=σkiγ, zi=σyiγ, and di is an indicator variable defined as di=1 if Yio=Yi; and di=0 if Yio=ki. The components of the score function 𝐔(𝛗) are obtained by deriving partially the log-likelihood function given in (7) with regard to components γ, σ, α and ν, we obtain the following equations

U ( γ ) = α i = 1 n ( 1 d i ) r ( c i ; ν ) + i = 1 n d i { ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i ( α 1 ) r ( z i ; ν ) } U ( σ ) = α i = 1 n ( 1 d i ) r ( c i ; ν ) k i + i = 1 n d i { 1 σ ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i y i + ( α 1 ) r ( z i ; ν ) y i } U ( α ) = i = 1 n ( 1 d i ) log { F T ( c i ; ν ) } + i = 1 n d i { 1 α + log F T ( z i ; ν ) } U ( ν ) = α 2 i = 1 n ( 1 d i ) { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( c i ; ν ) c i r ( c i ; ν ) ν } + 1 2 i = 1 n d i { ψ ( ν + 1 2 ) ψ ( ν 2 ) 1 ν log ( 1 + z i 2 ν ) + ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i 2 ν } + α 1 2 i = 1 n d i { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( z i ; ν ) z i r ( z i ; ν ) ν }

where r(x;ν)=fT(x;ν)/FT(x;ν), ψ(x)=ddxlogΓ(x) is the digamma function and bmn(cm;ν+m) is the truncated moment defined as

b m n ( c m ; ν + m ) = c m s m { log ( 1 + s 2 ν + m ) } n f T ( s ; ν + m ) F T ( c m ; ν + m ) d s , (8)

with cm=ν+mνc and c0=c. The moments bmn in (8) are obtained by numerical integration, for example, by using integrate function of R Development Core Team 2018R DEVELOPMENT CORE TEAM. 2018. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. URL http://www.R-project.org. ISBN 3-900051-07-0.
http://www.R-project.org...
. The ML estimates of the parameters ξ, η, α and ν in censored PT model are obtained using iterative algorithms based on the score functions and by applying the inverse transformation ξ=𝛄/σ and η=1/σ. For obtaining the standard errors of the ML estimates one should compute the information matrix Iφ . It is well known that the elements of Iφ are given by

I φ ( i , k ) = E [ 2 ( φ ; Y o ) φ i φ k ] , i , k = 1 , , 4

where 𝛗=(γ,σ,α,ν). Since expectation over PT distribution and second-order derivatives are not straightforward, numerical methods should be performed to obtain the explicit form of the information matrix Iφ . Thus, we use the observed information matrix for calculating the standard errors in the rest of the paper. To recover the information matrix Iθ of the original parameterization 𝛉=(ξ,η,α,ν), we use

I θ = ( φ / θ ) I φ ( φ / θ ) ,

where the Jacobian matrix is

𝛗 𝛉 = ( 1 η 1 η 2 0 0 0 1 η 2 0 0 0 0 1 0 0 0 0 1 ) (9)

Definition 4 (Truncated PT Model). Let X be a random variable with distribution XPT(ξ ,η ,α ,ν ). Let a,b with a <b, such that P(a <X<b)>0. It is said that random variable Y has a truncated power-Student-t (TPT) distribution in the interval (a,b), if Y has the same distribution as XX(a,b). In this case, we write YTPT(a,b)(ξ ,η ,α ,ν ).

As a consequence of the Definition 4, the PDF of TPT distribution can be obtained as

f P T ( y ( a , b ) ) = α η f T ( y ξ η ; ν ) [ F T ( y ξ η ; ν ) ] α 1 × { [ F T ( b ξ η ) ] α [ F T ( a ξ η ) ] α } 1

if a <y<b, and fPT(y(a,b))=0 in otherwise. Now, we consider that before the sample to be selected, the distribution of Y is truncated at the value k, so that we can only choose observations such that Yk. Then, random variable Y has PDF given by

f P T ( y y k ) = α η f T ( y ξ η ; ν ) [ F T ( y ξ η ; ν ) ] α 1 [ F T ( k ξ η ; ν ) ] α (10)

for <yk.

Maximum Likelihood Estimation for TPT Model

Given a sample Y=(Y1,,Yn) of the TPT distribution in the value k, the log-likelihood function for vector 𝛗=(γ,σ,α,ν), where γ=σξ and σ=1/η, is given by

( φ ; Y ) = n 1 α log { F T ( c ; ν ) } + y i k n { log α + log σ + log f T ( z i ; ν ) + ( α 1 ) log F T ( z i ; ν ) } (11)

where c=σkγ, zi=σyiγ, and n1 is the number of observations in the sample such that <yk. Deriving partially the log-likelihood function (11) with respect to the components of the vector 𝛗 the following elements of the score function are obtained

U ( γ ) = n 1 α r ( c ; ν ) + y i k { ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i ( α 1 ) r ( z i ; ν ) } U ( σ ) = n 1 α r ( c ; ν ) k + y i k { 1 σ ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i + ( α 1 ) r ( z i ; ν ) } U ( α ) = n 1 log { F T ( c ; ν ) } + y i k { 1 α + log F T ( z i ; ν ) } U ( ν ) = n 1 α 2 { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( c ; ν ) c r ( c ; ν ) ν } + 1 2 y i k { ψ ( ν + 1 2 ) ψ ( ν 2 ) 1 ν log ( 1 + z i 2 ν ) + ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i 2 ν } + α 1 2 y i k { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( z i ; ν ) z i r ( z i ; ν ) ν }

where r(x;ν)=fT(x;ν)/FT(x;ν), ψ(x)=ddxlogΓ(x) is the digamma function and bmn(cm;ν+m) is given by (8). To obtain the ML estimates of the parameters ξ, η, α and ν in the TPT model, we proceed in a similar way to the CPT model, and iterative methods based on the Newton-Rapshon algorithm are used with the score functions. We use the observed information matrix for calculating the standard errors and to recover the information matrix Iθ of the original parameterization 𝛉=(ξ,η,α,ν), we use Iθ =(φ /θ )Iφ (φ /θ ), where 𝛗/𝛉 is given in (9).

CENSORED POWER-STUDENT-T REGRESSION MODEL

In this section, we introduce the censored power-Student-t regression model, which is denoted by PTCR. This model results from the consideration of the observed random variable Yio=DiYi, where Di=I(0,+)(Yi) and Yi=xiβ +ε i, with ε iiidPT(0,η ,α ,ν ), for i=1,,n; i.e.,

Y i o = { x i β + ε i , if  Y i > 0 0 , if  Y i 0 , (12)

where 𝛃 is a vector of dimension p of unknown parameters, xi=(xi1,,xip), for i=1,,n, are vectors of known covariates, and εi, for i=1,,n, are independent random variables with PT distribution with location parameter 0, scale parameter η, shape parameter α, and degrees of freedom ν. This assumption is equivalent to considering that unobserved random variables Y1,,Yn are independent with YiPT(xiβ ,η ,α ,ν ), that is, with PDF given by

g ( y i ; θ ) = α η f T ( y i x i β η ; ν ) [ F T ( y i x i β η ; ν ) ] α 1

for i=1,,n, where 𝛉=(𝛃,η,α,ν). The contribution to likelihood function for observations Yio=0 is given by

P ( Y i o = 0 ) = { F T ( x i β η ; ν ) } α ,

and for observations Yio0, we have that ioPT(xiβ ,η ,α ,ν ). Therefore, the likelihood function of the PTCR model based on the observed sample Yo=(Y1o,,Yno) is given by

L ( θ ; Y o ) = i = 1 n { F T ( x i β η ; ν ) } α ( 1 d i ) { α η f T ( y i x i β η ; ν ) × [ F T ( y i x i β η ; ν ) ] α 1 } d i (13)

where

d i = { 1 , if Y i o > 0 , 0 , if Y i o = 0 ,

Model (13) can be extended to the situation where the value of the censorship associated with the observation i is replaced by the value ki (a known value), i.e.,

Y i o = { x i β + ε i , if  Y i > k i k i , if  Y i k i , (14)

for i=1,,n. Note that, by making Yio*=Yioki, xi=(xi,ki) and 𝛃*=(𝛃,1), we have the previous model in (12), hence, the results of the inference based on the ML method can be used to fit the more general model in (14).

Proposition 3. Consider the model (12) with assumption ε iiidPT(0,η ,α ,ν ), for i=1,,n, then

  1. (i) if α=1, model (12) is reduced to t-Student censored regression (TCR) model

  2. (ii) if ν, model (12) converges to power-normal censored regression (PNCR) model see Martínez-Flórez et al. 2013MARTÍNEZ-FLÓREZ G, BOLFARINE H & GÓMEZ HW. 2013. The alpha–power tobit model. Commun Stat Theory Methods 42(4): 633–643.

  3. (iii) if α=1 and ν, model (12) converges to usual normal censored regression (NCR) model, that is, Tobit model.

Proof. Proof of (i)-(iii) are directly obtained from definition of PTCR model. ◻

Moments

Proposition 4. The mean and variance for the ith observed response in PTCR model are given by

E [ Y i o ] = η ( c i + m 1 ( c i ; ν ) ) ( 1 [ F T ( c i ; ν ) ] α ) (15)

and

V a r [ Y i o ] = η 2 ( c i 2 + 2 c i m 1 ( c i ; ν ) ) ( [ F T ( c i ; ν ) ] α ) ( 1 [ F T ( c i ; ν ) ] α ) + η 2 ( m 2 ( c i ; ν ) [ m 1 ( c i ; ν ) ] 2 ( 1 [ F T ( c i ; ν ) ] α ) ) ( 1 [ F T ( c i ; ν ) ] α ) (16)

respectively, where ci=μi/η with μ i=xiβ and

m r ( c ; ν ) = 1 1 { F T ( c ; ν ) } α F T ( c ; ν ) 1 [ F T 1 ( u ; ν ) ] r α u α 1 d u , r = 1 , 2 .

Note that mr(c;ν) is the moment E[(FT1(U;ν))r] of a random variable U with distribution Beta(α ,1), truncated in the interval (FT(c;ν),1).

Proof. The mean and variance for the ith observed response in PTCR model can be obtained by noting that Yio=DiYi, where Di=I(0,+)(Yi) and Yi=μi+ηZi, with μ i=xiβ and ZiiidPT(α ,ν ), i=1,,n. We have

E [ Y i o ] = E [ Y i Y i > 0 ] P ( Y i > 0 ) = E [ μ i + η Z i μ i + η Z i > 0 ] P ( μ i + η Z i > 0 ) = ( μ i + η E [ Z i Z i > μ i η ] ) P ( Z i > μ i η ) = η ( μ i η + E [ Z i Z i + μ i η > 0 ] ) ( 1 P ( Z i μ i η ) ) = η ( c i + E [ Z i Z i + c i > 0 ] ) ( 1 [ F T ( c i ; ν ) ] α )

By Lemma 1 in Appendix A, we have

E [ Y i o ] = η ( c i + m 1 ( c i ; ν ) ) ( 1 [ F T ( c i ; ν ) ] α )

To obtain the variance of Yio, note that

E [ ( Y i o ) 2 ] = E [ Y i 2 Y i > 0 ] P ( Y i > 0 ) = E [ ( μ i + η Z i ) 2 μ i + η Z i > 0 ] P ( μ i + η Z i > 0 ) = ( μ i 2 + 2 η μ i E [ Z i Z i > μ i η ] + η 2 E [ Z i 2 Z i > μ i η ] ) P ( Z i > μ i η ) = η 2 ( μ i 2 η 2 + 2 μ i η E [ Z i Z i > μ i η ] + E [ Z i 2 Z i > μ i η ] ) ( 1 P ( Z i μ i η ) ) = η 2 ( c i 2 + 2 c i E [ Z i Z i + c i > 0 ] + E [ Z i 2 Z i + c i > 0 ] ) ( 1 [ F T ( c i ; ν ) ] α )

Using Lemma 1 in Appendix, it follows that

E [ ( Y i o ) 2 ] = η 2 ( c i 2 + 2 c i m 1 ( c i ; ν ) + m 2 ( c i ; ν ) ) ( 1 [ F T ( c i ; ν ) ] α )

thus, by calculating Var[Yio]=E[(Yio)2](E[Yio])2 and after some algebraic manipulations, we obtain

V a r [ Y i o ] = η 2 ( c i 2 + 2 c i m 1 ( c i ; ν ) ) ( [ F T ( c i ; ν ) ] α ) ( 1 [ F T ( c i ; ν ) ] α ) + η 2 ( m 2 ( c i ; ν ) [ m 1 ( c i ; ν ) ] 2 ( 1 [ F T ( c i ; ν ) ] α ) ) ( 1 [ F T ( c i ; ν ) ] α )

It is important to note that, if ν tends to infinity, then (15) and (16) converge to the mean and variance of the PNCR model (Martínez-Flórez et al. 2013MARTÍNEZ-FLÓREZ G, BOLFARINE H & GÓMEZ HW. 2013. The alpha–power tobit model. Commun Stat Theory Methods 42(4): 633–643. ), i.e., when ν

E [ Y i o ] = η ( c i + m 1 ( c i ) ) ( 1 [ Φ ( c i ) ] α ) V a r [ Y i o ] = η 2 ( c i 2 + 2 c i m 1 ( c i ) ) ( [ Φ ( c i ) ] α ) ( 1 [ Φ ( c i ) ] α ) + η 2 ( m 2 ( c i ) [ m 1 ( c i ) ] 2 ( 1 { Φ ( c i ) } α ) ) ( 1 [ Φ ( c i ) ] α )

where

m r ( c ) = 1 1 { Φ ( c ) } α Φ ( c ) 1 [ Φ 1 ( u ) ] r α u α 1 d u .

with Φ() the CDF of the standard normal distribution, and Φ1() the inverse function of Φ(). Also, worth noting that, if α=1 and ν, we have m1(ci;ν)ϕ(ci)/Φ(ci) and m2(ci;ν)(Φ(ci)ciϕ(ci))/Φ(ci), thus

E [ Y i o ] = η ( c i Φ ( c i ) + ϕ ( c i ) ) V a r [ Y i o ] = η 2 ( c i 2 Φ ( c i ) + 2 c i ϕ ( c i ) ) ( 1 Φ ( c i ) ) + η 2 ( Φ ( c i ) ϕ ( c i ) [ c i + ϕ ( c i ) ] )

which are the mean and variance, respectively, of the Tobit model (Tobin 1958TOBIN J. 1958.Estimation of relationship for limited dependent variables. Econometrica 26(1): 24–36. ).

Maximum Likelihood Estimation

The ML method is considered by using the reparameterization of Olsen 1978OLSEN RJ. 1978. Note on the Uniqueness of the Maximum Likelihood Estimator for the Tobit Model. Econometrica 46(5): 1211–1215. . Let 𝛄=σ𝛃, σ=1/η and

d i = { 1 , if  Y i o > k i , 0 , if  Y i o = k i ,

the log-likelihood function for 𝛗=(𝛄,σ,α,ν) obtained from (13) under the new parameterization is given by

( φ ; Y o ) = α i = 1 n ( 1 d i ) log { 1 F T ( c i ; ν ) } + i = 1 n d i { log α + log σ + log f T ( z i ; ν ) + ( α 1 ) log F T ( z i ; ν ) } (17)

where zi=σyici, with ci=xiγ . The components of the score function 𝐔(𝛗) are obtained by deriving (φ ;Yo) partially in relation to the components 𝛄, η, α and ν. After some algebraic manipulations the following components of the score function are obtained

U ( γ ) = α i = 1 n ( 1 d i ) r ( c i ; ν ) x i + i = 1 n d i { ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i ( α 1 ) r ( z i ; ν ) } x i (18)
U ( σ ) = i = 1 n d i { 1 σ ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i y i + ( α 1 ) r ( z i ; ν ) y i } (19)
U ( α ) = i = 1 n ( 1 d i ) log { 1 F T ( c i ; ν ) } + i = 1 n d i { 1 α + log F T ( z i ; ν ) } (20)
U ( ν ) = α 2 i = 1 n ( 1 d i ) { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( c i ; ν ) c i r ( c i ; ν ) ν } R ( c i ; ν ) + 1 2 i = 1 n d i { ψ ( ν + 1 2 ) ψ ( ν 2 ) 1 ν log ( 1 + z i 2 ν ) } + 1 2 i = 1 n d i { ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i 2 ν } + α 1 2 i = 1 n d i { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( z i ; ν ) z i r ( z i ; ν ) ν } (21)

where r(x;ν)=fT(x;ν)/FT(x;ν), R(x;ν)=FT(x;ν)/(1FT(x;ν)), ψ(x)=ddxlogΓ(x) is the digamma function and bmn(cm;ν+m) is the truncated moment defined by (8). Note that, if α=1 the equations (18)-(21) are reduced to the functions of the TCR model (Arellano-Valle et al. 2012ARELLANO-VALLE RB, CASTRO LM, GONZÁLEZ-FARÍAS G & MUNÕZ-GAJARDO KA. 2012. Student-t Censored Regression Model: Properties and Inference. Stat Methods Appt 21(4): 453–473. doi:10.1007/s10260-012-0199-y.
10.1007/s10260-012-0199-y...
), while, if α=1 and ν, then r(c;ν)r(c)=ϕ(c)/Φ(c), R(c;ν)R(c)=Φ(c)/(1Φ(c)), and U(ν)0, therefore, the equations (18) and (19) are reduced to score functions of the Tobit model.

The elements of the observed information matrix Jφ for PTCR model, which are denoted by j𝛗i𝛗j, can be obtained by calculating the second partial derivative of the log-likelihood function (17), i.e., jφ iφ j=2(φ ;Yo)/φ iφ j, while the expected information matrix is obtained as Iφ =E[Jφ ], which involves the calculation of truncated expected values that have no closed form and must be obtained numerically. The Appendix B presents the expressions for the elements of the matrices Iφ and Jφ . The expected information matrix Iθ of the original parameterization 𝛉=(𝛃,η,α,ν) can be recovered by using Iθ =(φ /θ )Iφ (φ /θ ), where

φ θ = ( 1 η I p 1 η 2 β 0 0 0 1 η 2 0 0 0 0 1 0 0 0 0 1 )

Finally, the ML estimates for 𝛉=(𝛃,η,α,ν) can be obtained using iterative methods based on the Newton-Rapshon algorithm from the score function (18) - (21) and applying the inverse transformation 𝛃=𝛄/σ and η=1/σ. Estimates of the variances of the estimator can be obtained by evaluating the inverse of the observed information matrix Jφ 1 at the ML estimators 𝛗̂=(𝛄̂,σ̂,α̂,ν̂) and by using the previous result.

Model Selection and Residual Analysis

In this section some criteria for the selection of the best-fitted model and a methodology for residual analysis are proposed.

Model Selection

Many model selection tools are generally used, such as the Akaike information criteria (AIC), (Akaike 1974AKAIKE H. 1974.A new look at statistical model identification. IEEE Trans Automat Contr AU-19(4): 716–722. ), Bayesian information criterion (BIC) (Schwarz 1978SCHWARZ G. 1978. Estimating the dimension of a model. Ann Stat 6(2): 461–464. ), and the AIC corrected (AICc) (Sugiura 1978SUGIURA N. 1978. Further Analysis of the Data by Akaike’s Information Criterion and the Finite Corrections. Commun Stat Theory Methods 7(1): 13–26. doi:10.1080/03610927808827599. ), which are defined by

2 ( θ y ) + c ( k , n )

where the term c(k,n) is a quantity that depends on the number of free parameters that are estimated in the model k, and the number of observations in the sample n. For the AIC one has c(k,n)=2k, for BIC c(k,n)=klog(n) and for AICc, c(k,n)=2k(k+1)/(n(k+1)). To choose the best-fitted model, the criteria AIC, BIC and AICc are used.

Residual Analysis

The residual analysis has the purpose of detecting the presence of atypical observations and to evaluate the assumptions of the model, being able to include formal tests to detect departures from the assumptions of the considered model, as well as informal graphs to present general characteristics of the residuals.

Following Garay et al. 2017GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9. and Arellano-Valle et al. 2012ARELLANO-VALLE RB, CASTRO LM, GONZÁLEZ-FARÍAS G & MUNÕZ-GAJARDO KA. 2012. Student-t Censored Regression Model: Properties and Inference. Stat Methods Appt 21(4): 453–473. doi:10.1007/s10260-012-0199-y.
10.1007/s10260-012-0199-y...
, in this work we considered the transformed martingal residuals rMTi proposed by Barros et al. 2010BARROS M, GALEA M, GONZÁLEZ M & LEIVA V. 2010. Influence Diagnostics in the Tobit Censored Response Model. Stat Methods Appt 19(3): 379–397. doi:10.1007/s10260-010-0135-y. as diagnostic tool to evaluate deviations from the postulated model for the response variable, as well as to detect the presence of atypical observations. The residuals are defined as

r M T i = s i g n ( r M i ) 2 [ r M i + δ i log ( δ i r M i ) ] , i = 1 , , n

where rMi=δi+logS(yi;𝛉̂) is the martingal residual proposed by Ortega et al. 2003ORTEGA EM, BOLFARINE H & PAULA GA. 2003. Influence Diagnostics in Generalized Log-Gamma Regression Models. Comput Stat Data Anal 42: 165–186. , where δi=0,1 indicates whether the ith observation is censored or not, respectively, sign(rMi) denotes the sign of rMi and S(yi;𝛉̂)=P𝛉̂(Yi>yi) represents the survival function evaluated at yi, where 𝛉̂ are the MLE for 𝛉.

As suggested by Garay et al. 2017GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9. , this type of standardized residuals is used due to the fact that they are symmetrically distributed around zero, which facilitates the construction of the simulated envelopes with little computational effort and will be useful to detect an incorrect specification of the model, as well as the presence of observations atypical.

SIMULATIONS STUDIES

Simulation Study 1: Robustness of the Maximum Likelihood Estimates

In this section, we compare the performance of the estimates for PTCR model in the presence of outliers on the response variable. Following Garay et al. 2017GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9. and Mattos et al. 2018MATTOS T, GARAY AM & LACHOS VH. 2018. Likelihood-based inference for censored linear regression models with scale mixtures of skew-normal distributions. J Appl Stat 45(11): 2039–2066. we performed a simulation study based on the NCR model. Specifically, we considered (12) with xi=(1,xi) and ε iN(0,η 2) for i=1,,n. As in Garay et al. 2017GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9. we generated 1000 artificial samples of size n=300, considering 𝛃=(β1,β2)=(2,3.5), η=2 and fixing the left censoring level at p=10,20 and 30% (that is, 10, 20 and 30% of the observations in each data set were left censored, respectively). We generated independently the values xi, for i=1,,n, from a uniform distribution on the interval (2, 20). These values were fixed throughout the simulations.

To assess how much the ML estimates are influenced by the presence of outliers, we replaced the observation y150 by y150(δ)=y150+δ, with δ=1,2,,10. Let β̂i(δ) and β̂i be the ML estimates of βi with and without contamination, respectively, i=1,2. We are particularly interested in the relative changes

R C ( β ̂ i ( δ ) ) = | ( β ̂ i ( δ ) β ̂ i ) / β ̂ i |

We define the relative changes for η analogously. For each replication we obtained the parameter estimates with and without outliers, under the PTCR model. Table II and Figure 2 depict the average values of the relative changes across all samples and different censoring levels.

Table II
Average relative changes on estimates for different contaminations d and censoring level p.
Figure 2
Simulation study 1. Average relative changes on estimates for different contaminations d and censoring level.

We observe that influence increases dramatically when δ increases for p=10%, specially for the parameter. However, for the p=20 and 30%, these measures vary little, which indicates that PTCR model is more robust in these cases in the presence of discrepant observations.

Simulation Study 2: Asymptotic properties

To study the performance of the ML estimator 𝛉̂=(𝛃̂,η̂,α̂,ν̂), a Monte Carlo simulation study with sample sizes n=150, 300, 750, and 1000 is presented. We considered the PTCR model defined in Section “Censored Power-Student-t Regression Model” with xi=(1,xi) for i=1,,n. The true values of the parameters were taken as 𝛃=(2,1.5), η=1.5, α= 0.4, 2.5 and ν= 3.0. We also consider levels of censorship equal to p= 0, 10, 25 and 45%. The covariate xi was generated from a uniform distribution U(0.1,20) as considered in Garay et al. 2017GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9. . For each combination of parameters, sample sizes and censorship levels, 2000 samples of the PTCR model were generated with errors ε iPT(0,η ,α ,ν ). To evaluate the performance of the estimators, the absolute value of the relative bias (RB) and the mean square error (MSE) were considered, they are given by

RB ( θ ̂ i ) = 1 2000 j = 1 2000 ( θ ̂ i ( j ) θ i 1 ) , MSE ( θ ̂ i ) = 1 2000 j = 1 2000 ( θ ̂ i ( j ) θ i ) 2 ,

respectively, where θ̂i(j) is the estimator of θi for the jth sample, for θi𝛉=(𝛃,η,α,ν). The ML estimates of the parameters were calculated by using the optim function of R Development Core Team 2018R DEVELOPMENT CORE TEAM. 2018. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. URL http://www.R-project.org. ISBN 3-900051-07-0.
http://www.R-project.org...
. The optimization of the likelihood function was done using iterative methods based on the Newton-Rapshon algorithm by using the score functions.

It can be seen from the Table III that RB and MSE tend to decrease when the value of n increases, indicating that estimates based on the ML method have good asymptotic properties. That pattern is the same for the different levels of censorship of p under consideration. Note that, when the sample sizes is n=150, the estimates for the β0 parameter are unstable (in terms of MSE) because it is affected by the bias of the asymmetry parameter α, however, when the sample size increases, the estimates become more stable. In general, this problem is very common in these types of models, see for example, Martínez-Flórez et al. 2013MARTÍNEZ-FLÓREZ G, BOLFARINE H & GÓMEZ HW. 2013. The alpha–power tobit model. Commun Stat Theory Methods 42(4): 633–643. , so we recommend moderate and large sample sizes in these types of models

Table III
Performance of the ML estimators of β0, β1, η, α and ν for the PTCR model.

REAL DATA APPLICATIONS

Application 1: Wage rate

To illustrate the proposed model, we consider a data set described by Mroz 1987MROZ TA. 1987. The Sensitivity of an Empirical Model of Married Women’s Hours of Work to Economic and Statistical Assumptions. Econometrica 55(4): 765–799. doi:10.2307/1911029. . The data set consists of 753 married white women with ages between 30 and 60 years old in 1975, with 428 women that worked at some point during that year. The response variable used in this application is the wage rate, which represents a measure of the wage of the housewife known as the average hourly earnings. In data set, we have that if the wage rates are set equal to zero, these wives did not work in 1975. Therefore, these observations are considered left-censored at zero.

The considerated covariates were the wife’s age (X1), years of schooling (X2), the number of children younger than six years old in the household (X3), and the number of children between six and nineteen years old (X4). These data were analyzed previously by Arellano-Valle et al. 2012ARELLANO-VALLE RB, CASTRO LM, GONZÁLEZ-FARÍAS G & MUNÕZ-GAJARDO KA. 2012. Student-t Censored Regression Model: Properties and Inference. Stat Methods Appt 21(4): 453–473. doi:10.1007/s10260-012-0199-y.
10.1007/s10260-012-0199-y...
using a TCR model and later by Garay et al. 2017GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9. using the Scale Mixture of Normal Censored Regression (SMNCR) models. We analyzed the data set by fitting a PTCR model and we compare our proposal with SMNCR models by Garay et al. 2017GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9. : Student-t censored regression model (TCR) (Arellano-Valle et al. 2012ARELLANO-VALLE RB, CASTRO LM, GONZÁLEZ-FARÍAS G & MUNÕZ-GAJARDO KA. 2012. Student-t Censored Regression Model: Properties and Inference. Stat Methods Appt 21(4): 453–473. doi:10.1007/s10260-012-0199-y.
10.1007/s10260-012-0199-y...
), Slash censored regression model (SLCR), and normal censored regression model (NCR), that is, the usual tobit model. Table IV shows skewness and kurtosis index for complete data and also for uncensored observations. Notice that values for the skewness and kurtosis indexes justify using the PTCR.

Table IV
Statistical summary for wage rate data.

Table V presents parameter estimates together with their corresponding standard errors (SE) for the PTCR, TCR, SLCR and NCR models. To fit model of the PTCM, we use the R code in the Appendix C. Table VI presents some model selection criteria, together with the values of the log-likelihood. According to the AIC, BIC and AICc criteria, the PTCR model seems to yield a better fit to the Mroz’s data than the SMNCR models (TCR and SLCR models) and the usual Tobit model (NCR model), supporting the contention of a departure from symmetry of the errors. Also, the SE of the PTCR model are smaller than SE of the SMNCR and NCR models.

Table V
Parameters and standard errors (SE) of the PTCR, TCR, SLCR and NCR models fitted to Wage rate data.
Table VI
Wage rate data. Model selection criteria.

A more emphatic indication that an asymmetric model should be considered comes from testing the hypothesis a TCR model against an asymmetric (PTCR model), that is,

H 0 : α = 1 versus H 1 : α 1 ,

by using the likelihood ratio (LR) statistics, 2log(Λ)=2(TCR(𝛉̂)PTCR(𝛉̂)), which for the data set under study, leads to 2log(Λ)=52.41, which is greater than the critical 5% value with one degree of freedom which is given by χ1,95%2=3.84. This is an indication that the PTCR model fits Mroz’s data better than the ordinary TCR model.

Finally, in order to verify if there is any incorrect specification in the assumptions of the fitted model, the simulated envelope graphs for the transformed martingal residuals are shown in Figure 3. This figure indicates that the PTCR model is, apparently, more suitable for the adjustment of this data than the SMNCR models. It can also be observed that the SMNCR models with heavy tails fit the data better than the NCR model, since there are few observations that are outside the envelopes.

Figure 3
Wage rate data. Envelopes of transformed martingale residuals for PTCR, TCR, SLCR and NCR models.

Application 2: Stellar Abundances Data

The second censored dataset is described in Santos et al. 2002SANTOS N, LÓPEZ RG, ISRAELIAN G, MAYOR M, REBOLO R, GARCÍA-GIL A, TAORO MP DE & RANDICH S. 2002. Beryllium Abundances in Stars Hosting Giant Planets. Astron Astrophys 386: 1028–1038. and are available in the R package astrodatR (Feigelson 2014FEIGELSON ED. 2014. astrodatR: Astronomical Data. Available at. URL https://cran.r-project.org/web/packages/astrodatR/. R package v. 0.1.
https://cran.r-project.org/web/packages/...
) under the name Stellar abundances. These data were analyzed Mattos et al. 2018MATTOS T, GARAY AM & LACHOS VH. 2018. Likelihood-based inference for censored linear regression models with scale mixtures of skew-normal distributions. J Appl Stat 45(11): 2039–2066. by using the Scale Mixture of Skew Normal Censored Regression (SMSNCR) models. We analyzed the data set by fitting a PTCR model and again, we compare our proposal with SMNCR models by Garay et al. 2017GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9. .

The dataset consists of measurements for 68 solar-type stars and for our analysis we followed Mattos et al. 2018MATTOS T, GARAY AM & LACHOS VH. 2018. Likelihood-based inference for censored linear regression models with scale mixtures of skew-normal distributions. J Appl Stat 45(11): 2039–2066. and consider:

  • log N(Be) as the response variable, which represents the log of the abundance of beryllium scaled to Sun’s abundance (i.e. the Sun has logN(Be)=0.0)

  • Teff/1000 as the explanatory variable, which represents the effective stellar surface temperature (in kelvin).

In astronomical research, a previously identified sample of objects (stars, galaxies, quasars, X-ray sources, etc.) is observed at some new wavebands. According to Feigelson 2014FEIGELSON ED. 2014. astrodatR: Astronomical Data. Available at. URL https://cran.r-project.org/web/packages/astrodatR/. R package v. 0.1.
https://cran.r-project.org/web/packages/...
, due to limited sensitivities, some objects may be undetected, leading to upper limits in their derived luminosities. For this dataset we have 12 left-censored data points, i.e. 12 undetected beryllium measurement, that represents 19.35% of observations.

Table VII presents the ML estimates for the parameters of the four models, i.e. PTCR, TCR, SLCR and NCR models, together with their corresponding standard errors. Table VIII compares the fit of the four models using the model selection criteria (AIC, AICc and BIC). Note that again the PTCR model with heavy tails have better fit than the TCR, SLCR and NCR models. The QQ-plots and envelopes for the martingale residuals are shown in Figure 4. This figure clearly indicates that the PTCR, TCR and SLCR models are more suitable for modeling the current data than the NCR model, since there are not observations falling outside the envelope.

Table VII
Parameters and standard errors (SE) of the PTCR, TCR, SLCR and NCR models fitted to stellar abundances data.
Table VIII
Stellar abundances data. Model selection criteria.
Figure 4
Stellar abundances data. Envelopes of transformed martingale residuals for PTCR, TCR, SLCR and NCR models.

CONCLUSIONS

In this paper, an asymmetric alternative for the Student-t censored regression model by Arellano-Valle et al. 2012ARELLANO-VALLE RB, CASTRO LM, GONZÁLEZ-FARÍAS G & MUNÕZ-GAJARDO KA. 2012. Student-t Censored Regression Model: Properties and Inference. Stat Methods Appt 21(4): 453–473. doi:10.1007/s10260-012-0199-y.
10.1007/s10260-012-0199-y...
and SMNCR by Garay et al. 2017GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9. has been developed. It is based on the new family of asymmetric and heavy-tailed power-t distribution (Zhao & Kim 2016ZHAO J & KIM HM. 2016.Power t distribution. Ommun Stat Appl Methods 23(4): 321–334. ). Moreover, it follows that the ordinary tobit model and the Student-t censored regression models are special cases. The observed and expected information matrix is analytically obtained, allowing for the direct implementation of the inference on this type of models. The problem of estimating the parameters in the model is dealt by using the maximum likelihood approach which is also used for developing large sample properties for the estimators. The likelihood ratio statistics can be used for testing the PTCR null hypothesis since the TCR model is special case of the model entertained. Applications to Wage rate data and Stellar Abundances Data indicate that the PTCR model can be a useful alternative to the TCR and SMNCR models.

The proposed PT distribution can be considered in the statistical models based on the scale mixtures of normal family to improve the fit the models such as Maleki & Nematollahi 2017MALEKI M & NEMATOLLAHI AR. 2017. Autoregressive Models with Mixture of Scale Mixtures of Gaussian innovations. Iran J Sci Technol Trans A Sci 41(4): 1099–1107. . Also, the methodology of constructing the asymmetric distribution on the symmetric version of the Skew-Reflected-Gompertz distribution which recently introduced by Hosseinzadeh et al. 2019HOSSEINZADEH A, MALEKI M, KHODADADI Z & CONTRERAS-REYES JE. 2019. The Skew-Reflected-Gompertz distribution for analyzing the symmetric and asymmetric data. J Comput Appl Math 349: 132–141. , can be considered as a future work for researchers.

ACKNOWLEDGMENTS

  • AKAIKE H. 1974.A new look at statistical model identification. IEEE Trans Automat Contr AU-19(4): 716–722.
  • ARELLANO-VALLE RB, CASTRO LM, GONZÁLEZ-FARÍAS G & MUNÕZ-GAJARDO KA. 2012. Student-t Censored Regression Model: Properties and Inference. Stat Methods Appt 21(4): 453–473. doi:10.1007/s10260-012-0199-y
  • AZZALINI A & CAPITANIO A. 2003. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J R Stat Soc Series B Stat Methodol 65(2): 367–389.
  • BARROS M, GALEA M, GONZÁLEZ M & LEIVA V. 2010. Influence Diagnostics in the Tobit Censored Response Model. Stat Methods Appt 19(3): 379–397. doi:10.1007/s10260-010-0135-y.
  • FAIR RC. 1978.A theory of extramarital affairs. J Polit Econ 86(1): 45–61.
  • FEIGELSON ED. 2014. astrodatR: Astronomical Data. Available at. URL https://cran.r-project.org/web/packages/astrodatR/ R package v. 0.1.
    » https://cran.r-project.org/web/packages/astrodatR/
  • GARAY AM, LACHOS VH, BOLFARINE H & CABRAL CRB. 2017. Linear Censored Regression Models with Scale Mixtures of Normal Distributions. Stat Pap 58(1): 247–278. doi:10.1007/s00362-015-0696-9.
  • GUPTA RD & GUPTA RC. 2008. Analyzing skewed data by power–normal model. Test 17: 197–210.
  • HOSSEINZADEH A, MALEKI M, KHODADADI Z & CONTRERAS-REYES JE. 2019. The Skew-Reflected-Gompertz distribution for analyzing the symmetric and asymmetric data. J Comput Appl Math 349: 132–141.
  • MALEKI M & NEMATOLLAHI AR. 2017. Autoregressive Models with Mixture of Scale Mixtures of Gaussian innovations. Iran J Sci Technol Trans A Sci 41(4): 1099–1107.
  • MARTÍNEZ-FLÓREZ G, BOLFARINE H & GÓMEZ HW. 2013. The alpha–power tobit model. Commun Stat Theory Methods 42(4): 633–643.
  • MATTOS T, GARAY AM & LACHOS VH. 2018. Likelihood-based inference for censored linear regression models with scale mixtures of skew-normal distributions. J Appl Stat 45(11): 2039–2066.
  • MOULTON LH & HALSEY NA. 1995. A Mixture Model With Detection Limits for Regression Analyses of Antibody Response to Vaccine. Biometrics 51: 1570–1578.
  • MROZ TA. 1987. The Sensitivity of an Empirical Model of Married Women’s Hours of Work to Economic and Statistical Assumptions. Econometrica 55(4): 765–799. doi:10.2307/1911029.
  • OLSEN RJ. 1978. Note on the Uniqueness of the Maximum Likelihood Estimator for the Tobit Model. Econometrica 46(5): 1211–1215.
  • ORTEGA EM, BOLFARINE H & PAULA GA. 2003. Influence Diagnostics in Generalized Log-Gamma Regression Models. Comput Stat Data Anal 42: 165–186.
  • PEWSEY A, GÓMEZ HW & BOLFARINE H. 2012. Likelihood–based inference for power distributions. Test 21(4): 775–789.
  • R DEVELOPMENT CORE TEAM. 2018. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. URL http://www.R-project.org ISBN 3-900051-07-0.
    » http://www.R-project.org
  • SANTOS N, LÓPEZ RG, ISRAELIAN G, MAYOR M, REBOLO R, GARCÍA-GIL A, TAORO MP DE & RANDICH S. 2002. Beryllium Abundances in Stars Hosting Giant Planets. Astron Astrophys 386: 1028–1038.
  • SCHWARZ G. 1978. Estimating the dimension of a model. Ann Stat 6(2): 461–464.
  • SUGIURA N. 1978. Further Analysis of the Data by Akaike’s Information Criterion and the Finite Corrections. Commun Stat Theory Methods 7(1): 13–26. doi:10.1080/03610927808827599.
  • TOBIN J. 1958.Estimation of relationship for limited dependent variables. Econometrica 26(1): 24–36.
  • ZHAO J & KIM HM. 2016.Power t distribution. Ommun Stat Appl Methods 23(4): 321–334.

APPENDIX A LEMMAS

Lemma 1. Let ZPT(α ,ν ), then E[ZkZ+c>0]=mk(c;ν), with

m k ( c ; ν ) = 1 1 { F T ( c ; ν ) } α F T ( c ; ν ) 1 [ F T 1 ( u ; ν ) ] k α u α 1 d u .

where FT1(;ν) is the inverse of FT(;ν).

Lemma 2. Let ZPT(α ,ν ), and define r(Z;ν)=fT(Z;ν)/FT(Z;ν). Then

  • (i)E{[r(Z;ν)]k(1+Z2ν)m/2ZnZ+c>0}=[fT(0;ν)]k1{FT(c;ν)}αakmn(c;ν), where

a k m n ( c ; ν ) = F T ( c ; ν ) 1 [ F T 1 ( u ; ν ) ] n { 1 + [ F T 1 ( u ; ν ) ] 2 ν } k ( ν + 1 ) + m 2 α u α k 1 d u
  • (ii)E{[r(Z;ν)]klog(1+Z2ν)ZnZ+c>0}=[fT(0;ν)]k1{FT(c;ν)}αakn(c;ν), where

    a k n ( c ; ν ) = F T ( c ; ν ) 1 log { 1 + [ F T 1 ( u ; ν ) ] 2 ν } [ F T 1 ( u ; ν ) ] n α u α k 1 d u

  • (iii) E{[r(Z;ν)]k[b01(Z;ν)]mZnZ+c>0}=[fT(0;ν)]k1{FT(c;ν)}αbkmn(c;ν), where

b k m n ( c ; ν ) = F T ( c ; ν ) 1 [ b 01 ( F T 1 ( u ; ν ) ) ] m [ F T 1 ( u ; ν ) ] n α u α k 1 d u
  • con b01(c;ν)=c{log(1+s2ν)}fT(s;ν)FT(c;ν)ds.

The proof of the Lemmas 1 and 2 are straightforward and they follow directly for the definition of expected value.

APPENDIX B INFORMATION MATRICES FOR PTCR MODEL

Observed Information Matrix

The elements of the observed information matrix 𝐉(𝛗) for the PTCR model are given by

j γ γ = α i = 1 n ( 1 d i ) { r ( c i ; ν ) ( ν + 1 ν ) ( 1 + c i 2 ν ) 1 c i } r ( c i ; ν ) x i x i + ( ν + 1 ν ) i = 1 n d i { ( 1 + z i 2 ν ) 1 2 ν ( 1 + z i 2 ν ) 2 z i 2 } x i x i + ( α 1 ) i = 1 n d i { r ( z i ; ν ) + ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i } r ( z i ; ν ) x i x i j γ σ = ( ν + 1 ν ) i = 1 n d i { ( 1 + z i 2 ν ) 1 2 ν ( 1 + z i 2 ν ) 2 z i 2 } y i x i ( α 1 ) i = 1 n d i { r ( z i ; ν ) + ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i } × r ( z i ; ν ) y i x i j γ ν = α 2 i = 1 n ( 1 d i ) { ( ψ ( ν + 1 2 ) ψ ( ν 2 ) ) ( 1 + R ( c i ; ν ) ) 1 ν ( 1 + c i r ( c i ; ν ) R ( c i ; ν ) ) R ( c i ; ν ) b 01 ( c i ; ν ) log ( 1 + c i 2 ν ) + ( ν + 1 ν ) ( 1 + c i 2 ν ) 1 c i 2 ν } r ( c i ; ν ) x i + 1 ν 2 i = 1 n d i { ( 1 + z i 2 ν ) 1 z i ( ν + 1 ν ) ( 1 + z i 2 ν ) 2 z i 3 } x i + α 1 2 i = 1 n d i { ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i 2 ν + b 01 ( z i ; ν ) log ( 1 + z i 2 ν ) 1 ν ( 1 z i r ( z i ; ν ) ) } r ( z i ; ν ) x i
j γ α = i = 1 n ( 1 d i ) r ( c i ; ν ) x i + i = 1 n d i r ( z i ; ν ) x i j σ σ = i = 1 n d i { 1 σ 2 + ( ν + 1 ν ) [ ( 1 + z i 2 ν ) 1 2 ν ( 1 + z i 2 ν ) 2 z i 2 ] y i 2 } + ( α 1 ) i = 1 n I i { ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i + r ( z i ; ν ) } r ( z i ; ν ) y i 2 j σ ν = 1 ν 2 i = 1 n d i { ( 1 + z i 2 ν ) 1 z i ( ν + 1 ν ) ( 1 + z i 2 ν ) 2 z i 3 } y i α 1 2 i = 1 n d i { ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i 2 ν + b 01 ( z i ν ) log ( 1 + z i 2 ν ) 1 ν ( 1 z i r ( z i ; ν ) ) } r ( z i ; ν ) y i j σ α = i = 1 n d i r ( z i ; ν ) y i j ν α = 1 2 i = 1 n ( 1 d i ) { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( c i ; ν ) c i ν r ( c i ; ν )   } × R ( c i ; ν ) + 1 2 i = 1 n d i { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( z i ; ν ) z i ν r ( z i ; ν ) } j α α = i = 1 n d i α 2
j ν ν = + α 4 i = 1 n ( 1 d i ) { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( c i ; ν ) c i ν r ( c i ν ) } 2 R ( c i ; ν ) ( 1 + R ( c i ; ν ) ) α 4 i = 1 n ( 1 d i ) { ψ 1 ( ν 2 ) ψ 1 ( ν + 1 2 ) 2 ν ( ν + 1 ) + b 01 2 ( c i ; ν ) 1 ν b 01 ( c i ; ν ) + 1 ν ( 2 b 01 ( c i ; ν ) ν + 3 ν ( ν + 1 ) ) c i r ( c i ; ν ) + 1 ν + 2 F T ( c 2 i ; ν + 2 ) F T ( c i ; ν ) b 21 ( c 2 i ; ν + 2 ) b 02 ( c i ; ν ) + 1 ν [ ( ν + 1 ν ) ( 1 + c i 2 ν ) 1 c i 2 ν log ( 1 + c i 2 ν ) + c i ν r ( c i ; ν ) ] × c i r ( c i ; ν ) } R ( c i ; ν ) 1 4 i = 1 n d i { ψ 1 ( ν + 1 2 ) ψ 1 ( ν 2 ) + 2 ν 2 4 ν 3 ( 1 + z i 2 ν ) 1 z i 2 + 2 ν 3 ( ν + 1 ν ) ( 1 + z i 2 ν ) 2 z i 4 } + α 1 4 i = 1 n d i { ψ 1 ( ν 2 ) ψ 1 ( ν + 1 2 ) 2 ν ( ν + 1 ) + b 01 2 ( z i ; ν ) 1 ν b 01 ( z i ; ν ) + 1 ν ( 2 b 01 ( z i ; ν ) ν + 3 ν ( ν + 1 ) ) z i r ( z i ; ν ) + 1 ν + 2 F T ( z 2 i ; ν + 2 ) F T ( z i ; ν ) b 21 ( z 2 i ; ν + 2 ) b 02 ( z i ; ν ) + 1 ν [ ( ν + 1 ν ) ( 1 + z i 2 ν ) 1 z i 2 ν log ( 1 + z i 2 ν ) + z i ν r ( z i ; ν ) ] z i r ( z i ; ν ) }
Expected Information Matrix

The expected information matrix Iφ is obtained by taking Iφ =E[Jφ ], which involves the calculation of truncated moments that have no closed form and must be obtained numerically. The elements I𝛗i𝛗j of the expected information matrix are given by

I γ γ = i = 1 n { r ( c i ; ν ) ( ν + 1 ν ) ( 1 + c i ν ) 1 c i } f P T ( c i ; α , ν ) x i x i + ( ν + 1 ν ) i = 1 n { a 020 ( c i ; ν ) 2 ν a 042 ( c i ; ν ) } x i x i + ( α 1 ) f T ( 0 ; ν ) i = 1 n { ( ν + 1 ν ) a 121 ( c i ; ν ) + f T ( 0 ; ν ) a 200 ( c i ; ν ) } x i x i I γ σ = 1 σ ( ν + 1 ν ) i = 1 n { a 021 ( c i ; ν ) + c i a 020 ( c i ; ν ) } x i + 2 σ ( ν + 1 ν 2 ) i = 1 n { a 043 ( c i ; ν ) + c i a 042 ( c i ; ν ) } x i ( f T ( 0 ; ν ) ) 2 ( α 1 σ ) i = 1 n { a 201 ( c i ; ν ) + c i a 200 ( c i ; ν ) } x i f T ( 0 ; ν ) ( α 1 σ ) ( ν + 1 ν ) i = 1 n { a 122 ( c i ; d ν ) + c i a 121 ( c i ; ν ) } x i I γ ν = α 2 i = 1 n { ( ψ ( ν + 1 2 ) ψ ( ν 2 ) ) ( 1 + R ( c i ; ν ) ) 1 ν ( 1 + c i r ( c i ; ν ) R ( c i ; ν ) ) R ( c i ; ν ) b 01 ( c i ; ν ) log ( 1 + c i 2 ν ) + ( ν + 1 ν ) ( 1 + c i 2 ν ) 1 c i 2 ν } f P T ( c i ; α , ν ) x i + 1 ν 2 i = 1 n { a 021 ( c i ; ν ) ( ν + 1 ν ) a 043 ( c i ; ν ) } x i + f T ( 0 ; ν ) ( α 1 2 ) i = 1 n { ( ν + 1 ν ) a 122 ( c i ; ν ) a 10 ( c i ; ν ) + b 110 ( c i ; ν ) 1 ν ( a 100 ( c i ; ν ) f T ( 0 ; ν ) a 201 ( c i ; ν ) ) } x i
I γ α = 1 α i = 1 n f P T ( c i ; α , ν ) x i + f T ( 0 ; ν ) i = 1 n a 100 ( c i ; ν ) x i I σ σ = 1 σ 2 i = 1 n { 1 + ( ν + 1 ν ) [ a 022 ( c i ; ν ) + 2 c i a 021 ( c i ; ν ) + c i 2 a 020 ( c i ; ν ) 2 ν ( a 044 ( c i ; ν ) + 2 c i a 043 ( c i ; ν ) + c i 2 a 042 ( c i ; ν ) ) ] } + f T ( 0 ; ν ) ( α 1 σ 2 ) i = 1 n { ( ν + 1 ν ) [ a 123 ( c i ; ν ) + 2 c i a 122 ( c i ; ν ) + c i 2 a 121 ( c i ; ν ) ] + f T ( 0 ; d ν ) [ a 202 ( c i ; ν ) + 2 c i a 201 ( c i ; ν ) + c i 2 a 200 ( c i ; ν ) ] } I σ α = f T ( 0 ; ν ) σ i = 1 n { a 101 ( c i ν ) + c i a 100 ( c i ; ν ) } I α α = 1 α 2 i = 1 n ( 1 { F T ( c i ; ν ) } α ) I σ ν = 1 σ ν 2 i = 1 n { a 022 ( c i ; ν ) + c i a 021 ( c i ; ν ) ( ν + 1 ν ) [ a 044 ( c i ; ν ) + c i a 043 ( c i ; ν ) ] } { F T ( c i ; ν ) } α f T ( 0 ; ν ) ( α 1 2 σ ) i = 1 n { ( ν + 1 ν 2 ) [ a 123 ( c i ; ν ) + c i a 122 ( c i ; ν ) ] a 11 ( c i ; ν ) c i a 10 ( c i ; ν ) + b 111 ( c i ; ν ) + c i b 110 ( c i ; ν ) 1 ν [ a 101 ( c i ; ν ) + c i a 100 ( c i ; ν ) ] + 1 ν [ f T ( 0 ; ν ) a 202 ( c i ; ν ) + c i f T ( 0 ; ν ) a 201 ( c i ; ν ) ] } I ν α = 1 2 α i = 1 n { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( c i ; ν ) c i ν r ( c i ; ν ) } × ( r ( c i ; ν ) ) 1 f P T ( c i ; α , ν ) 1 2 i = 1 n { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 010 ( c i ; ν ) 1 { F T ( c i ; ν ) } α f T ( 0 ; ν ) a 101 ( c i ; ν ) ν ( 1 { F T ( c i ; ν ) } α ) } ( 1 { F T ( c i ; ν ) } α )
I ν ν = α 4 i = 1 n { ψ ( ν + 1 2 ) ψ ( ν 2 ) b 01 ( c i ; ν ) c i ν r ( c i ; ν ) } 2 × R ( c i ; ν ) ( 1 + R ( c i ; ν ) ) { F T ( c i ; ν ) } α α 4 i = 1 n { ψ 1 ( ν 2 ) ψ 1 ( ν + 1 2 ) 2 ν ( ν + 1 ) + b 01 2 ( c i ; ν ) 1 ν b 01 ( c i ; ν ) + 1 ν ( 2 b 01 ( c i ; ν ) ν + 3 ν ( ν + 1 ) ) c i r ( c i ; ν ) b 02 ( c i ; ν ) + F T ( c 2 i ; ν + 2 ) b 21 ( c 2 i ; ν + 2 ) ( ν + 2 ) F T ( c 2 i ; ν ) + 1 ν [ ( ν + 1 ν ) ( 1 + c i 2 ν ) 1 c i 2 ν log ( 1 + c i 2 ν ) + c i ν r ( c i ; ν ) ] c i r ( c i ; ν ) } R ( c i ; ν ) { F T ( c i ; ν ) } α 1 4 i = 1 n { ψ 1 ( ν + 1 2 ) ψ 1 ( ν 2 ) + 2 ν 2 4 a 022 ( c i ; ν ) ν 3 ( 1 { F T ( c i ; ν ) } α ) + 2 ν 3 ( ν + 1 ν ) a 044 ( c i ; ν ) ( 1 { F T ( c i ; ν ) } α ) } ( 1 { F T ( c i ; ν ) } α ) + α 1 4 i = 1 n { ψ 1 ( ν 2 ) ψ 1 ( ν + 1 2 ) 2 ν ( ν + 1 ) + b 020 ( c i ; ν ) 1 { F T ( c i ; ν ) } α b 010 ( c i ; ν ) ν ( 1 { F T ( c i ; ν ) } α ) + 1 ν [ 2 f T ( 0 ; ν ) b 111 ( c i ; ν ) 1 { F T ( c i ν ) } α ν + 3 ν ( ν + 1 ) f T ( 0 ; ν ) a 101 ( c i ; ν ) ( 1 { F T ( c i ; ν ) } α ) ] E [ b 02 ( z i ; ν ) ] + ( 1 ν + 2 ) E [ F T ( z 2 i ; ν + 2 ) F T ( z i ; ν ) b 21 ( z 2 i ; ν + 2 ) ] + f T ( 0 ; ν ) ν ( 1 { F T ( c i ; ν ) } α ) [ ( ν + 1 ν 2 ) a 123 ( c i ; ν ) a 11 ( c i ; ν ) + 1 ν f T ( 0 ; ν ) a 202 ( c i ; ν ) ] } ( 1 { F T ( c i ; ν ) } α )

where b01, b02 and b21 are given by (8) and must be obtained numerically. The components akmn, akn and bkmn, are given in the Lemman 2.

APPENDIX C R CODE TO FIT THE PTCR MODEL

Publication Dates

  • Publication in this collection
    11 Oct 2021
  • Date of issue
    2021

History

  • Received
    11 Aug 2019
  • Accepted
    22 Oct 2019
Academia Brasileira de Ciências Rua Anfilófio de Carvalho, 29, 3º andar, 20030-060 Rio de Janeiro RJ Brasil, Tel: +55 21 3907-8100 - Rio de Janeiro - RJ - Brazil
E-mail: aabc@abc.org.br