The Generalized Odd Lindley-G Family: Properties and Applications

Abstract

We introduce a new class of continuous distributions called the generalized odd Lindley-G family. Four special models of the new family are provided. Some explicit expressions for the quantile and generating functions, ordinary and incomplete moments, order statistics and Rényi and Shannon entropies are derived. The maximum likelihood method is used for estimating the model parameters. The flexibility of the generated family is illustrated by means of two applications to real data sets.

Key words
generating function; Lindley distribution; maximum likelihood; order statistic; T-X family


INTRODUCTION

In many applied sciences such as medicine, engineering and finance, among others, modeling and analyzing lifetime data are crucial. Several lifetime models have been adopted to model different types of survival data. The quality of the procedures used in a statistical analysis depends heavily on the generated family of distributions and considerable effort has been directed to define new statistical models. However, there still remain many important problems involving real data, which do not follow any of the popular statistical models. So, the procedure of expanding a family of distributions by adding new shape parameters is well-known in the statistical literature.

The role of the extra shape parameters is to introduce skewness and to vary tail weights. Further, several models have been constructed by extending some useful lifetime distributions and have been investigated with respect to different characteristics. The generalized distributions provide more flexibility to model both monotonic and non-monotonic failure rates even though the baseline failure rate may be monotonic.

A large number of models have been proposed to model lifetime data. Lindley (1958) proposed the Lindley (Li) distribution as a mixture of exponential and gamma distributions to analyze failure time data. The Li distribution is motivated by its ability to model failure time data with increasing, decreasing, unimodal and bathtub shaped hazard rates.

The properties, estimation and inference of the Li distribution are described in the literature as follows. Hussain (2006) used this distribution for studying stress–strength reliability modeling. Ghitany et al. (2008) provided a comprehensive treatment of its mathematical properties showing that it may provide a better fit than the exponential distribution.

Many authors developed generalizations of the Li distribution. For example, Sankaran (1970) introduced the discrete Poisson–Li, Zakerzadeh and Dolati (2009) defined the generalized Li, Zamani and Ismail (2010) proposed the negative binomial-Li, Mahmoudi and Zakerzadeh (2010) investigated the generalized Poisson–Li, Nadarajah et al. (2011) introduced a two-parameter generalized Li as an alternative to the gamma, lognormal, Weibull and exponentiated exponential distributions, Bakouch et al. (2012) proposed the extended Li motivated by its ability to model lifetime data with different shapes for the failure rate, Ghitany et al. (2013) pioneered the power Li and Nedjar and Zeghdoudi (2016) studied the gamma Li distribution.

In this paper, based on the T-X family pioneered by Alzaatreh et al. (2013) and the Li distribution, we construct the generalized odd Lindley-G (GOLi-G for short) family, which extends the OLi-G class (Gomes-Silva et al. 2017), and provide some of its mathematical properties. In fact, the new generator of distributions is motivated by its ability to model lifetime data with increasing, decreasing, constant, unimodal and bathtub shaped failure rates. Further, the special models of this family are shown to provide better fits than other competitive models generated by other well-known families in the literature.

The cumulative distribution function (cdf) and probability density function (pdf) of the GOLi-G family with two additional shape parameters α>0 and λ>0 are defined from a baseline cdf G(x;𝛗) by

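$$F(x;\alpha,\lambda,\boldsymbol{\varphi})=\frac{\lambda^{2}}{1+\lambda}\int_{0}^{\frac{G(x;\boldsymbol{\varphi})^{\alpha}}{1-G(x;\boldsymbol{\varphi})^{\alpha}}}(1+t)\,e^{-\lambda t}\,dt=1-\left\{1+\frac{\lambda\,G(x;\boldsymbol{\varphi})^{\alpha}}{(1+\lambda)\left[1-G(x;\boldsymbol{\varphi})^{\alpha}\right]}\right\}\exp\left[-\frac{\lambda\,G(x;\boldsymbol{\varphi})^{\alpha}}{1-G(x;\boldsymbol{\varphi})^{\alpha}}\right] \tag{1}$$

and

$$f(x;\alpha,\lambda,\boldsymbol{\varphi})=\frac{\alpha\,\lambda^{2}\,g(x;\boldsymbol{\varphi})\,G(x;\boldsymbol{\varphi})^{\alpha-1}}{(1+\lambda)\left[1-G(x;\boldsymbol{\varphi})^{\alpha}\right]^{3}}\exp\left[-\frac{\lambda\,G(x;\boldsymbol{\varphi})^{\alpha}}{1-G(x;\boldsymbol{\varphi})^{\alpha}}\right], \tag{2}$$

respectively, where $g(x;\boldsymbol{\varphi})=dG(x;\boldsymbol{\varphi})/dx$ and $\boldsymbol{\varphi}$ is the vector of baseline parameters. Henceforth, a random variable with density (2) is denoted by $X\sim\text{GOLi-G}(\alpha,\lambda,\boldsymbol{\varphi})$. For $\alpha=1$, we obtain as a special case the OLi-G class.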

An interpretation of the GOLi-G family (1) can be given as follows. Let $Y$ be a random variable describing a stochastic system with cdf $G(x)^{\alpha}$ (for $\alpha>0$). If the random variable $X$ represents the odds ratio, the risk that the system following the lifetime $Y$ will not be working at time $x$ is given by $G(x)^{\alpha}/[1-G(x)^{\alpha}]$. If we are interested in modeling the randomness of the odds ratio by the Li pdf, $r(t)=\lambda^{2}(1+t)\,e^{-\lambda t}/(1+\lambda)$ (for $t>0$), then the cdf of $X$ is given by

$$\Pr(X\le x)=R\left(\frac{G(x)^{\alpha}}{1-G(x)^{\alpha}}\right),$$

where $R(\cdot)$ is the Li cdf corresponding to $r(t)$. This is exactly the cdf (1) of the GOLi-G family.

Henceforth, we omit the dependence on the model parameters. The hazard rate function (hrf) of $X$, $\tau(x)=f(x)/[1-F(x)]$, follows from (1) and (2) as

$$\tau(x)=\frac{\alpha\,\lambda^{2}\,g(x)\,G(x)^{\alpha-1}}{\left[1-G(x)^{\alpha}\right]^{2}\left[1+\lambda-G(x)^{\alpha}\right]}. \tag{3}$$
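The three functions above can be evaluated for any baseline once $G$ and $g$ are available as callables. The following is a minimal sketch under that assumption; the function names and the unit-exponential baseline are illustrative choices, not part of the paper. The hrf is computed directly from its definition $f(x)/[1-F(x)]$.

```python
import math

def goli_cdf(x, alpha, lam, G):
    """cdf (1) of the GOLi-G family; G is the baseline cdf (a callable)."""
    u = G(x) ** alpha
    t = u / (1.0 - u)                      # odds of the exponentiated baseline
    return 1.0 - (1.0 + lam * t / (1.0 + lam)) * math.exp(-lam * t)

def goli_pdf(x, alpha, lam, G, g):
    """pdf (2) of the GOLi-G family; g is the baseline pdf (a callable)."""
    Gx = G(x)
    u = Gx ** alpha
    return (alpha * lam ** 2 * g(x) * Gx ** (alpha - 1.0)
            / ((1.0 + lam) * (1.0 - u) ** 3)
            * math.exp(-lam * u / (1.0 - u)))

def goli_hrf(x, alpha, lam, G, g):
    """hrf obtained from its definition f(x) / [1 - F(x)]."""
    return goli_pdf(x, alpha, lam, G, g) / (1.0 - goli_cdf(x, alpha, lam, G))

# Illustrative baseline (an assumption of this sketch): unit exponential.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_pdf(x):
    return math.exp(-x)
```

A call such as `goli_hrf(0.7, 1.5, 0.8, exp_cdf, exp_pdf)` evaluates the hazard at a single point. Far in the right tail, $1-G(x)^{\alpha}$ can underflow to zero, so a production implementation would need a guard there.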

The rest of the paper is organized as follows. In Section 2, we present four special models and plots of their pdfs and hazard rate functions (hrfs). In Section 3, we provide a very useful linear representation for the density function of X . In Section 4, we derive some of its general mathematical properties including quantile and generating functions, asymptotics, ordinary and incomplete moments, order statistics and entropies. Maximum likelihood estimation of the model parameters is addressed in Section 5. Section 6 is devoted to simulation results to assess the performance of the maximum likelihood estimation of the unknown parameters of the generalized odd Lindley Weibull (GOLiW) distribution. In Section 7, we provide two applications to real data to illustrate the flexibility of the special models of the new family. Finally, we offer some concluding remarks in Section 8.

FOUR SPECIAL GOLi-G MODELS

In this section, we provide four special models of the GOLi-G family. The pdf (2) will be most tractable when the cdf G(x) and the pdf g(x) have simple analytic expressions.

THE GOLiW DISTRIBUTION

Consider the cdf (for $x>0$) $G(x)=1-\exp(-a\,x^{b})$ of the Weibull distribution with positive parameters $a$ and $b$. Then, the pdf of the GOLiW model is given by

$$f(x)=\frac{\alpha\,a\,b\,\lambda^{2}\,x^{b-1}\exp(-a\,x^{b})\left[1-\exp(-a\,x^{b})\right]^{\alpha-1}}{(1+\lambda)\left\{1-\left[1-\exp(-a\,x^{b})\right]^{\alpha}\right\}^{3}}\exp\left\{-\frac{\lambda\left[1-\exp(-a\,x^{b})\right]^{\alpha}}{1-\left[1-\exp(-a\,x^{b})\right]^{\alpha}}\right\}.$$

For $b=1$, we obtain the GOLi-exponential (GOLiE) distribution. For $b=2$, we obtain the GOLi-Rayleigh (GOLiR) distribution. The GOLiW reduces to the OLiW when $\alpha=1$. We have the OLiE and OLiR when $\alpha=b=1$ and when $\alpha=1$ and $b=2$, respectively. Some possible shapes of the GOLiW density and hazard functions are displayed in Figure 1.
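As a quick sanity check on the GOLiW density, it can be coded directly and integrated numerically. This is a sketch only: the function names, the midpoint rule and the parameter values in the usage note are illustrative assumptions, not taken from the paper.

```python
import math

def goliw_pdf(x, alpha, lam, a, b):
    """GOLiW density: GOLi-G pdf with Weibull baseline G(x) = 1 - exp(-a x^b)."""
    s = math.exp(-a * x ** b)               # Weibull survival function
    G = 1.0 - s
    u = G ** alpha
    if u >= 1.0:                            # far right tail: the density underflows to 0
        return 0.0
    g = a * b * x ** (b - 1.0) * s          # Weibull pdf
    return (alpha * lam ** 2 * g * G ** (alpha - 1.0)
            / ((1.0 + lam) * (1.0 - u) ** 3)
            * math.exp(-lam * u / (1.0 - u)))

def midpoint_integral(f, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))
```

For example, `midpoint_integral(lambda x: goliw_pdf(x, 2.0, 0.5, 2.5, 3.0), 0.0, 2.0)` should be very close to one, because for these values essentially all of the probability mass lies below $x=2$.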

Figure 1
Plots of the GOLiW pdf (a) and hrf (b) for some parameter values.

THE GOLi-BURR XII (GOLiBXII) DISTRIBUTION

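The cdf (for $x>0$) of the Burr XII (BXII) distribution with positive parameters $a$ and $b$ is $G(x)=1-(1+x^{a})^{-b}$. Then, the GOLiBXII density reduces to

$$f(x)=\frac{\alpha\,a\,b\,\lambda^{2}}{1+\lambda}\,\frac{x^{a-1}(1+x^{a})^{-b-1}}{\left\{1-\left[1-(1+x^{a})^{-b}\right]^{\alpha}\right\}^{3}}\left[1-(1+x^{a})^{-b}\right]^{\alpha-1}\exp\left\{-\frac{\lambda\left[1-(1+x^{a})^{-b}\right]^{\alpha}}{1-\left[1-(1+x^{a})^{-b}\right]^{\alpha}}\right\}.$$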

For a=1 and b=1 , we obtain the one-parameter GOLi Lomax (GOLiLx) and one-parameter GOLi log-logistic (GOLiLL) distributions, respectively. The cases α=a=1 and α=b=1 refer to the OLiLx and OLiLL distributions, respectively. Plots of the density and hazard functions of the GOLiBXII distribution are displayed in Figure 2.

Figure 2
Plots of the GOLiBXII pdf (a) and hrf (b) for some parameter values.

THE GOLi-LOMAX (GOLiLx) DISTRIBUTION

The cdf (for $x>0$) of the Lomax (Lx) distribution with positive parameters $a$ and $b$ is $G(x)=1-[1+(x/a)]^{-b}$. Then, the GOLiLx pdf becomes

$$f(x)=\frac{\alpha\,b\,a^{-1}\lambda^{2}}{1+\lambda}\,\frac{[1+(x/a)]^{-b-1}}{\left(1-\left\{1-[1+(x/a)]^{-b}\right\}^{\alpha}\right)^{3}}\left\{1-[1+(x/a)]^{-b}\right\}^{\alpha-1}\exp\left(-\frac{\lambda\left\{1-[1+(x/a)]^{-b}\right\}^{\alpha}}{1-\left\{1-[1+(x/a)]^{-b}\right\}^{\alpha}}\right).$$

Plots of the density and hazard functions of the GOLiLx distribution are displayed in Figure 3.

Figure 3
Plots of the GOLiLx pdf (a) and hrf (b) for some parameter values.

THE GOLi-LOG-LOGISTIC (GOLiLL) DISTRIBUTION

Consider the cdf (for $x>0$) $G(x)=1-[1+(x/a)^{b}]^{-1}$ of the log-logistic (LL) distribution with positive parameters $a$ and $b$. The GOLiLL density is given by

$$f(x)=\frac{\alpha\,b\,\lambda^{2}\,x^{b-1}}{(1+\lambda)\,a^{b}}\,\frac{[1+(x/a)^{b}]^{-2}}{\left(1-\left\{1-[1+(x/a)^{b}]^{-1}\right\}^{\alpha}\right)^{3}}\left\{1-[1+(x/a)^{b}]^{-1}\right\}^{\alpha-1}\exp\left(-\frac{\lambda\left\{1-[1+(x/a)^{b}]^{-1}\right\}^{\alpha}}{1-\left\{1-[1+(x/a)^{b}]^{-1}\right\}^{\alpha}}\right).$$

Plots of the density and hazard functions for the GOLiLL distribution for selected parameter values are displayed in Figure 4.

Figure 4
Plots of the GOLiLL pdf (a) and hrf (b) for some parameter values.

LINEAR REPRESENTATION

In this section, we provide a useful linear representation for the GOLi-G density, which can be written as

$$f(x)=\frac{\alpha\,\lambda^{2}\,g(x)\,G(x)^{\alpha-1}}{(1+\lambda)\left[1-G(x)^{\alpha}\right]^{3}}\exp\left[-\frac{\lambda\,G(x)^{\alpha}}{1-G(x)^{\alpha}}\right].$$

Using the exponential series, $e^{-y}=\sum_{j=0}^{\infty}(-1)^{j}y^{j}/j!$, we can write

$$f(x)=\frac{\alpha\,g(x)}{1+\lambda}\sum_{j=0}^{\infty}\frac{(-1)^{j}\lambda^{j+2}}{j!}\,G(x)^{(j+1)\alpha-1}\underbrace{\left[1-G(x)^{\alpha}\right]^{-(j+3)}}_{A}. \tag{4}$$

Consider the power series

$$(1-z)^{-q}=\sum_{n=0}^{\infty}\frac{\Gamma(q+n)}{\Gamma(q)\,n!}\,z^{n},\qquad |z|<1,\ q>0. \tag{5}$$

Applying this series to the quantity $A$ (with $q=j+3$), we obtain

$$A=\sum_{k=0}^{\infty}\frac{(j+k+2)!}{k!\,(j+2)!}\,G(x)^{k\alpha}. \tag{6}$$

Substituting (6) in equation (4), we can write

$$f(x)=\frac{\alpha\,g(x)}{1+\lambda}\sum_{j,k=0}^{\infty}\frac{(-1)^{j}\lambda^{j+2}\,(j+k+2)!}{j!\,k!\,(j+2)!}\,G(x)^{(k+j+1)\alpha-1}. \tag{7}$$

Then, the GOLi-G density can be expressed as a linear combination of exponentiated-G (Exp-G) densities

$$f(x)=\sum_{j,k=0}^{\infty}d_{j,k}\,h_{(k+j+1)\alpha}(x), \tag{8}$$

where

$$d_{j,k}=\frac{(-1)^{j}\lambda^{j+2}\,(j+k+2)!}{(1+\lambda)\,(k+j+1)\,j!\,k!\,(j+2)!}$$

and $h_{\delta}(x)=\delta\,g(x)\,G(x)^{\delta-1}$ is the Exp-G density with power parameter $\delta>0$. Thus, several mathematical properties of the GOLi-G family can be determined from those of the Exp-G family. Equation (8) is the main result of this section.

The cdf of the GOLi-G family can also be expressed as a linear combination of Exp-G cdfs. By integrating (8), we have

$$F(x)=\sum_{j,k=0}^{\infty}d_{j,k}\,H_{(k+j+1)\alpha}(x),$$

where $H_{\delta}(x)=G(x)^{\delta}$ is the cdf of the Exp-G family with power parameter $\delta$.
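The expansion behind the linear representation can be checked numerically. Writing $u=G(x)^{\alpha}$, the density is the common factor $\alpha\lambda^{2}g(x)G(x)^{\alpha-1}/(1+\lambda)$ times the kernel $(1-u)^{-3}\exp[-\lambda u/(1-u)]$. Combining the exponential series with the power series (5) expands this kernel as a double series in powers of $u$. The sketch below compares a truncated double series with the closed form; the truncation points and function names are assumptions chosen for illustration.

```python
import math

def closed_form_kernel(u, lam):
    """(1 - u)^(-3) * exp(-lam * u / (1 - u)), with u = G(x)^alpha in (0, 1)."""
    return math.exp(-lam * u / (1.0 - u)) / (1.0 - u) ** 3

def series_kernel(u, lam, j_max=40, k_max=200):
    """Truncated double series: exponential series in j, power series (5) in k."""
    total = 0.0
    for j in range(j_max):
        cj = (-lam) ** j / math.factorial(j)          # exponential-series coefficient
        for k in range(k_max):
            # (j + k + 2)! / (k! (j + 2)!) equals the binomial coefficient C(j + k + 2, k)
            total += cj * math.comb(j + k + 2, k) * u ** (j + k)
    return total
```

For moderate values such as `u = 0.3` and `lam = 1.5` the two functions agree to many decimal places. Convergence slows as $u$ approaches one, so more terms would be needed in the right tail.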

STRUCTURAL PROPERTIES

In this section, we obtain some structural properties of the GOLi-G family including quantile and generating functions, asymptotics, ordinary and incomplete moments, order statistics and entropies.

QUANTILE FUNCTION

Let $Q_{G}(\cdot)=G^{-1}(\cdot)$ be the quantile function (qf) of the parent $G$. In this section, we provide two algorithms for simulating the GOLi-G model.

The first algorithm is based on generating random data from the Li distribution using the exponential-gamma mixture.

  • Algorithm 1 (Mixture form of the Li distribution)

  1. Generate $U_{i}\sim\text{Uniform}(0,1)$, $i=1,\ldots,n$;

  2. Generate $V_{i}\sim\text{Exponential}(\lambda)$, $i=1,\ldots,n$;

  3. Generate $W_{i}\sim\text{Gamma}(2,\lambda)$, $i=1,\ldots,n$;

  4. If $U_{i}\le\lambda/(\lambda+1)$, set $X_{i}=G^{-1}\left([V_{i}/(1+V_{i})]^{1/\alpha}\right)$; otherwise, set $X_{i}=G^{-1}\left([W_{i}/(1+W_{i})]^{1/\alpha}\right)$.
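Algorithm 1 can be sketched in a few lines. The sketch assumes that Gamma(2, λ) is parameterized by shape 2 and rate λ, so the scale passed to the generator is 1/λ. It also draws only the mixture component that is actually needed for each observation, which gives the same distribution as generating both $V_{i}$ and $W_{i}$. The function names and the unit-exponential baseline quantile are illustrative.

```python
import math
import random

def rgoli_mixture(n, alpha, lam, G_inv, rng=random):
    """Draw n GOLi-G variates via the exponential-gamma mixture of the Li law."""
    sample = []
    for _ in range(n):
        if rng.random() <= lam / (lam + 1.0):
            t = rng.expovariate(lam)               # Exponential with rate lam
        else:
            t = rng.gammavariate(2.0, 1.0 / lam)   # Gamma with shape 2 and scale 1/lam
        # Map the Li variate t (the odds) back to the baseline scale.
        sample.append(G_inv((t / (1.0 + t)) ** (1.0 / alpha)))
    return sample

def exp_qf(u):
    """Quantile function of the unit exponential baseline (illustrative choice)."""
    return -math.log1p(-u)
```

Passing a seeded generator, for example `rgoli_mixture(1000, 2.0, 1.5, exp_qf, random.Random(1))`, makes the draws reproducible.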

The second algorithm is based on generating random data by inverting (1).

  • Algorithm 2 (Inverse cdf)

  1. Generate $U_{i}\sim\text{Uniform}(0,1)$, $i=1,\ldots,n$;

  2. Set (for $i=1,\ldots,n$)

$$X_{i}=G^{-1}\left(\left[\frac{\lambda+W_{-1}\left[-(1-U_{i})(\lambda+1)\exp(-\lambda-1)\right]+1}{W_{-1}\left[-(1-U_{i})(\lambda+1)\exp(-\lambda-1)\right]+1}\right]^{1/\alpha}\right), \tag{9}$$

where $W_{-1}(\cdot)$ is the negative branch of the Lambert $W$ function. The branches of this function are defined by $x=W(x\,e^{x})$. It is a two-valued function on the interval $[-1/e,0)$. For $W(x)\le-1$, the function is denoted $W_{-1}(x)$ and is called the negative branch. For $W(x)>-1$, the function is called the principal branch of the $W$ function. The Lambert function cannot be expressed in terms of elementary functions.
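Algorithm 2 only needs the negative branch of the Lambert function on $[-1/e,0)$. The sketch below evaluates it by plain bisection on $w\,e^{w}=z$ for $w\le-1$. Bisection is a simple stand-in chosen here for robustness and is an assumption of this example, not the method used by the authors. The function names are illustrative, and the argument is written with $p$ in place of $U_{i}$.

```python
import math

def lambert_w_minus1(z, iters=200):
    """Negative branch W_{-1}(z) for -1/e <= z < 0, found by bisection."""
    if not (-1.0 / math.e <= z < 0.0):
        raise ValueError("z must lie in [-1/e, 0)")
    lo, hi = -2.0, -1.0
    # w * exp(w) decreases from 0 to -1/e as w goes from -infinity to -1.
    while lo * math.exp(lo) < z:
        lo *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < z:
            hi = mid        # mid lies to the right of the root
        else:
            lo = mid
    return 0.5 * (lo + hi)

def goli_qf(p, alpha, lam, G_inv):
    """GOLi-G quantile at 0 < p < 1, following the inverse-cdf expression (9)."""
    w = lambert_w_minus1((p - 1.0) * (lam + 1.0) * math.exp(-lam - 1.0))
    return G_inv(((lam + w + 1.0) / (w + 1.0)) ** (1.0 / alpha))
```

Feeding uniform draws into `goli_qf` yields GOLi-G variates, and evaluating the cdf (1) at the result should return the original probability.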

ASYMPTOTICS

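Let $a=\inf\{x\,|\,G(x)>0\}$. Then, since $G(x)^{\alpha}/[1-G(x)^{\alpha}]\sim G(x)^{\alpha}$ and $R(t)\sim\lambda^{2}t/(1+\lambda)$ as $t\to0$, the asymptotics of equations (1), (2) and (3) when $x\to a$ are given by

$$F(x)\sim\frac{\lambda^{2}}{1+\lambda}\,G(x)^{\alpha},\qquad f(x)\sim\frac{\alpha\,\lambda^{2}}{1+\lambda}\,g(x)\,G(x)^{\alpha-1},\qquad \tau(x)\sim\frac{\alpha\,\lambda^{2}}{1+\lambda}\,g(x)\,G(x)^{\alpha-1}\qquad\text{as } x\to a.$$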

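The asymptotics of equations (1), (2) and (3) when $x\to\infty$ are given, with $\bar{G}(x)=1-G(x)$, by

$$1-F(x)\sim\frac{\lambda}{\alpha\,\bar{G}(x)}\exp\left[-\frac{\lambda}{\alpha\,\bar{G}(x)}\right],\qquad f(x)\sim\frac{\lambda^{2}\,g(x)}{\alpha^{2}(1+\lambda)\,\bar{G}(x)^{3}}\exp\left[-\frac{\lambda}{\alpha\,\bar{G}(x)}\right],\qquad \tau(x)\sim\frac{\lambda\,g(x)}{\alpha^{2}(1+\lambda)\,\bar{G}(x)^{2}}\qquad\text{as } x\to\infty.$$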

GENERATING FUNCTION

Here, we obtain the moment generating function (mgf) $M(t)=E(e^{tX})$ of $X$. Henceforth, let $T_{(k+j+1)\alpha}$ denote a random variable having the Exp-G distribution with power parameter $(k+j+1)\alpha$.

We can write from equation (8)

$$M(t)=\sum_{j,k=0}^{\infty}d_{j,k}\,M_{(k+j+1)\alpha}(t)=\sum_{j,k=0}^{\infty}d_{j,k}\,\tau_{k+j}(t),$$

where $\tau_{k+j}(t)=(k+j+1)\alpha\int_{0}^{1}\exp[t\,Q_{G}(u)]\,u^{(k+j+1)\alpha-1}\,du$ and $M_{(k+j+1)\alpha}(t)$ is the mgf of $T_{(k+j+1)\alpha}$. Hence, $M(t)$ can be determined from the Exp-G generating function. If $M_{(k+j+1)\alpha}(t)$ has an explicit expression, $M(t)$ will also have a closed form. This is possible for some baselines such as the exponential, Lomax, normal and gamma, among others.

We define the cumulant generating function (cgf) of $X$ by $K(t)=\log[M(t)]$. The saddle-point approximation is one of the main applications of the cgf in Statistics and provides highly accurate approximation formulae for the densities of the sum and mean of independent identically distributed (iid) random variables. Let $X_{1},\ldots,X_{n}$ be iid random variables having common GOLi-G cgf $K(t)$. We define $S_{n}=\sum_{j=1}^{n}X_{j}$, obtain $\hat{\lambda}$ from the (usual nonlinear) equation $K'(\hat{\lambda})=x/n$ and let $y=[x-nK'(\lambda)]/\sqrt{nK''(\lambda)}$. The density function of $\bar{X}_{n}=S_{n}/n$ follows from Daniels' saddle-point approximation as

$$f_{\bar{X}_{n}}(y)\approx\left\{\frac{n}{2\pi K''(\hat{\lambda})}\right\}^{1/2}\exp\left[n\left\{K(\hat{\lambda})-\hat{\lambda}\,y\right\}\right].$$

MOMENTS

The $r$th ordinary moment of $X$, say $\mu'_{r}$, follows from (8) as

$$\mu'_{r}=E(X^{r})=\sum_{j,k=0}^{\infty}d_{j,k}\,E\left(T_{(k+j+1)\alpha}^{r}\right). \tag{10}$$

The central moments of $X$ are easily obtained from (10).

The cumulants ($\chi_{n}$) of $X$ follow recursively from

$$\chi_{n}=\mu'_{n}-\sum_{r=0}^{n-1}\binom{n-1}{r-1}\chi_{r}\,\mu'_{n-r},$$

where $\chi_{1}=\mu'_{1}$, $\chi_{2}=\mu'_{2}-\mu'^{2}_{1}$, $\chi_{3}=\mu'_{3}-3\mu'_{2}\mu'_{1}+2\mu'^{3}_{1}$, etc. The measures of skewness and kurtosis can be calculated from the ordinary moments using well-known relationships.
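The recursion for the cumulants is easy to apply once the ordinary moments are available. A minimal sketch follows, with the function name as an illustrative assumption. The sum starts at $r=1$ because the $r=0$ term vanishes, since the binomial coefficient with lower index $-1$ is zero.

```python
from math import comb

def cumulants_from_moments(mu):
    """Return [chi_1, ..., chi_n] from ordinary moments mu = [mu'_1, ..., mu'_n]."""
    chi = []
    for n in range(1, len(mu) + 1):
        # chi_n = mu'_n - sum_{r=1}^{n-1} C(n-1, r-1) * chi_r * mu'_{n-r}
        correction = sum(comb(n - 1, r - 1) * chi[r - 1] * mu[n - r - 1]
                         for r in range(1, n))
        chi.append(mu[n - 1] - correction)
    return chi
```

As a check with a known case, the unit exponential distribution has ordinary moments $\mu'_{r}=r!$ and cumulants $(r-1)!$, which the recursion reproduces.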

The $s$th incomplete moment of $X$, say $\varphi_{s}(t)=\int_{-\infty}^{t}x^{s}f(x)\,dx$, can be expressed from (8) as

$$\varphi_{s}(t)=\sum_{j,k=0}^{\infty}d_{j,k}\int_{-\infty}^{t}x^{s}\,h_{(k+j+1)\alpha}(x)\,dx=\sum_{j,k=0}^{\infty}d_{j,k}\,v^{(s)}_{(k+j+1)\alpha}(t), \tag{11}$$

where $v^{(s)}_{(k+j+1)\alpha}(t)=(k+j+1)\alpha\int_{0}^{G(t)}Q_{G}(u)^{s}\,u^{(k+j+1)\alpha-1}\,du$ can be computed numerically from the baseline qf $Q_{G}(u)$.

For $s=1$, we obtain the first incomplete moment $\varphi_{1}(t)$ from (11). It can be applied to construct Bonferroni and Lorenz curves defined for a given probability $\pi$ by $B(\pi)=\varphi_{1}(q)/(\pi\mu'_{1})$ and $L(\pi)=\varphi_{1}(q)/\mu'_{1}$, respectively, where $\mu'_{1}$ is given by (10) with $r=1$ and $q=Q(\pi)$ is the qf of $X$ at $\pi$ obtained from (9). These curves are very useful in economics, reliability, demography, insurance and medicine.
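The Bonferroni and Lorenz curves can also be evaluated by integrating the first incomplete moment numerically, without the series in (11). The sketch below assumes a density supported on $x>0$, as is the case for lifetime data, so the lower limit of integration is zero. The function names and the midpoint rule are illustrative choices.

```python
import math

def first_incomplete_moment(pdf, q, n=20000):
    """phi_1(q) = integral from 0 to q of x * f(x) dx, by the midpoint rule."""
    h = q / n
    return h * sum((i + 0.5) * h * pdf((i + 0.5) * h) for i in range(n))

def bonferroni_lorenz(pdf, qf, mean, pi):
    """Return (B(pi), L(pi)) for a density with quantile function qf and given mean."""
    q = qf(pi)
    phi1 = first_incomplete_moment(pdf, q)
    return phi1 / (pi * mean), phi1 / mean
```

For the unit exponential distribution the Lorenz curve has the closed form $L(\pi)=\pi+(1-\pi)\log(1-\pi)$, which offers a convenient check before the functions are applied to a GOLi-G density.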

ORDER STATISTICS

Order statistics make their appearance in many areas of statistical theory and practice. Let $X_{1},\ldots,X_{n}$ be a random sample from the GOLi-G family. The pdf of $X_{i:n}$ can be written as

$$f_{i:n}(x)=\frac{f(x)}{B(i,n-i+1)}\sum_{j=0}^{n-i}(-1)^{j}\binom{n-i}{j}F(x)^{j+i-1}, \tag{12}$$

where $B(\cdot,\cdot)$ is the beta function.
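Equation (12) is the usual order-statistic density with $[1-F(x)]^{n-i}$ expanded by the binomial theorem. The following sketch checks this identity numerically for given values of $f(x)$ and $F(x)$; the function names are illustrative assumptions.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b) computed through log-gamma for numerical stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def order_stat_pdf_series(fx, Fx, i, n):
    """Density of X_{i:n} via the alternating binomial sum in (12)."""
    total = sum((-1) ** j * math.comb(n - i, j) * Fx ** (j + i - 1)
                for j in range(n - i + 1))
    return fx * total / beta_fn(i, n - i + 1)

def order_stat_pdf_direct(fx, Fx, i, n):
    """Density of X_{i:n} in its compact form f F^(i-1) (1-F)^(n-i) / B(i, n-i+1)."""
    return fx * Fx ** (i - 1) * (1.0 - Fx) ** (n - i) / beta_fn(i, n - i + 1)
```

Both functions take the values of the pdf and cdf at the point of interest, so they can be combined with any GOLi-G special model.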

Based on equation (1), we can write

$$F(x)^{j+i-1}=\left\{1-\left[1+\frac{\lambda\,G(x)^{\alpha}}{(1+\lambda)\left[1-G(x)^{\alpha}\right]}\right]\exp\left[-\frac{\lambda\,G(x)^{\alpha}}{1-G(x)^{\alpha}}\right]\right\}^{j+i-1}.$$

After applying the binomial series, we have

$$F(x)^{j+i-1}=\sum_{l=0}^{j+i-1}(-1)^{l}\binom{j+i-1}{l}\left[1+\frac{\lambda\,G(x)^{\alpha}}{(1+\lambda)\left[1-G(x)^{\alpha}\right]}\right]^{l}\exp\left[-\frac{\lambda\,l\,G(x)^{\alpha}}{1-G(x)^{\alpha}}\right]. \tag{13}$$

Using (2) and (13), we can write

$$f(x)\,F(x)^{j+i-1}=\frac{\alpha\,\lambda^{2}\,g(x)\,G(x)^{\alpha-1}}{(1+\lambda)\left[1-G(x)^{\alpha}\right]^{3}}\sum_{l=0}^{j+i-1}(-1)^{l}\binom{j+i-1}{l}\left[1+\frac{\lambda\,G(x)^{\alpha}}{(1+\lambda)\left[1-G(x)^{\alpha}\right]}\right]^{l}\exp\left[-\frac{\lambda\,(l+1)\,G(x)^{\alpha}}{1-G(x)^{\alpha}}\right].$$

Using the exponential series, we obtain

$$f(x)\,F(x)^{j+i-1}=\frac{\alpha\,\lambda^{2}\,g(x)}{1+\lambda}\sum_{l=0}^{j+i-1}\sum_{s=0}^{\infty}\frac{(-1)^{l+s}}{s!}\binom{j+i-1}{l}\frac{[\lambda(l+1)]^{s}}{\left[1-G(x)^{\alpha}\right]^{s+3}}\left[1+\frac{\lambda\,G(x)^{\alpha}}{(1+\lambda)\left[1-G(x)^{\alpha}\right]}\right]^{l}G(x)^{(s+1)\alpha-1}.$$

The Taylor series for $z^{\beta}$ defined by

$$z^{\beta}=\sum_{k=0}^{\infty}\frac{(\beta)_{k}}{k!}\,(z-1)^{k}, \tag{14}$$

holds, where $(\beta)_{k}=\beta(\beta-1)\cdots(\beta-k+1)$ is the descending factorial.

Applying (14) to the bracketed term raised to the power $l$ gives

$$\left[1+\frac{\lambda\,G(x)^{\alpha}}{(1+\lambda)\left[1-G(x)^{\alpha}\right]}\right]^{l}=\sum_{h=0}^{\infty}\frac{(l)_{h}\,\lambda^{h}\,G(x)^{h\alpha}}{h!\,(1+\lambda)^{h}\left[1-G(x)^{\alpha}\right]^{h}},$$

where the sum terminates at $h=l$ because $l$ is a non-negative integer.

Then, the last equation reduces to

$$f(x)\,F(x)^{j+i-1}=\sum_{l=0}^{j+i-1}\sum_{s,h=0}^{\infty}\frac{(-1)^{l+s}\lambda^{s+h+2}(l+1)^{s}\,(l)_{h}}{(1+\lambda)^{h+1}\,s!\,h!}\binom{j+i-1}{l}\frac{\alpha\,g(x)\,G(x)^{(s+h+1)\alpha-1}}{\left[1-G(x)^{\alpha}\right]^{s+h+3}}.$$

Applying the power series (5) to $\left[1-G(x)^{\alpha}\right]^{-(s+h+3)}$ gives

$$f(x)\,F(x)^{j+i-1}=\sum_{l=0}^{j+i-1}\sum_{s,h,k=0}^{\infty}\frac{(-1)^{l+s}\lambda^{s+h+2}(l+1)^{s}\,(s+h+k+2)!\,(l)_{h}}{(1+\lambda)^{h+1}\,s!\,h!\,k!\,(s+h+2)!}\binom{j+i-1}{l}\alpha\,g(x)\,G(x)^{(k+s+h+1)\alpha-1}.$$

Inserting the last equation in (12), the pdf of $X_{i:n}$ can be expressed as

$$f_{i:n}(x)=\sum_{s,h,k=0}^{\infty}m_{s,h,k}\,h_{(k+s+h+1)\alpha}(x), \tag{15}$$

where, as before, $h_{\delta}(x)$ is the Exp-G density with power parameter $\delta$ and

$$m_{s,h,k}=\sum_{j=0}^{n-i}\sum_{l=0}^{j+i-1}\frac{(-1)^{l+s+j}\lambda^{s+h+2}(l+1)^{s}\,(s+h+k+2)!\,(l)_{h}}{(1+\lambda)^{h+1}\,(k+s+h+1)\,s!\,h!\,k!\,(s+h+2)!\,B(i,n-i+1)}\binom{n-i}{j}\binom{j+i-1}{l}.$$

Then, the density function of the GOLi-G order statistics is a linear combination of Exp-G densities. Based on equation (15), we note that the mathematical properties of $X_{i:n}$ follow from those of $T_{(k+s+h+1)\alpha}$. For example, the $q$th moment of $X_{i:n}$ is given by

$$E\left(X_{i:n}^{q}\right)=\sum_{s,h,k=0}^{\infty}m_{s,h,k}\,E\left(T_{(k+s+h+1)\alpha}^{q}\right). \tag{16}$$

Based upon the moments in equation (16), we can derive explicit expressions for the L-moments of $X$ as infinite weighted linear combinations of the means of suitable GOLi-G order statistics.

ENTROPIES

The Rényi entropy of a random variable $X$ represents a measure of variation of the uncertainty. The Rényi entropy is given by

$$I_{\theta}(X)=(1-\theta)^{-1}\log\left(\int_{-\infty}^{\infty}f(x)^{\theta}\,dx\right),\qquad \theta>0 \text{ and } \theta\neq1.$$

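Using the pdf (2), we can write

$$f(x)^{\theta}=\left(\frac{\alpha\,\lambda^{2}}{1+\lambda}\right)^{\theta}g(x)^{\theta}\,G(x)^{\theta(\alpha-1)}\left[1-G(x)^{\alpha}\right]^{-3\theta}\exp\left[-\frac{\lambda\,\theta\,G(x)^{\alpha}}{1-G(x)^{\alpha}}\right].$$

Using the exponential series, we have

$$f(x)^{\theta}=\left(\frac{\alpha\,\lambda^{2}}{1+\lambda}\right)^{\theta}g(x)^{\theta}\sum_{j=0}^{\infty}\frac{(-1)^{j}(\lambda\theta)^{j}}{j!}\,G(x)^{(\theta+j)\alpha-\theta}\left[1-G(x)^{\alpha}\right]^{-(3\theta+j)}.$$

Applying the power series (5) to the last term, we have

$$f(x)^{\theta}=\left(\frac{\alpha\,\lambda^{2}}{1+\lambda}\right)^{\theta}\sum_{j,k=0}^{\infty}\frac{(-1)^{j}(\lambda\theta)^{j}\,\Gamma(3\theta+j+k)}{j!\,k!\,\Gamma(3\theta+j)}\,g(x)^{\theta}\,G(x)^{(k+j+\theta)\alpha-\theta}.$$

Then, the Rényi entropy of the GOLi-G family reduces to

$$I_{\theta}(X)=(1-\theta)^{-1}\log\left[\sum_{j,k=0}^{\infty}b_{j,k}\int_{-\infty}^{\infty}g(x)^{\theta}\,G(x)^{(k+j+\theta)\alpha-\theta}\,dx\right],$$

where

$$b_{j,k}=\left(\frac{\alpha\,\lambda^{2}}{1+\lambda}\right)^{\theta}\frac{(-1)^{j}(\lambda\theta)^{j}\,\Gamma(3\theta+j+k)}{j!\,k!\,\Gamma(3\theta+j)}.$$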

The Shannon entropy of $X$, say $SI=E\{-\log[f(X)]\}$, is given by

$$SI=-\log\left(\frac{\alpha\,\lambda^{2}}{1+\lambda}\right)-E\{\log[g(X)]\}+(1-\alpha)\,E\{\log[G(X)]\}+3\,E\{\log[1-G(X)^{\alpha}]\}+\lambda\,E\left\{\frac{G(X)^{\alpha}}{1-G(X)^{\alpha}}\right\}.$$

First, we define and compute

$$A(a_{1},a_{2};\alpha,\lambda)=\int_{0}^{1}u^{a_{1}}(1-u^{\alpha})^{-a_{2}}\exp\left(-\frac{\lambda\,u^{\alpha}}{1-u^{\alpha}}\right)du.$$

By using the exponential series, we obtain

$$A(a_{1},a_{2};\alpha,\lambda)=\sum_{i=0}^{\infty}\frac{(-1)^{i}\lambda^{i}}{i!}\int_{0}^{1}u^{a_{1}+\alpha i}(1-u^{\alpha})^{-(a_{2}+i)}\,du.$$

Using the binomial expansion, we have

$$A(a_{1},a_{2};\alpha,\lambda)=\sum_{i,j=0}^{\infty}\frac{(-1)^{i+j}\lambda^{i}}{i!\,[a_{1}+\alpha(j+i)+1]}\binom{-a_{2}-i}{j}.$$

The following proposition is used to determine the Shannon entropy of $X$.

Proposition 1. Let $X$ be a random variable with pdf given in (2). Then,

$$E\{\log[G(X)]\}=\frac{\alpha\,\lambda^{2}}{1+\lambda}\,\frac{\partial}{\partial t}A(\alpha+t-1,3;\alpha,\lambda)\Big|_{t=0},$$

$$E\{\log[1-G(X)^{\alpha}]\}=\frac{\alpha\,\lambda^{2}}{1+\lambda}\,\frac{\partial}{\partial t}A(\alpha-1,3-t;\alpha,\lambda)\Big|_{t=0}$$

and

$$E\left\{\frac{G(X)^{\alpha}}{1-G(X)^{\alpha}}\right\}=\frac{\alpha\,\lambda^{2}}{1+\lambda}\,A(2\alpha-1,4;\alpha,\lambda).$$

MAXIMUM LIKELIHOOD ESTIMATION

Several approaches for parameter estimation were proposed in the literature but the maximum likelihood method is the most commonly employed. The maximum likelihood estimators (MLEs) enjoy desirable properties and can be used to obtain confidence intervals for the model parameters. The normal approximation for these estimators in large samples can be easily handled either analytically or numerically. Here, we consider the estimation of the unknown parameters of the new family from complete samples only by maximum likelihood.

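Let $X_{1},\ldots,X_{n}$ be a random sample from the GOLi-G family with parameters $\alpha$, $\lambda$ and $\boldsymbol{\varphi}$. Let $\boldsymbol{\theta}=(\alpha,\lambda,\boldsymbol{\varphi})$ be the $p\times1$ parameter vector. The log-likelihood function for $\boldsymbol{\theta}$ is given by

$$\ell(\boldsymbol{\theta})=n\log(\alpha)+2n\log(\lambda)-n\log(1+\lambda)+(\alpha-1)\sum_{i=1}^{n}\log G(x_{i};\boldsymbol{\varphi})+\sum_{i=1}^{n}\log g(x_{i};\boldsymbol{\varphi})-3\sum_{i=1}^{n}\log\left[1-G(x_{i};\boldsymbol{\varphi})^{\alpha}\right]-\lambda\sum_{i=1}^{n}\frac{G(x_{i};\boldsymbol{\varphi})^{\alpha}}{1-G(x_{i};\boldsymbol{\varphi})^{\alpha}}.$$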

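Then, the score vector components, $\mathbf{U}(\boldsymbol{\theta})=\partial\ell/\partial\boldsymbol{\theta}=(U_{\alpha},U_{\lambda},U_{\varphi_{k}})^{\top}$, are

$$U_{\alpha}=\frac{n}{\alpha}+\sum_{i=1}^{n}\log G(x_{i};\boldsymbol{\varphi})+3\sum_{i=1}^{n}\frac{G(x_{i};\boldsymbol{\varphi})^{\alpha}\log G(x_{i};\boldsymbol{\varphi})}{1-G(x_{i};\boldsymbol{\varphi})^{\alpha}}-\lambda\sum_{i=1}^{n}\frac{G(x_{i};\boldsymbol{\varphi})^{\alpha}\log G(x_{i};\boldsymbol{\varphi})}{\left[1-G(x_{i};\boldsymbol{\varphi})^{\alpha}\right]^{2}},$$

$$U_{\lambda}=\frac{2n}{\lambda}-\frac{n}{1+\lambda}-\sum_{i=1}^{n}\frac{G(x_{i};\boldsymbol{\varphi})^{\alpha}}{1-G(x_{i};\boldsymbol{\varphi})^{\alpha}}$$

and

$$U_{\varphi_{k}}=\sum_{i=1}^{n}\frac{g'_{k}(x_{i};\boldsymbol{\varphi})}{g(x_{i};\boldsymbol{\varphi})}+(\alpha-1)\sum_{i=1}^{n}\frac{G'_{k}(x_{i};\boldsymbol{\varphi})}{G(x_{i};\boldsymbol{\varphi})}+3\alpha\sum_{i=1}^{n}\frac{G'_{k}(x_{i};\boldsymbol{\varphi})\,G(x_{i};\boldsymbol{\varphi})^{\alpha-1}}{1-G(x_{i};\boldsymbol{\varphi})^{\alpha}}-\lambda\,\alpha\sum_{i=1}^{n}\frac{G'_{k}(x_{i};\boldsymbol{\varphi})\,G(x_{i};\boldsymbol{\varphi})^{\alpha-1}}{\left[1-G(x_{i};\boldsymbol{\varphi})^{\alpha}\right]^{2}},$$

where $g'_{k}(x_{i};\boldsymbol{\varphi})=\partial g(x_{i};\boldsymbol{\varphi})/\partial\varphi_{k}$ and $G'_{k}(x_{i};\boldsymbol{\varphi})=\partial G(x_{i};\boldsymbol{\varphi})/\partial\varphi_{k}$.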
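The log-likelihood and the two score components for $\alpha$ and $\lambda$ depend on the data only through the baseline values $G(x_{i};\boldsymbol{\varphi})$ and $g(x_{i};\boldsymbol{\varphi})$. The sketch below takes those values as plain lists, which is an assumption made to keep the example independent of any particular baseline; the function names are illustrative. Comparing the analytic scores with finite differences of the log-likelihood is a cheap way to catch sign errors.

```python
import math

def loglik(alpha, lam, G_vals, g_vals):
    """GOLi-G log-likelihood given baseline cdf values G_vals and pdf values g_vals."""
    n = len(G_vals)
    total = n * math.log(alpha) + 2 * n * math.log(lam) - n * math.log(1 + lam)
    for G, g in zip(G_vals, g_vals):
        u = G ** alpha
        total += ((alpha - 1) * math.log(G) + math.log(g)
                  - 3 * math.log(1 - u) - lam * u / (1 - u))
    return total

def score_alpha(alpha, lam, G_vals):
    """Derivative of the log-likelihood with respect to alpha."""
    total = len(G_vals) / alpha
    for G in G_vals:
        u = G ** alpha
        log_G = math.log(G)
        total += log_G + 3 * u * log_G / (1 - u) - lam * u * log_G / (1 - u) ** 2
    return total

def score_lambda(alpha, lam, G_vals):
    """Derivative of the log-likelihood with respect to lambda."""
    n = len(G_vals)
    odds = sum(G ** alpha / (1 - G ** alpha) for G in G_vals)
    return 2 * n / lam - n / (1 + lam) - odds
```

In practice `loglik` would be wrapped so that the baseline parameters are also free, and the negative of it would be handed to a numerical optimizer.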

Setting the nonlinear system of equations $U_{\alpha}=U_{\lambda}=0$ and $\mathbf{U}_{\boldsymbol{\varphi}}=\mathbf{0}$ and solving them simultaneously yields the MLE $\hat{\boldsymbol{\theta}}=(\hat{\alpha},\hat{\lambda},\hat{\boldsymbol{\varphi}})$. To do this, it is usually more convenient to adopt nonlinear optimization methods such as the quasi-Newton algorithm to maximize $\ell(\boldsymbol{\theta})$ numerically. For interval estimation of the parameters, we obtain the $p\times p$ observed information matrix $J(\boldsymbol{\theta})=\{-\partial^{2}\ell/\partial r\,\partial s\}$ (for $r,s=\alpha,\lambda,\varphi_{k}$), which can be evaluated numerically.

Under standard regularity conditions, the distribution of $\hat{\boldsymbol{\theta}}-\boldsymbol{\theta}$ can be approximated by a multivariate normal $N_{p}(\mathbf{0},J(\hat{\boldsymbol{\theta}})^{-1})$ distribution when $n\to\infty$, which can be used to obtain confidence intervals for the parameters. Here, $J(\hat{\boldsymbol{\theta}})$ is the total observed information matrix evaluated at $\hat{\boldsymbol{\theta}}$. The bootstrap re-sampling method can be used for correcting the biases of the MLEs of the model parameters. Good interval estimates may also be obtained using the bootstrap percentile method.

SIMULATION STUDY

In this section, we present some simulation results that investigate the behavior of the MLEs in terms of the sample size $n$. All simulations are performed using the R programming language (R Core Team 2017).

The qf of the GOLiW distribution is given by

$$Q(p)=\left(-\frac{1}{a}\log\left\{1-\left[\frac{\lambda+W_{-1}\left[(p-1)(\lambda+1)\exp(-\lambda-1)\right]+1}{W_{-1}\left[(p-1)(\lambda+1)\exp(-\lambda-1)\right]+1}\right]^{\frac{1}{\alpha}}\right\}\right)^{\frac{1}{b}}.$$

We generate 2,000 random samples from this distribution using the last expression for three different sample sizes $n=350$, $n=550$ and $n=750$. The true values of the parameters are taken as: $\alpha=(0.2,1,2)$, $\lambda=(0.5,2)$, $a=(0.7,2.5,4.3)$ and $b=(0.9,3,5)$. For each sample size and parameter combination, the average MLEs and the mean square errors (MSEs) are computed. In order to save space, Table I gives only the results for $\alpha=2$; the results for $\alpha=(0.2,1)$ are not reported. It can be verified that the estimates are stable and quite close to the true parameter values for all sample sizes. Further, the MSEs decrease when the sample size increases in all cases, in agreement with the first-order asymptotic theory.

TABLE I
Average values of the MLEs and the corresponding MSEs for α = 2 and (n=350, 550 and 750).

APPLICATIONS

In this section, we provide two applications to real data to illustrate the flexibility of the GOLiLx, GOLiLL and GOLiBXII distributions presented in Section 2. The goodness-of-fit statistics for these models are compared with other competitive models and the MLEs of the model parameters are determined.

The model selection is carried out using some goodness-of-fit measures including the Akaike information criterion ( AIC ), consistent Akaike information criterion ( CAIC ), Hannan-Quinn information criterion ( HQIC ), Bayesian information criterion ( BIC ), Anderson-Darling (A*) and Cramér-von Mises ( W* ). The smaller these statistics are, the better the fit.
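The four information criteria can be computed from the maximized log-likelihood $\hat{\ell}$, the number of parameters $p$ and the sample size $n$. The paper does not state the formulas, so the definitions below are assumptions based on forms commonly used in this literature. In particular, the CAIC expression is the one that coincides with the small-sample corrected AIC, and other definitions of a "consistent" AIC exist.

```python
import math

def information_criteria(loglik_hat, p, n):
    """Return AIC, CAIC, BIC and HQIC under the definitions assumed in this sketch."""
    return {
        "AIC": -2 * loglik_hat + 2 * p,
        "CAIC": -2 * loglik_hat + 2 * p * n / (n - p - 1),   # assumed form of CAIC
        "BIC": -2 * loglik_hat + p * math.log(n),
        "HQIC": -2 * loglik_hat + 2 * p * math.log(math.log(n)),
    }
```

All four quantities penalize the fit by model size, so smaller values indicate a better trade-off between fit and complexity.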

APPLICATION 1: TIME-TO-FAILURE OF TURBOCHARGER DATA

The first data set consists of 40 times to failure ($10^{3}$ h) of the turbocharger of one type of engine given in Xu et al. (2003). For these data, we shall compare the fit of the GOLiLx and GOLiLL distributions with those of other competitive models, namely: the Kumaraswamy transmuted log-logistic (KwTLL) (Afify et al. 2016), Kumaraswamy log-logistic (KwLL) (de Santana et al. 2012), transmuted log-logistic (TLL) (Granzotto and Louzada 2015), McDonald log-logistic (McLL) (Tahir et al. 2014), beta log-logistic (BLL) (Lemonte 2014), generalized transmuted log-logistic (GTLL) (Nofal et al. 2017) and Li-log-logistic (LiLL) (Cakmakyapan and Ozel 2016) distributions, whose densities (for $x>0$) are given, respectively, by:

KwTLL:

$$f(x)=a\,b\,\beta\,\alpha^{-\beta}x^{\beta-1}\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-2}\left\{1-\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-1}\right\}^{a-1}\left\{1-\lambda+2\lambda\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-1}\right\}\left\{1+\lambda\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-1}\right\}^{a-1}$$

$$\qquad\times\left\{1-\left[1+\lambda\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-1}\right]^{a}\left\{1-\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-1}\right\}^{a}\right\}^{b-1};$$

KwLL:

$$f(x)=\frac{a\,b\,\beta}{\alpha^{a\beta}}\,x^{a\beta-1}\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-a-1}\left\{1-\left[1-\frac{1}{1+\left(\tfrac{x}{\alpha}\right)^{\beta}}\right]^{a}\right\}^{b-1};$$

TLL:

$$f(x)=\beta\,\alpha^{-\beta}x^{\beta-1}\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-2}\left\{1-\lambda+2\lambda\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-1}\right\};$$

McLL:

$$f(x)=\frac{\beta\,c}{\alpha\,B(a/c,b)}\left(\tfrac{x}{\alpha}\right)^{\beta a-1}\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-a-1}\left\{1-\left[1-\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-1}\right]^{c}\right\}^{b-1};$$

BLL:

$$f(x)=\frac{\beta}{\alpha^{a\beta}\,B(a,b)}\,x^{a\beta-1}\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-a-b};$$

GTLL:

$$f(x)=\beta\,\alpha^{-\beta}x^{\beta-1}\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-2}\left\{1-\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-1}\right\}^{a-1}\left\{a(1+\lambda)-\lambda(a+b)\left\{1-\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-1}\right\}^{b}\right\};$$

LiLL:

$$f(x)=\frac{\beta\,\lambda^{2}}{\alpha^{\beta}(1+\lambda)}\,x^{\beta-1}\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-\lambda-1}\left(1-\log\left\{\left[1+\left(\tfrac{x}{\alpha}\right)^{\beta}\right]^{-1}\right\}\right).$$

The parameters of the above densities are all positive real numbers except for the KwTLL, TLL and GTLL models, for which the parameter $\lambda$ satisfies $|\lambda|\le1$.

APPLICATION 2: CANCER PATIENTS DATA

The second data set consists of the remission times (in months) of a random sample of 128 bladder cancer patients (Lee and Wang 2003). For these data, we compare the fit of the GOLiBXII distribution with those of the OLiBXII, Weibull BXII (WBXII) (Afify et al. 2018), Kumaraswamy exponentiated Burr XII (KwEBXII) (Mead and Afify 2017), McDonald Weibull (McW) (Cordeiro et al. 2014), beta BXII (BBXII) (Paranaíba et al. 2011), beta exponentiated BXII (BEBXII) (Mead 2014), LiBXII (Cakmakyapan and Ozel 2016) and BXII models with densities (for $x>0$), respectively, given by:

WBXII:

$$f(x)=\alpha\,\lambda\,a\,b\,x^{\alpha-1}(1+x^{\alpha})^{\lambda b-1}\left[1-(1+x^{\alpha})^{-\lambda}\right]^{b-1}\exp\left\{-a\left[(1+x^{\alpha})^{\lambda}-1\right]^{b}\right\};$$

KwEBXII:

$$f(x)=\frac{a\,b\,c\,k\,\beta\,x^{c-1}}{(1+x^{c})^{k+1}}\left[1-(1+x^{c})^{-k}\right]^{a\beta-1}\left\{1-\left[1-(1+x^{c})^{-k}\right]^{a\beta}\right\}^{b-1};$$

McW:

$$f(x)=\frac{\beta\,c\,\alpha^{\beta}}{B(a/c,b)}\,x^{\beta-1}e^{-(\alpha x)^{\beta}}\left[1-e^{-(\alpha x)^{\beta}}\right]^{a-1}\left\{1-\left[1-e^{-(\alpha x)^{\beta}}\right]^{c}\right\}^{b-1};$$

BBXII:

$$f(x)=\frac{c\,\theta}{\beta^{c}\,B(a,b)}\,x^{c-1}\left[1+\left(\tfrac{x}{\beta}\right)^{c}\right]^{-\theta b-1}\left\{1-\left[1+\left(\tfrac{x}{\beta}\right)^{c}\right]^{-\theta}\right\}^{a-1};$$

BEBXII:

$$f(x)=\frac{c\,\theta\,\beta}{B(a,b)}\,x^{c-1}(1+x^{c})^{-\theta-1}\left[1-(1+x^{c})^{-\theta}\right]^{a\beta-1}\left\{1-\left[1-(1+x^{c})^{-\theta}\right]^{\beta}\right\}^{b-1};$$

LiBXII:

$$f(x)=\frac{a\,b\,\lambda^{2}}{1+\lambda}\,x^{a-1}(1+x^{a})^{-b\lambda-1}\left\{1-\log\left[(1+x^{a})^{-b}\right]\right\}.$$

All the above parameters are positive real numbers.

Tables II and IV list the values of AIC , CAIC , HQIC , BIC , W* and A* , whereas the MLEs and their corresponding standard errors (in parentheses) of the model parameters are given in Tables III and V.

TABLE II
Goodness-of-fit statistics for time-to-failure data.
TABLE III
MLEs and their standard errors (in parentheses) for time-to-failure data.
TABLE IV
Goodness-of-fit statistics for cancer data.
TABLE V
MLEs and their standard errors (in parentheses) for cancer data.

The fitted GOLiLL and GOLiLx pdfs and other fitted pdfs for the time-to-failure data are displayed in Figure 5, whereas the PP-plots of these fitted models are displayed in Figure S7 (Supplementary Material). These plots reveal that the GOLiLx and GOLiLL distributions provide the best fits and can be considered very competitive models to other non-nested distributions.

Figure 5
The estimated GOLiLL and GOLiLx pdfs and other estimated pdfs for time-to-failure data. (a) The estimated GOLiLL, GOLiLx, KwTLL, KwLL and TLL densities. (b) The estimated GOLiLL, GOLiLx, McLL, BLL, GTLL and LiLL densities.

The plots of the fitted GOLiBXII pdf and the other fitted pdfs defined before, for the cancer data, are displayed in Figure 6. The PP-plots of the fitted models are given in Figure S8. These plots reveal that the GOLiBXII distribution provides the best fit and can be considered a very competitive model to other distributions with positive support.

Figure 6
The estimated GOLiBXII pdf and other estimated pdfs for cancer data. (a) The estimated GOLiBXII, WBXII, KwEBXII, McW and LiBXII densities. (b) The estimated GOLiBXII, BBXII, OLiBXII, BEBXII and BXII densities.
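The PP-plots mentioned above pair empirical probabilities with fitted cdf values at the ordered observations. A minimal sketch of how such coordinates could be computed is given below. It assumes the plotting position $(i - 0.5)/n$, which is one common convention and not necessarily the one used for Figures S7 and S8, and it uses the unit-scale Burr XII cdf $1-(1+x^{c})^{-k}$ with made-up data and parameter values purely for illustration.

```python
def bxii_cdf(x, c, k):
    """Unit-scale Burr XII cdf, used here only as an example fitted model."""
    return 1.0 - (1.0 + x ** c) ** (-k)


def pp_points(data, cdf):
    """Return (empirical probability, fitted probability) pairs.

    The empirical probability of the i-th order statistic is taken as
    (i - 0.5) / n, an assumed plotting position.
    """
    xs = sorted(data)
    n = len(xs)
    return [((i - 0.5) / n, cdf(x)) for i, x in enumerate(xs, start=1)]


# Made-up sample and parameter values, not data or estimates from the paper.
sample = [0.3, 1.2, 0.7, 2.5, 0.9, 1.8]
points = pp_points(sample, lambda x: bxii_cdf(x, c=2.0, k=1.0))

# Largest vertical distance from the 45-degree line, a rough summary of fit.
max_gap = max(abs(p - q) for p, q in points)
print(points)
print(max_gap)
```

Points lying close to the diagonal indicate that the fitted cdf tracks the empirical distribution well, which is the visual criterion used to compare the models.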

CONCLUSIONS

In this paper, we propose a new family of distributions with two extra positive parameters called the generalized odd Lindley-G family. The new family extends several widely known distributions and four of its special models are discussed. We demonstrate that its density function is a linear combination of exponentiated-G densities. We obtain some mathematical properties of the new family, which include quantile and generating functions, asymptotics, ordinary and incomplete moments, order statistics and entropies. The model parameters are estimated by the method of maximum likelihood, and simulation results are reported for this method. Two real data examples are used for illustration, in which the new family fits both data sets well.

REFERENCES

  • AFIFY AZ, CORDEIRO GM, ORTEGA EMM, YOUSOF HM and BUTT NS. 2018. The four-parameter Burr XII distributions: properties, regression model and applications. Commun Stat - Theory Methods 47: 2605-2624.
  • AFIFY AZ, CORDEIRO GM, YOUSOF HM, ALZAATREH A and NOFAL ZM. 2016. The Kumaraswamy transmuted-G family of distributions: properties and applications. Journal of Data Science 14: 245-270.
  • ALZAATREH A, LEE C and FAMOYE F. 2013. A new method for generating families of distributions. Metron 71: 63-79.
  • BAKOUCH HS, AL-ZAHRANI BM, AL-SHOMRANI AA, MARCHI VAA and LOUZADA F. 2012. An extended Lindley distribution. J Korean Stat Soc 41: 75-85.
  • CAKMAKYAPAN S and OZEL G. 2016. The Lindley family of distributions: properties and applications. Hacet J Math Stat 46: 1-27.
  • DE SANTANA TVF, ORTEGA EMM, CORDEIRO GM and SILVA GO. 2012. The Kumaraswamy log-logistic distribution. Stat Theory Appl 3: 265-291.
  • GHITANY ME, AL-MUTAIRI DK, BALAKRISHNAN N and AL-ENEZI L. 2013. Power Lindley distribution and associated inference. Comput Stat Data Anal 64: 20-33.
  • GHITANY M, ATIEH B and NADARAJAH S. 2008. Lindley distribution and its application. Math Comp Simul 78: 493-506.
  • GOMES-SILVA FS, PERCONTINI A, BRITO E, RAMOS MW, VENÂNCIO R and CORDEIRO GM. 2017. The Odd Lindley-G Family of Distributions. Austrian Journal of Statistics 46: 65-87.
  • GRANZOTTO DCT and LOUZADA F. 2015. The transmuted log-logistic distribution: modeling, inference and an application to a polled tabapua race time up to first calving data. Commun Stat - Theory Methods 44: 3387-3402.
  • HUSSAIN E. 2006. The non-linear functions of order statistics and their properties in selected probability models. Ph.D. thesis. Department of Statistics. University of Karachi, Pakistan.
  • LEE ET and WANG JW. 2003. Statistical methods for survival data analysis. 3rd ed., Wiley, New York.
  • LEMONTE AJ. 2014. The beta log-logistic distribution. Braz J Probab Stat 28: 313-332.
  • LINDLEY DV. 1958. Fiducial distributions and Bayes’ theorem. J R Stat Soc Series B 20: 102-107.
  • MAHMOUDI E and ZAKERZADEH H. 2010. Generalized Poisson–Lindley distribution. Commun Stat - Theory Methods 39: 1785-1798.
  • MEAD ME. 2014. The beta exponentiated Burr XII distribution. Journal of Statistics: Advances in Theory and Applications 12: 53-73.
  • MEAD ME and AFIFY AZ. 2017. On five-parameter Burr XII distribution: properties and applications. South African Statistical Journal 51: 67-80.
  • NADARAJAH S, BAKOUCH HS and TAHMASBI R. 2011. A generalized Lindley distribution. Sankhya B 73: 331-359.
  • NEDJAR S and ZEGHDOUDI H. 2016. Gamma Lindley distribution and its application. J Appl Probab Stat 11: 129-138.
  • NOFAL ZM, AFIFY AZ, YOUSOF HM and CORDEIRO GM. 2017. The generalized transmuted-G family of distributions. Commun Stat - Theory Methods 46: 4119-4136.
  • PARANAÍBA PF, ORTEGA EMM, CORDEIRO GM and PESCIM RR. 2011. The beta Burr XII distribution with application to lifetime data. Comput Stat Data Anal 55: 1118-1136.
  • R CORE TEAM. 2017. R: A language and environment for statistical computing. R Foundation for Statistical Computing.
  • SANKARAN M. 1970. The discrete Poisson–Lindley distribution. Biometrics 26: 145-149.
  • TAHIR MH, MANSOOR M, ZUBAIR M and HAMEDANI G. 2014. McDonald log-logistic distribution with an application to breast cancer data. Journal of Statistical Theory and Applications 13: 65-82.
  • XU K, XIE M, TANG LC and HO SL. 2003. Application of neural networks in forecasting engine systems reliability. Appl Soft Comput 2: 255-268.
  • ZAKERZADEH H and DOLATI A. 2009. Generalized Lindley Distribution. J Math Ext 3: 13-25.
  • ZAMANI H and ISMAIL N. 2010. Negative binomial-Lindley distribution and its application. Journal of Mathematics and Statistics 6: 4-9.

Publication Dates

  • Publication in this collection
    12 Aug 2019
  • Date of issue
    2019

History

  • Received
    12 Jan 2018
  • Accepted
    13 Nov 2018
Academia Brasileira de Ciências Rua Anfilófio de Carvalho, 29, 3º andar, 20030-060 Rio de Janeiro RJ Brasil, Tel: +55 21 3907-8100 - Rio de Janeiro - RJ - Brazil
E-mail: aabc@abc.org.br