The Nadarajah-Haghighi Lindley distribution

Abstract

We define a new lifetime model based on compounding the Lindley and Nadarajah-Haghighi distributions. The proposed distribution is very competitive with other lifetime models. Some of its mathematical properties are investigated, including the generating function, mean residual life, moments, Bonferroni and Lorenz curves, and mean deviations. We discuss the estimation of the model parameters by maximum likelihood. We provide a simulation study and two applications to real data for illustrative purposes. We show empirically that the new distribution yields good fits to both data sets, and it can be a useful alternative to other classical lifetime models.

Key words
Compounding approach; Exponential distribution; Lifetime data; Lindley distribution; Nadarajah-Haghighi distribution


1 - INTRODUCTION

The Lindley distribution was pioneered by Lindley (1958) in the context of fiducial and Bayesian inference. It is a mixture of the exponential and length-biased exponential distributions. Let Y be a Lindley random variable with parameter γ>0 having probability density function (pdf)

$$g(y)=\frac{\gamma^{2}}{1+\gamma}(1+y)\,e^{-\gamma y},\quad y>0,$$

where the mixing proportion is γ/(1+γ). The survival function of Y is

$$\bar{G}(y)=\frac{1+\gamma+\gamma y}{1+\gamma}\,e^{-\gamma y}.$$

Several of its statistical properties were discussed in detail by Ghitany, Atieh, and Nadarajah (2008), who also showed that the Lindley distribution is quite competitive with the exponential distribution. Gupta and Singh (2013) studied the parameter estimation of this distribution with hybrid censored data. Krishna and Kumar (2011) considered the estimation of its model parameters for progressively type II right censored samples, and Mazucheli and Achcar (2011) applied this distribution to competing risks in lifetime data.

In the context of distribution theory, some generalizations are obtained from transformations of the Lindley distribution. We refer the reader to Nadarajah, Bakouch, and Tahmasbi (2011) for the generalized (or exponentiated) Lindley, Bakouch et al. (2012) for the extended Lindley, Ghitany et al. (2011) and Al-Mutairi, Ghitany, and Kundu (2015) for the weighted Lindley, Ghitany et al. (2013) for the power Lindley and Ashour and Eltehiwy (2015) for the exponentiated power Lindley distributions.

Another technique that has been considered is the discrete-continuous compounding approach. It is defined by the minimum of N independent and identically distributed continuous random variables, where N is a discrete random variable. Adamidis and Loukas (1998) pioneered this method by introducing the exponential geometric distribution. The literature also contains models obtained by compounding the Lindley distribution with discrete distributions. Sankaran (1970) defined the discrete Poisson-Lindley distribution by combining the Poisson and Lindley distributions. Zamani and Ismail (2010) presented the negative binomial Lindley distribution. The zero-truncated Poisson-Lindley and Pareto Poisson-Lindley distributions were introduced by Ghitany, Al-Mutairi, and Nadarajah (2008) and Asgharzadeh, Bakouch, and Esmaeili (2013), respectively.

The Nadarajah-Haghighi (NH) distribution is a generalization of the exponential distribution first defined by Nadarajah and Haghighi (2011). Let Z denote an NH random variable with parameters α>0 and λ>0. The pdf and survival function of Z are

$$q(z)=\alpha\lambda(1+\lambda z)^{\alpha-1}e^{1-(1+\lambda z)^{\alpha}}$$

and

$$\bar{Q}(z)=e^{1-(1+\lambda z)^{\alpha}},$$

respectively. Some generalizations of the NH distribution have been proposed in recent years, such as the exponentiated Nadarajah-Haghighi (Lemonte 2013) and gamma Nadarajah-Haghighi (Bourguignon et al. 2015) distributions, among others. The discrete-continuous compounding approach yields the Poisson gamma Nadarajah-Haghighi (Ortega et al. 2015) and geometric Nadarajah-Haghighi (Marinho 2016) distributions.

A comprehensive review of compounding methods for generating distributions can be found in Tahir et al. (2016). They pointed out a different compounding approach that takes the minimum of two continuous random variables. In the discrete-continuous compositions, N is a discrete random variable representing the number of identically distributed elements having some continuous distribution. For the continuous-continuous compositions, we drop the condition of identical distributions and set N=2. Some well-known continuous-continuous compounded models are the additive Weibull (Xie and Lai 1995; Lemonte, Cordeiro, and Ortega 2014), exponential-Weibull (Cordeiro, Ortega, and Lemonte 2014) and generalized exponential-exponential (Popovíc, Ristíc, and Cordeiro 2015) distributions, among others.

In this paper, we introduce a new continuous-continuous compounded model referred to as the Nadarajah-Haghighi Lindley (NHL) distribution. The new three-parameter distribution is obtained by compounding the Lindley and NH distributions. We assume that Y and Z are independent random variables and define X=min(Y,Z) as an NHL random variable. Since P(X>x)=P(Y>x)P(Z>x) by independence, its survival function is given by

$$\bar{F}(x)=\bar{G}(x)\,\bar{Q}(x).$$

The cumulative distribution function (cdf) of X is

$$F(x)=1-\frac{1+\gamma+\gamma x}{1+\gamma}\,e^{1-\gamma x-(1+\lambda x)^{\alpha}},\qquad(1)$$

where x>0. The parametric space of the cdf in (1) can be defined as

$$\Theta=\{(\alpha,\lambda,\gamma):\alpha\geq 0,\ \gamma\geq 0,\ \lambda\geq 0,\ \max(\alpha,\gamma)>0,\ \max(\lambda,\gamma)>0\}.$$

The pdf and hazard rate function (hrf) of X are given by

$$f(x)=\frac{(1+\gamma+\gamma x)\left[\gamma+\alpha\lambda(1+\lambda x)^{\alpha-1}\right]-\gamma}{\gamma+1}\,e^{1-\gamma x-(1+\lambda x)^{\alpha}}\qquad(2)$$

and

$$h(x)=\frac{(1+\gamma+\gamma x)\left[\gamma+\alpha\lambda(1+\lambda x)^{\alpha-1}\right]-\gamma}{\gamma+\gamma x+1},$$

respectively. Henceforth, we write X ∼ NHL(α,λ,γ). The proposed distribution contains some well-known distributions as special models. For γ=0, the NHL reduces to the NH distribution. If γ=0 and α=1, we have the exponential distribution. For α=0 or λ=0, we have the Lindley distribution.
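As an illustration, the survival function, cdf (1), pdf (2) and hrf can be coded directly from the formulas above. The following is a minimal Python sketch; the function names (`nhl_sf`, `nhl_cdf`, `nhl_hrf`, `nhl_pdf`) are chosen here for illustration and are not the authors' R routines.

```python
import math


def nhl_sf(x, alpha, lam, gamma):
    """Survival function of X = min(Y, Z): Lindley survival times NH survival."""
    lindley = (1.0 + gamma + gamma * x) / (1.0 + gamma) * math.exp(-gamma * x)
    nh = math.exp(1.0 - (1.0 + lam * x) ** alpha)
    return lindley * nh


def nhl_cdf(x, alpha, lam, gamma):
    """Cdf in equation (1)."""
    return 1.0 - nhl_sf(x, alpha, lam, gamma)


def nhl_hrf(x, alpha, lam, gamma):
    """Hazard rate function h(x)."""
    a = 1.0 + gamma + gamma * x
    b = gamma + alpha * lam * (1.0 + lam * x) ** (alpha - 1.0)
    return (a * b - gamma) / a


def nhl_pdf(x, alpha, lam, gamma):
    """Pdf in equation (2), computed as hazard times survival."""
    return nhl_hrf(x, alpha, lam, gamma) * nhl_sf(x, alpha, lam, gamma)


# Illustrative parameter values (arbitrary, not taken from the paper's tables).
for x in (0.5, 1.0, 2.0):
    print(x, nhl_pdf(x, 1.5, 0.8, 0.4), nhl_hrf(x, 1.5, 0.8, 0.4))
```

Setting `gamma=0.0` and `alpha=1.0` reproduces the exponential distribution with rate `lam`, and setting `alpha=0.0` reproduces the Lindley distribution, which gives a quick check of the special cases listed above.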

Figure 1 displays plots of the pdf of X for some parameter values. The new density exhibits decreasing and reversed-J shapes. Figure 2 reveals that the NHL distribution can have decreasing, increasing, upside-down bathtub and bathtub-shaped hazard functions. This feature makes the new distribution very competitive with the Weibull, gamma and exponential distributions, which exhibit only monotonic hazard rates. According to Nadarajah, Bakouch, and Tahmasbi (2011), monotonicity of the hazard rate is a major weakness because most empirical life systems have bathtub shapes for their hrfs.

Figure 1
Plots of the NHL density.
Figure 2
Plots of the NHL hazard function.

The NHL distribution has the following theoretical motivations: (i) It can be useful in engineering and reliability for modeling a system having two sub-systems functioning independently in series at a given time. If Y and Z denote the lifetimes of these independent sub-systems following the Lindley and NH distributions, then the lifetime X of the system has the NHL distribution. Cordeiro, Ortega, and Lemonte (2014) studied a similar situation for the exponential-Weibull model. (ii) The stochastic representation X=min{Y,Z} can arise in several biological and medical applications. (iii) Nadarajah and Haghighi (2011) mentioned some advantages of the NH model, such as the ability to model data with mode fixed at zero and the fact that it can be interpreted as a truncated Weibull distribution. The NHL distribution inherits these advantages since it has the NH distribution as a special model. (iv) Some of its mathematical properties are easily obtained. Further, the NHL distribution has some practical motivations: (i) It can be used for modeling data with bathtub and unimodal failure rates, which are very common in many applied areas. (ii) The finite sample behavior of the maximum likelihood estimates of its parameters is adequate. (iii) It is a very competitive model to well-known distributions such as the Weibull, exponentiated Weibull, NH and Lindley distributions, as shown empirically in two applications to real data in Section 7.

The rest of this paper is outlined as follows. In Sections 2-4, we obtain a range of mathematical properties of the NHL distribution. In Section 5, we consider the maximum likelihood method to estimate the model parameters. We perform a simulation study in Section 6. Two real data applications are provided in Section 7. Some concluding remarks are offered in Section 8.

2 - MEAN RESIDUAL LIFE

The mean residual life is a relevant characteristic for the design of safe systems in a wide variety of applications in engineering and reliability. Given that a component survives up to time x>0, the mean residual life is defined by

$$m(x)=\mathbb{E}(X-x\mid X>x)=\frac{1}{1-F(x)}\int_{x}^{\infty}[1-F(t)]\,dt.$$

It represents the expected period beyond x until the time of failure. Note that

$$m(x)=\frac{e^{\gamma x+(1+\lambda x)^{\alpha}}}{1+\gamma+\gamma x}\int_{x}^{\infty}(1+\gamma+\gamma t)\,e^{-\gamma t-(1+\lambda t)^{\alpha}}\,dt.$$

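We consider the integral

$$J=\int_{x}^{\infty}(1+\gamma+\gamma t)\,e^{-\gamma t-(1+\lambda t)^{\alpha}}\,dt.$$

Setting $u=(1+\lambda t)^{\alpha}$, we have $t=(u^{1/\alpha}-1)/\lambda$. So, the above integral reduces to

$$J=\frac{1}{\alpha\lambda}\int_{(1+\lambda x)^{\alpha}}^{\infty}u^{(1-\alpha)/\alpha}\left[1+\gamma+\gamma\lambda^{-1}(u^{1/\alpha}-1)\right]e^{-u-\gamma\lambda^{-1}(u^{1/\alpha}-1)}\,du.$$

By expanding

$$e^{-\gamma\lambda^{-1}(u^{1/\alpha}-1)}=\sum_{i=0}^{\infty}\frac{(-1)^{i}\gamma^{i}(u^{1/\alpha}-1)^{i}}{\lambda^{i}\,i!},$$

and using the binomial theorem for $(u^{1/\alpha}-1)^{i}$, we can write

$$m(x)=\frac{e^{\gamma x+(1+\lambda x)^{\alpha}}}{\alpha\lambda(1+\gamma+\gamma x)}\sum_{i=0}^{\infty}\left(\frac{\gamma}{\lambda}\right)^{i}\Psi_{i}(\alpha,\lambda,\gamma,x),\qquad(3)$$

where

$$\Psi_{i}(\alpha,\lambda,\gamma,x)=\sum_{j=0}^{i}\frac{(-1)^{j}}{(i-j)!\,j!}\left[\frac{\gamma}{\lambda}\,\Gamma\!\left(\frac{j+2}{\alpha},(1+\lambda x)^{\alpha}\right)+\left(1+\gamma-\frac{\gamma}{\lambda}\right)\Gamma\!\left(\frac{j+1}{\alpha},(1+\lambda x)^{\alpha}\right)\right],$$

and $\Gamma(a,z)=\Gamma(a)-\gamma(a,z)=\int_{z}^{\infty}t^{a-1}e^{-t}\,dt$ is the upper incomplete gamma function. Note that (3) can be approximated by truncating the sum at a large upper limit, say N. Hence,

$$m(x)\approx\frac{e^{\gamma x+(1+\lambda x)^{\alpha}}}{\alpha\lambda(1+\gamma+\gamma x)}\sum_{i=0}^{N}\left(\frac{\gamma}{\lambda}\right)^{i}\Psi_{i}(\alpha,\lambda,\gamma,x).\qquad(4)$$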

For illustrative purposes, we provide a numerical study to analyze the convergence of the expansion (4) by comparing its results with those from numerical integration. Table I presents the calculations for N=5, 10, 15 and 20 and x=1. The expansion provides good approximations for the mean residual life. All computations are performed using the R software. We provide the scripts for these calculations in Appendix A.
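The numerical-integration side of this comparison can be sketched as follows. This is a hedged Python illustration, not the authors' R script: it evaluates the integral definition of m(x) with a composite Simpson rule, and the helper names (`simpson`, `mean_residual_life`), the number of subintervals and the truncation tolerance `eps` are assumptions made here. It is intended for moderate parameter values, where the survival function decays quickly.

```python
import math


def nhl_sf(x, alpha, lam, gamma):
    """NHL survival function: 1 - F(x), with F as in equation (1)."""
    return (1.0 + gamma + gamma * x) / (1.0 + gamma) * math.exp(
        1.0 - gamma * x - (1.0 + lam * x) ** alpha
    )


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def mean_residual_life(x, alpha, lam, gamma, eps=1e-14):
    """m(x) = (1 / S(x)) * integral of S(t) over t >= x, truncated numerically."""
    sf = lambda t: nhl_sf(t, alpha, lam, gamma)
    width = 1.0
    # Widen the integration range until the survival function is negligible.
    while sf(x + width) > eps:
        width *= 2.0
    return simpson(sf, x, x + width) / sf(x)


# Illustrative call with arbitrary parameter values.
print(mean_residual_life(1.0, 1.5, 0.8, 0.4))
```

Two closed-form cases give a sanity check: for the exponential special case (γ=0, α=1) the result should be 1/λ for every x, and for the Lindley special case (α=0) it should equal (2+γ+γx)/[γ(1+γ+γx)].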

TABLE I
Mean residual life obtained from (4) and by numerical integration.

3 - GENERATING FUNCTION AND MOMENTS

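We denote by M(t) the moment generating function (mgf) of X. From Equation (2), we obtain

$$M(t)=\frac{e}{1+\gamma}\int_{0}^{\infty}\left\{(\gamma+\gamma x+1)\left[\gamma+\alpha\lambda(1+\lambda x)^{\alpha-1}\right]-\gamma\right\}e^{x(t-\gamma)-(1+\lambda x)^{\alpha}}\,dx.$$

Setting $u=(1+\lambda x)^{\alpha}$, we have

$$M(t)=\frac{e}{\alpha\lambda(1+\gamma)}\int_{1}^{\infty}u^{\frac{1-\alpha}{\alpha}}\left\{\left[\gamma+\gamma\lambda^{-1}(u^{1/\alpha}-1)+1\right]\left(\gamma+\alpha\lambda u^{\frac{\alpha-1}{\alpha}}\right)-\gamma\right\}\exp\left\{\frac{(u^{1/\alpha}-1)(t-\gamma)}{\lambda}-u\right\}du.$$

By expanding

$$\exp\left\{\frac{(u^{1/\alpha}-1)(t-\gamma)}{\lambda}\right\}=\sum_{i=0}^{\infty}\frac{(u^{1/\alpha}-1)^{i}(t-\gamma)^{i}}{\lambda^{i}\,i!},$$

using the binomial theorem for $(u^{1/\alpha}-1)^{i}$ and, after some algebra, we obtain (for γ>0 and t<γ)

$$M(t)=\frac{e}{1+\gamma}\sum_{i=0}^{\infty}\left(\frac{\gamma-t}{\lambda}\right)^{i}\Psi_{i}(\alpha,\lambda,\gamma),$$

where

$$\Psi_{i}(\alpha,\lambda,\gamma)=\sum_{j=0}^{i}\frac{(-1)^{j}}{j!\,(i-j)!}\left[\frac{\gamma}{\lambda}\,\Gamma\!\left(\frac{j+1}{\alpha}+1,1\right)+\frac{\gamma^{2}}{\alpha\lambda^{2}}\,\Gamma\!\left(\frac{j+2}{\alpha},1\right)+\left(1+\gamma-\frac{\gamma}{\lambda}\right)\Gamma\!\left(\frac{j+\alpha}{\alpha},1\right)+\frac{\gamma^{2}(\lambda-1)}{\alpha\lambda^{2}}\,\Gamma\!\left(\frac{j+1}{\alpha},1\right)\right].$$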

The generating function M(t) is useful for computing moments of the NHL distribution by differentiation. Thus, for values of the parameters α, λ and γ for which the above expansion converges, the nth derivative gives

$$M^{(n)}(t)=(-1)^{n}\frac{e}{1+\gamma}\sum_{i=n}^{\infty}(i)_{n}\frac{(\gamma-t)^{i-n}}{\lambda^{i}}\Psi_{i}(\alpha,\lambda,\gamma),$$

where $(i)_{n}=i(i-1)\cdots(i-n+1)$ is the falling factorial. Therefore,

$$\mu'_{n}=M^{(n)}(0)=(-1)^{n}\frac{e}{1+\gamma}\sum_{i=n}^{\infty}(i)_{n}\frac{\gamma^{i-n}}{\lambda^{i}}\Psi_{i}(\alpha,\lambda,\gamma).$$

It is possible to obtain an approximation for $\mu'_{n}$ by truncating the previous expansion. Hence, for N large enough,

$$\mu'_{n}\approx(-1)^{n}\frac{e}{1+\gamma}\sum_{i=n}^{N}(i)_{n}\frac{\gamma^{i-n}}{\lambda^{i}}\Psi_{i}(\alpha,\lambda,\gamma).\qquad(5)$$

We use the R software to compute $\mu'_{n}$ from (5) and compare the results with those obtained by numerical integration. Table II presents the results for some parameter values and N=5, 10, 15 and 20. They illustrate that the proposed expansion provides reasonable approximations for the moments. For n=1, note that the series converges for all selected parameterizations even for N as small as 5. More terms may be required when n increases. We provide the scripts for these calculations in Appendix A.
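A reference value for the raw moments can also be obtained by quadrature. The Python sketch below is an illustration under stated assumptions rather than the authors' script: instead of integrating $x^{n}f(x)$, it uses the standard identity $\mu'_{n}=\int_{0}^{\infty}n\,x^{n-1}[1-F(x)]\,dx$ for nonnegative random variables, which avoids the pdf. The names `simpson` and `raw_moment` and the tolerance `eps` are hypothetical.

```python
import math


def nhl_sf(x, alpha, lam, gamma):
    """NHL survival function: 1 - F(x), with F as in equation (1)."""
    return (1.0 + gamma + gamma * x) / (1.0 + gamma) * math.exp(
        1.0 - gamma * x - (1.0 + lam * x) ** alpha
    )


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def raw_moment(n, alpha, lam, gamma, eps=1e-14):
    """n-th raw moment via the survival-function identity, for integer n >= 1."""
    sf = lambda x: nhl_sf(x, alpha, lam, gamma)
    upper = 1.0
    # Extend the range until the tail contribution is negligible.
    while upper ** n * sf(upper) > eps:
        upper *= 2.0
    return simpson(lambda x: n * x ** (n - 1) * sf(x), 0.0, upper)


# First four raw moments for arbitrary illustrative parameter values.
print([raw_moment(n, 1.5, 0.8, 0.4) for n in (1, 2, 3, 4)])
```

For the exponential special case (γ=0, α=1) the output should agree with $n!/\lambda^{n}$, and for the Lindley special case (α=0) the mean should equal (γ+2)/[γ(γ+1)].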

TABLE II
First four moments obtained from (5) and by numerical integration.

The central moments ($\mu_{n}$) and cumulants ($\kappa_{n}$) of X can be determined from these raw moments using well-known relationships. An alternative measure of the skewness of X is given by (MacGillivray 1986)

$$\rho(u)=\rho(u;\alpha,\lambda,\gamma)=\frac{Q(1-u)+Q(u)-2Q(1/2)}{Q(1-u)-Q(u)},$$

where $u\in(0,1)$ and $Q(\cdot)=F^{-1}(\cdot)$ is the quantile function (qf) of X. If X follows a symmetric distribution, [Q(1-u)+Q(u)]/2 equals the median of X for all u, and the numerator of ρ(u) is zero. Thus, the farther ρ(u) is from the horizontal line ρ(u)=0, the higher the asymmetry. A decreasing ρ(u) indicates positive asymmetry. Plots of the MacGillivray skewness for some parameter values are displayed in Figure 3. Note that for λ=1.0 and γ=0.1, increasing α yields distributions that are less skewed to the left, which allows modeling heavy-tailed data. An opposite result is observed by setting α=0.5 and γ=0.1 and increasing λ. Less asymmetric distributions arise for higher values of λ.
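Since the NHL cdf has no closed-form inverse, the qf in ρ(u) has to be evaluated numerically. The following Python sketch shows one way to do this, using bisection on the cdf (1); the function names and the tolerance are assumptions for illustration, and any other root finder could be substituted.

```python
import math


def nhl_cdf(x, alpha, lam, gamma):
    """Cdf in equation (1)."""
    return 1.0 - (1.0 + gamma + gamma * x) / (1.0 + gamma) * math.exp(
        1.0 - gamma * x - (1.0 + lam * x) ** alpha
    )


def nhl_quantile(u, alpha, lam, gamma, tol=1e-12):
    """Numerical quantile Q(u), 0 < u < 1, by bisection on the cdf."""
    lo, hi = 0.0, 1.0
    # Expand the bracket until it contains the quantile.
    while nhl_cdf(hi, alpha, lam, gamma) < u:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if nhl_cdf(mid, alpha, lam, gamma) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def macgillivray_skewness(u, alpha, lam, gamma):
    """rho(u) for u in (0, 1), u != 0.5 (the denominator vanishes at 0.5)."""
    q_low = nhl_quantile(u, alpha, lam, gamma)
    q_high = nhl_quantile(1.0 - u, alpha, lam, gamma)
    median = nhl_quantile(0.5, alpha, lam, gamma)
    return (q_high + q_low - 2.0 * median) / (q_high - q_low)


# rho(u) on a small grid for arbitrary illustrative parameter values.
for u in (0.1, 0.2, 0.3, 0.4):
    print(u, macgillivray_skewness(u, 0.5, 1.0, 0.1))
```

In the exponential special case (γ=0, α=1) the quantile is −log(1−u)/λ, so ρ(0.25) reduces to log(4/3)/log(3), which can serve as a check of the implementation.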

Figure 3
Skewness of the NHL distribution for some parameter values.

4 - INCOMPLETE MOMENTS

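The sth incomplete moment of X is defined by $m_{s}(y)=\int_{0}^{y}x^{s}f(x)\,dx$. Thus, by inserting (2) in $m_{s}(y)$, we have

$$m_{s}(y)=\frac{e}{1+\gamma}\int_{0}^{y}x^{s}\left\{(\gamma+\gamma x+1)\left[\gamma+\alpha\lambda(1+\lambda x)^{\alpha-1}\right]-\gamma\right\}e^{-\gamma x-(1+\lambda x)^{\alpha}}\,dx.$$

Using the exponential and binomial expansions, we obtain

$$m_{s}(y)=\frac{(-1)^{s}e}{\alpha\lambda^{1+s}(1+\gamma)}\sum_{i=0}^{\infty}\frac{1}{i!}\left(\frac{\gamma}{\lambda}\right)^{i}\Psi_{s,i}(y;\alpha,\lambda,\gamma),\qquad(6)$$

where

$$\Psi_{s,i}(y;\alpha,\lambda,\gamma)=\sum_{j=0}^{s+i}\frac{(-1)^{j}(s+i)_{j}}{j!}\left[\alpha\gamma\,\Gamma^{*}_{y}\!\left(\frac{j+1}{\alpha}+1\right)+\frac{\gamma^{2}}{\lambda}\,\Gamma^{*}_{y}\!\left(\frac{j+2}{\alpha}\right)+\frac{\gamma^{2}(\lambda-1)}{\lambda}\,\Gamma^{*}_{y}\!\left(\frac{j+1}{\alpha}\right)+\alpha(\lambda-\gamma+\lambda\gamma)\,\Gamma^{*}_{y}\!\left(\frac{j}{\alpha}+1\right)\right],$$

and $\Gamma^{*}_{y}(a)=\gamma(a,(1+\lambda y)^{\alpha})-\gamma(a,1)=\int_{1}^{(1+\lambda y)^{\alpha}}t^{a-1}e^{-t}\,dt$.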

We compute approximations for $m_{s}(y)$ by truncating the sum in equation (6) at a finite upper limit. The results for some parameter values and y are shown in Table III. The expansion provides good approximations for the incomplete moments. We provide the scripts for these calculations in Appendix A.

TABLE III
Incomplete moments obtained from (6) and by numerical integration.

Equation (6) is the main result of this section. In various practical situations, the shape of many distributions can be usefully described by the incomplete moments. For example, the mean deviations about the mean and median and the Lorenz and Bonferroni curves are simple applications of the first incomplete moment. These curves are among the most widely used tools in inequality analysis. Also, note that (6) can be used to approximate $\mu'_{n}$ by setting a large value for y.

For a given probability π, the Bonferroni and Lorenz curves are defined by $B(\pi)=m_{1}(q)/(\pi\mu'_{1})$ and $L(\pi)=m_{1}(q)/\mu'_{1}$, respectively, where q=Q(π) is the qf of X evaluated at π. Plots of the Bonferroni and Lorenz curves for some parameter values are displayed in Figure 4.

Figure 4
(a) Bonferroni curves for the NHL model for some parameter values; (b) Lorenz curves for the NHL model for some parameter values.

Further, the dispersion of X can be measured to some extent by the mean deviations about the mean $\mu'_{1}$ and the median m, given by $\delta_{1}=2\mu'_{1}F(\mu'_{1})-2m_{1}(\mu'_{1})$ and $\delta_{2}=\mu'_{1}-2m_{1}(m)$, respectively, where m=Q(0.5).
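The quantities in this section depend only on the first incomplete moment, the mean and the qf, so they can be sketched numerically as below. This Python illustration is not the series (6): it assumes quadrature with the integration-by-parts identity $m_{1}(y)=\int_{0}^{y}[1-F(x)]\,dx-y[1-F(y)]$ and a bisection-based qf, and all function names are hypothetical.

```python
import math


def nhl_sf(x, alpha, lam, gamma):
    """NHL survival function: 1 - F(x), with F as in equation (1)."""
    return (1.0 + gamma + gamma * x) / (1.0 + gamma) * math.exp(
        1.0 - gamma * x - (1.0 + lam * x) ** alpha
    )


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def first_incomplete_moment(y, alpha, lam, gamma):
    """m_1(y) obtained by integration by parts from the survival function."""
    sf = lambda x: nhl_sf(x, alpha, lam, gamma)
    return simpson(sf, 0.0, y) - y * sf(y)


def nhl_mean(alpha, lam, gamma, eps=1e-14):
    """Mean as the integral of the survival function over (0, infinity)."""
    sf = lambda x: nhl_sf(x, alpha, lam, gamma)
    upper = 1.0
    while upper * sf(upper) > eps:
        upper *= 2.0
    return simpson(sf, 0.0, upper)


def nhl_quantile(u, alpha, lam, gamma, tol=1e-12):
    """Numerical quantile by bisection on the cdf."""
    lo, hi = 0.0, 1.0
    while 1.0 - nhl_sf(hi, alpha, lam, gamma) < u:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if 1.0 - nhl_sf(mid, alpha, lam, gamma) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lorenz(p, alpha, lam, gamma):
    q = nhl_quantile(p, alpha, lam, gamma)
    return first_incomplete_moment(q, alpha, lam, gamma) / nhl_mean(alpha, lam, gamma)


def bonferroni(p, alpha, lam, gamma):
    return lorenz(p, alpha, lam, gamma) / p


def mean_deviations(alpha, lam, gamma):
    """Return (delta_1, delta_2): deviations about the mean and the median."""
    mu = nhl_mean(alpha, lam, gamma)
    med = nhl_quantile(0.5, alpha, lam, gamma)
    cdf_mu = 1.0 - nhl_sf(mu, alpha, lam, gamma)
    d1 = 2.0 * mu * cdf_mu - 2.0 * first_incomplete_moment(mu, alpha, lam, gamma)
    d2 = mu - 2.0 * first_incomplete_moment(med, alpha, lam, gamma)
    return d1, d2


print(lorenz(0.5, 1.5, 0.8, 0.4), bonferroni(0.5, 1.5, 0.8, 0.4))
print(mean_deviations(1.5, 0.8, 0.4))
```

For the unit exponential special case (γ=0, α=1, λ=1) the known values L(π)=π+(1−π)log(1−π), δ₁=2/e and δ₂=log 2 can be used to check the code.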

5 - MAXIMUM LIKELIHOOD ESTIMATION

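Let $x_{1},\ldots,x_{n}$ be a sample of size n from the NHL(α,λ,γ) distribution and $\boldsymbol{\theta}=(\alpha,\lambda,\gamma)^{T}$ the parameter vector of interest. The log-likelihood function for $\boldsymbol{\theta}$ based on this sample is given by

$$\ell(\boldsymbol{\theta})=n-\gamma\sum_{i=1}^{n}x_{i}-\sum_{i=1}^{n}(1+\lambda x_{i})^{\alpha}-n\log(1+\gamma)+\sum_{i=1}^{n}\log\left\{(1+\gamma+\gamma x_{i})\left[\gamma+\alpha\lambda(1+\lambda x_{i})^{\alpha-1}\right]-\gamma\right\}.\qquad(7)$$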

The maximum likelihood estimates (MLEs) of the model parameters can be obtained by maximizing (7). There are several routines for numerical maximization, such as the optim function in R, PROC NLMIXED in SAS and the MaxBFGS sub-routine in Ox, among others. Alternatively, we can differentiate (7) and solve the resulting nonlinear likelihood equations using the quasi-Newton BFGS or Newton-Raphson algorithms. The score components can be obtained from the authors upon request.

Under standard regularity conditions and for large n, the distribution of $\hat{\boldsymbol{\theta}}-\boldsymbol{\theta}$ can be approximated by a trivariate normal $N_{3}(\mathbf{0},\mathbf{J}(\hat{\boldsymbol{\theta}})^{-1})$ distribution, where $\mathbf{J}(\boldsymbol{\theta})$ is the observed information matrix. This approximation can also be used for constructing confidence regions and testing hypotheses on the parameters.
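The objective function in (7) is simple to code. The sketch below gives a Python version of the log-likelihood only; the name `nhl_loglik` is hypothetical, and the maximization step is left to whichever numerical optimizer is available (the paper uses R's optim), typically by minimizing the negative of this function.

```python
import math


def nhl_loglik(theta, data):
    """Log-likelihood (7) for theta = (alpha, lam, gamma) and positive data."""
    alpha, lam, gamma = theta
    n = len(data)
    ll = n - n * math.log(1.0 + gamma)
    for x in data:
        inner = (1.0 + gamma + gamma * x) * (
            gamma + alpha * lam * (1.0 + lam * x) ** (alpha - 1.0)
        ) - gamma
        ll += -gamma * x - (1.0 + lam * x) ** alpha + math.log(inner)
    return ll


# Toy data, made up purely to show the call signature.
sample = [0.3, 0.9, 1.4, 2.2, 3.1]
print(nhl_loglik((1.5, 0.8, 0.4), sample))
```

With γ=0 and α=1 the function reduces to the exponential log-likelihood n log λ − λ Σxᵢ, which provides a simple correctness check.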

6 - SIMULATION STUDY

In this section, we provide a Monte Carlo simulation study to evaluate the adequacy of the MLEs of the parameters of the NHL distribution. The simulations are performed by generating observations from eight scenarios with different parameter combinations. Note that the cdf in (1) can only be inverted numerically. However, it is possible to generate random numbers by using the qfs of the Lindley and NH distributions.

Jodrá (2010) used the Lambert W function to generate random numbers from the Lindley distribution. If Y ∼ Lindley(γ) has cdf $F_{Y}(y)$, the qf of Y is

$$F_{Y}^{-1}(u)=-1-\frac{1}{\gamma}-\frac{1}{\gamma}W_{-1}\!\left[\frac{1+\gamma}{\exp(1+\gamma)}(u-1)\right],\quad 0<u<1,$$

where $W_{-1}$ is the negative branch of the Lambert W function, which can be implemented in R (lamW package) or C (GNU Scientific Library).

On the other hand, if Z ∼ NH(α,λ) has cdf $F_{Z}(z)$, the qf of Z is

$$F_{Z}^{-1}(u)=\frac{1}{\lambda}\left\{[1-\log(1-u)]^{1/\alpha}-1\right\},\quad 0<u<1.$$

Then, we can generate an observation x from the NHL distribution as follows:

  1. Generate $u_{1}$ and $u_{2}$ independently from the uniform 𝒰(0,1) distribution;

  2. Calculate $y=F_{Y}^{-1}(u_{1})$ (observation from the Lindley distribution);

  3. Calculate $z=F_{Z}^{-1}(u_{2})$ (observation from the NH distribution);

  4. Finally, calculate x=min(y,z).
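The steps above can be sketched in Python using only the standard library. One substitution is made and should be noted: because the standard library has no Lambert W function, the Lindley draw below uses the mixture representation from the introduction (exponential with probability γ/(1+γ), otherwise a gamma with shape 2 and rate γ) instead of the Lambert W quantile. The NH draw uses the qf given above. Function names are hypothetical, and α, λ and γ are assumed to be strictly positive.

```python
import math
import random


def rlindley(gamma, rng):
    """Lindley(gamma) draw via its exponential / gamma(2) mixture form."""
    if rng.random() < gamma / (1.0 + gamma):
        return rng.expovariate(gamma)  # exponential with rate gamma
    return rng.gammavariate(2.0, 1.0 / gamma)  # shape 2, scale 1/gamma


def rnh(alpha, lam, rng):
    """NH(alpha, lam) draw by inverting its cdf at a uniform variate."""
    u = rng.random()  # u lies in [0, 1), so log(1 - u) is finite
    return ((1.0 - math.log(1.0 - u)) ** (1.0 / alpha) - 1.0) / lam


def rnhl(alpha, lam, gamma, rng):
    """NHL draw as the minimum of independent Lindley and NH draws."""
    return min(rlindley(gamma, rng), rnh(alpha, lam, rng))


rng = random.Random(1)  # fixed seed only to make the illustration repeatable
sample = [rnhl(1.5, 0.8, 0.4, rng) for _ in range(5)]
print(sample)
```

Comparing the empirical cdf of a large sample with the cdf in (1) is a direct way to confirm that the generator targets the intended distribution.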

The number of Monte Carlo replications is set at N=10,000. We use the optim subroutine with analytical derivatives in R to maximize the log-likelihood function. We assess the performance of the estimators by computing the mean estimates and root mean square errors (RMSEs) from the N Monte Carlo replications. We take the sample sizes $n\in\{50,100,300,500\}$.

The simulation results are given in Table IV. As expected under first-order asymptotic theory, the mean estimates of the parameters tend to be closer to the true values and the RMSEs decrease when the sample size n increases for all scenarios. The MLEs of λ and γ are more accurate than those of α. It is noteworthy that the estimates of α have considerable biases, especially in small samples. Nevertheless, we can build bias-corrected point estimators through the bootstrap method; see Davison (1997).

TABLE IV
Monte Carlo results: mean estimates and RMSEs.

The non-parametric bootstrap procedure for bias correction applied in this paper can be described as follows:

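  1. Letting $x=(x_{1},\ldots,x_{n})$ be an observed random sample, we obtain the MLE $\hat{\boldsymbol{\theta}}$ for the NHL parameter vector $\boldsymbol{\theta}$;

  2. From the original data, generate a bootstrap sample $x^{*b}=(x_{1}^{*b},\ldots,x_{n}^{*b})$, that is, take a random sample of size n from x with replacement;

  3. Calculate the MLE of $\boldsymbol{\theta}$ from the bootstrap sample $x^{*b}$, namely $\hat{\boldsymbol{\theta}}^{*b}$;

  4. Repeat steps (2) and (3) a very large number B of times, thus obtaining $\hat{\boldsymbol{\theta}}^{*1},\ldots,\hat{\boldsymbol{\theta}}^{*B}$;

  5. Compute the bias estimate by $B(\hat{\boldsymbol{\theta}})=\left[\frac{1}{B}\sum_{b=1}^{B}\hat{\boldsymbol{\theta}}^{*b}\right]-\hat{\boldsymbol{\theta}}$;

  6. The bias-corrected estimate of $\boldsymbol{\theta}$ is given by $\tilde{\boldsymbol{\theta}}=\hat{\boldsymbol{\theta}}-B(\hat{\boldsymbol{\theta}})$.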
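The procedure above does not depend on the specific model, so it can be sketched with the estimator passed in as a function. In the Python illustration below, `bootstrap_bias_correct` is a hypothetical helper, and the exponential rate MLE is used purely as a stand-in estimator, since fitting the NHL model itself requires a numerical optimizer that is outside the scope of this sketch.

```python
import random


def bootstrap_bias_correct(data, estimator, B=1000, rng=None):
    """Non-parametric bootstrap bias correction.

    estimator maps a list of observations to a list of parameter estimates.
    Returns (theta_hat, bias_estimate, corrected_estimate).
    """
    rng = rng or random.Random()
    n = len(data)
    theta_hat = estimator(data)  # step 1: estimate from the original sample
    sums = [0.0] * len(theta_hat)
    for _ in range(B):  # steps 2 to 4: resample and re-estimate B times
        resample = [data[rng.randrange(n)] for _ in range(n)]
        for k, value in enumerate(estimator(resample)):
            sums[k] += value
    # Step 5: bootstrap mean minus the original estimate.
    bias = [s / B - t for s, t in zip(sums, theta_hat)]
    # Step 6: subtract the estimated bias.
    corrected = [t - b for t, b in zip(theta_hat, bias)]
    return theta_hat, bias, corrected


def exp_rate_mle(sample):
    """Stand-in estimator: MLE of the exponential rate (not the NHL MLE)."""
    return [len(sample) / sum(sample)]


toy = [0.2, 0.5, 0.9, 1.3, 2.8, 0.7, 1.1, 0.4]  # made-up data
print(bootstrap_bias_correct(toy, exp_rate_mle, B=500, rng=random.Random(7)))
```

To apply this to the NHL model, `exp_rate_mle` would be replaced by a routine returning the MLE of (α, λ, γ) for a given sample.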

In order to evaluate the effectiveness of this correction, we carry out a separate simulation experiment. The numbers of Monte Carlo and bootstrap replications are both equal to 1,000 and n varies in {20,30,50,70,100}. We study the performance of the bootstrap bias-corrected estimates by computing the total relative bias defined in Cribari-Neto and Soares (2003), which is given by the sum of the absolute values of the individual relative biases. Thus, the total relative bias is an aggregate measure of the biases of the parameter estimates.

The plots in Figure 5 show the behavior of the total relative bias for the uncorrected and corrected estimators. They reveal that the corrected estimators are more reliable than the MLEs in small samples. In fact, the corrected estimators outperform the usual MLEs in both scenarios. In addition, both estimators appear to be asymptotically unbiased, since their biases decrease when the sample size increases.

Figure 5
Total relative bias of the estimator for (a) Scenario 2 and (b) Scenario 5.

7 - APPLICATIONS

In this section, we fit the NHL distribution to two real data sets. They illustrate the potential of this distribution for modeling positive data. The first data set represents the times to reinfection of sexually transmitted diseases (STDs) for 877 patients. The data are taken from Section 1.12 of Klein and Moeschberger (1997). The second data set corresponds to the exceedances of flood peaks (in m³/s) of the Wheaton River near Carcross in Yukon Territory, Canada. The data consist of 72 exceedances for the years 1958-1984, rounded to one decimal place; see Choulakian and Stephens (2001). Table V provides a descriptive summary for both data sets.

TABLE V
Descriptive statistics.

Note that the first data set has large range and variance. Its measures of central tendency (mean, median and mode) differ considerably from one another. Besides, both data sets present positive values for the skewness and kurtosis. The exceedances of flood peaks data also present large range and variance, but their values are lower than those for the reinfection times.

For modeling these data sets, we fit the NHL distribution and also consider the fits of six related distributions: the exponentiated Nadarajah-Haghighi (ENH) distribution introduced by Lemonte (2013), with pdf

$$f(x)=\alpha\beta\lambda(1+\lambda x)^{\alpha-1}\exp\{1-(1+\lambda x)^{\alpha}\}\left[1-\exp\{1-(1+\lambda x)^{\alpha}\}\right]^{\beta-1},\quad x>0,$$

where α>0 and β>0 are shape parameters and λ>0 is a scale parameter; the exponentiated Weibull (EW), whose pdf is

$$f(x)=\alpha\beta\lambda x^{\alpha-1}\exp(-\lambda x^{\alpha})\left[1-\exp(-\lambda x^{\alpha})\right]^{\beta-1},\quad x>0,$$

where α>0 and β>0 are shape parameters and λ>0 is a scale parameter; the Weibull model, which arises from the EW model when β=1; and the NH, Lindley and exponential distributions, which are NHL special models.

We estimate the NHL parameters and the parameters of the above models by maximum likelihood. Based on the results in Section 6, the estimates for the exceedances of flood peaks data are obtained from the bootstrap bias-corrected estimators, with B=1,000. The goodness-of-fit statistics considered are: Akaike information criterion (AIC), consistent Akaike information criterion (CAIC), Kolmogorov-Smirnov (KS), Cramér-von Mises (W*) and Anderson-Darling (A*). The lower these statistics, the better the fit to the data. The MLEs and goodness-of-fit statistics are calculated using the AdequacyModel package in the R software.
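Two of these statistics have simple textbook definitions and can be sketched directly. The Python illustration below covers only the AIC (twice the number of parameters minus twice the maximized log-likelihood) and the KS distance between the empirical and fitted cdfs; it is not the AdequacyModel implementation, the function names are hypothetical, and the remaining statistics are omitted because their exact definitions as used in the paper are not stated here.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion from a maximized log-likelihood."""
    return 2.0 * n_params - 2.0 * loglik


def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical cdf jumps from (i - 1)/n to i/n at the i-th order statistic.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def nhl_cdf(x, alpha, lam, gamma):
    """Cdf in equation (1)."""
    return 1.0 - (1.0 + gamma + gamma * x) / (1.0 + gamma) * math.exp(
        1.0 - gamma * x - (1.0 + lam * x) ** alpha
    )


# Made-up data and parameter values, only to show how the pieces fit together.
toy = [0.3, 0.9, 1.4, 2.2, 3.1]
print(ks_statistic(toy, lambda x: nhl_cdf(x, 1.5, 0.8, 0.4)))
print(aic(-7.5, 3))
```

Passing the cdf as a function keeps `ks_statistic` usable for every competing model, so the same code can compare the NHL fit with the ENH, EW and other fits.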

Tables VI and VIII list the MLEs (and the corresponding standard errors in parentheses) of the unknown parameters for the models fitted to the first and second data sets, respectively. We note that all distributions present reasonable estimates for the standard errors. Tables VII and IX present the goodness-of-fit statistics for the reinfection times and flood peaks, respectively. According to these statistics, the NHL distribution provides a good fit and is quite competitive with the other distributions.

TABLE VI
MLEs for reinfection data and corresponding standard errors in parentheses.
TABLE VII
Goodness-of-fit statistics for the models fitted to reinfection times.
TABLE VIII
MLEs for exceedances of flood peaks data and corresponding standard errors in parentheses.
TABLE IX
Goodness-of-fit statistics for the models fitted to the exceedances of flood peaks.

For both data sets, the NHL distribution yields the best fit under all goodness-of-fit statistics. Figure 6 displays the histograms and the estimated densities for the three most competitive models according to their goodness-of-fit statistics. This graphical inspection also indicates the superiority of the new distribution for the reinfection times and flood peaks data. We also provide the TTT plots and estimated hazard functions of the fitted NHL model for the reinfection times and flood peaks data in Figures 7 and 8, respectively. They reveal that the NHL hazard functions for both fitted models present bathtub shapes, which is in agreement with their corresponding TTT plots. Figure 9 displays the plots of the estimated cdfs for the most competitive models and the empirical cumulative distribution function for both data sets. These plots illustrate the good fit of the NHL distribution. Hence, these results reveal that the proposed distribution can be a very effective alternative to the well-known Weibull, EW and ENH distributions, among others.

Figure 6
Histogram and estimated pdfs of (a) the NHL, ENH and EW models for the reinfection times; and (b) the NHL, ENH and Exp models for the exceedances of flood peaks.
Figure 7
(a) TTT plot; and (b) estimated hazard function for the reinfection times data.
Figure 8
(a) TTT plot; and (b) estimated hazard function for the exceedances of flood peaks data.
Figure 9
Empirical and estimated cdfs of (a) the NHL, ENH and EW models for the reinfection times; and (b) the NHL, ENH and Exp models for the exceedances of flood peaks.

8 - CONCLUSIONS

We introduce the Nadarajah-Haghighi Lindley (NHL) model by compounding the Lindley and Nadarajah-Haghighi distributions. Since it arises as the minimum of two independent continuous random variables, the proposed distribution might be useful in engineering for modeling the failure time of systems composed of two independent components in series. The NHL distribution has the Lindley, Nadarajah-Haghighi and exponential distributions as special cases. Further, it is a competitive model to the Weibull, exponentiated Weibull and exponentiated Nadarajah-Haghighi distributions, among others. We obtain some structural properties of the proposed distribution, estimate the parameters by maximum likelihood and provide two applications to real data sets. The new distribution is quite competitive with other classical lifetime models and yields good fits in both applications.

AUTHOR CONTRIBUTIONS

The participation of the authors in the production of the manuscript is as follows: first author, characterization of the new distribution, mathematical properties and implementation of computational routines. The second author, application, simulation studies, and computational routines. The third author, review and general correction of the paper.

10 - APPENDIX A

Scripts for approximating $m(x)$ by the truncated expansion (4).

Scripts for approximating $\mu'_{n}$ by the truncated expansion (5).

Scripts for approximating $m_{s}(y)$ by the truncated expansion (6).

REFERENCES

  • ADAMIDIS K AND LOUKAS S. 1998. A lifetime distribution with decreasing failure rate. Stat Probabil Lett 39: 35-42.
  • AL-MUTAIRI DK, GHITANY ME AND KUNDU D. 2015. Inferences on Stress-Strength Reliability from Weighted Lindley Distributions. Commun Stat Theory Methods 44: 4096-4113.
  • ASGHARZADEH A, BAKOUCH HS AND ESMAEILI L. 2013. Pareto Poisson–Lindley distribution with applications. J Appl Stat 40: 1717-1734.
  • ASHOUR SK AND ELTEHIWY MA. 2015. Exponentiated power Lindley distribution. J Adv Res 6: 895-905.
  • BAKOUCH HS, AL-ZAHRANI BM, AL-SHOMRANI AA, MARCHI VA AND LOUZADA F. 2012. An extended Lindley distribution. J Korean Stat Soc 41: 75-85.
  • BOURGUIGNON M, LIMA MDCS, LEÃO J, NASCIMENTO ADC, PINHO LGB AND CORDEIRO GM. 2015. A new generalized gamma distribution with applications. Am J Math 34: 309-342.
  • CHOULAKIAN V AND STEPHENS M. 2001. Goodness-of-fit for the generalized Pareto distribution. Technometrics 43: 478-484.
  • CORDEIRO G, ORTEGA E AND LEMONTE AJ. 2014. The exponential–Weibull lifetime distribution. J Stat Comput Sim 47: 626-653.
  • CRIBARI-NETO F AND SOARES ACN. 2003. Inferência em modelos heterocedásticos. Rev Bras Econ 57: 319-335.
  • DAVISON AC. 1997. Bootstrap methods and their application. Cambridge University Press.
  • GHITANY M, AL-MUTAIRI D AND NADARAJAH S. 2008a. Zero-truncated Poisson–Lindley distribution and its application. Math Comput Simulat 79: 279-287.
  • GHITANY M, ALQALLAF F, AL-MUTAIRI D AND HUSAIN H. 2011. A two-parameter weighted Lindley distribution and its applications to survival data. Math Comput Simulat 81: 1190-1201.
  • GHITANY M, ATIEH B AND NADARAJAH S. 2008b. Lindley distribution and its application. Math Comput Simulat 78: 493-506.
  • GHITANY ME, AL-MUTAIRI DK, BALAKRISHNAN N AND AL-ENEZI LJ. 2013. Power Lindley Distribution and Associated Inference. Comput Stat Data An 64: 20-33.
  • GUPTA PK AND SINGH B. 2013. Parameter estimation of Lindley distribution with hybrid censored data. Int J Syst Assur Eng Manag 4: 378-385.
  • JODRÁ P. 2010. Computer generation of random variables with Lindley or Poisson-Lindley distribution via the Lambert W function. Math Comput Simulat 81: 851-859.
  • KLEIN J AND MOESCHBERGER M. 1997. Survival Analysis Techniques for Censored and Truncated Data. New York: Springer Verlag.
  • KRISHNA H AND KUMAR K. 2011. Reliability estimation in Lindley distribution with progressively type II right censored sample. Math Comput Simulat 82: 281-294.
  • LEMONTE AJ. 2013. A new exponential-type distribution with constant, decreasing, increasing, upside-down bathtub and bathtub-shaped failure rate function. Comput Stat Data An 62: 149-170.
  • LEMONTE AJ, CORDEIRO GM AND ORTEGA EMM. 2014. On the Additive Weibull Distribution. Commun Stat Theory Methods 43: 2066-2080.
  • LINDLEY D. 1958. Fiducial distributions and Bayes’ theorem. J R Stat Soc B 20: 102-107.
  • MACGILLIVRAY H. 1986. Skewness and asymmetry: measures and orderings. Ann Stat 14: 994-1011.
  • MARINHO PRD. 2016. Some new families of continuous distributions. Master’s thesis. Doctoral thesis, Universidade Federal de Pernambuco.
  • MAZUCHELI J AND ACHCAR J. 2011. The Lindley distribution applied to competing risks lifetime data. Comput Meth Prog Bio 104: 188-192.
  • NADARAJAH S, BAKOUCH HS AND TAHMASBI R. 2011. A generalized Lindley distribution. Sankhya B 73: 331-359.
  • NADARAJAH S AND HAGHIGHI F. 2011. An extension of the exponential distribution. Statistics 45: 543-558.
  • ORTEGA EM, LEMONTE AJ, SILVA GO AND CORDEIRO GM. 2015. New flexible models generated by gamma random variables for lifetime modeling. J Appl Stat 42: 2159-2179.
  • POPOVIĆ B, RISTIĆ M AND CORDEIRO G. 2015. A two parameter distribution obtained by compounding the generalized exponential and exponential distributions. Mediterr J Math 15: 2935-2949.
  • SANKARAN M. 1970. The discrete Poisson–Lindley distribution. Biometrics 26: 145-149.
  • TAHIR MH, CORDEIRO GM, ALZAATREH A, MANSOOR M AND ZUBAIR M. 2016. The logistic-X family of distributions and its applications. Commun Stat Theory Methods 45: 7326-7349.
  • XIE M AND LAI C. 1995. Reliability analysis using an additive Weibull model with bathtub shaped failure rate function. Reliab Eng Syst Safe 52: 87-93.
  • ZAMANI H AND ISMAIL N. 2010. Negative binomial-Lindley distribution and its application. J Math Stat 6: 4-9.

Publication Dates

  • Publication in this collection
    08 Apr 2019
  • Date of issue
    2019

History

  • Received
    26 Oct 2017
  • Accepted
    9 July 2018
Academia Brasileira de Ciências, Rua Anfilófio de Carvalho, 29, 3º andar, 20030-060 Rio de Janeiro, RJ, Brazil. Tel: +55 21 3907-8100
E-mail: aabc@abc.org.br