A new model for describing remission times: the generalized beta-generated Lindley distribution

LIMA, MARIA DO CARMO S.; CORDEIRO, GAUSS M.; NASCIMENTO, ABRAÃO D.C.; SILVA, KÁSSIO F.

doi:10.1590/0001-3765201720160455

Abstract

New generators are required to define wider distributions for modeling real data in survival analysis. To that end we introduce the four-parameter generalized beta-generated Lindley distribution. It has explicit expressions for the ordinary and incomplete moments, mean deviations, generating and quantile functions. We propose a maximum likelihood procedure to estimate the model parameters, which is assessed through a Monte Carlo simulation study. We also derive an additional estimation scheme by means of least square between percentiles. The usefulness of the proposed distribution to describe remission times of cancer patients is illustrated by means of an application to real data.

GBG generator; remission times; Extended Lindley model; quantile function; Lambert function

INTRODUCTION

The statistical literature contains hundreds of continuous univariate distributions; see Johnson et al. (1994). Recent procedures for building meaningful distributions (called generators) have been proposed. Among the most prominent generators are the two-piece approach pioneered by Hansen (1994) and the beta family defined by Eugene et al. (2002) and Jones (2004).

Many papers have applied these techniques to provide more skewness in generalizations of well-known symmetric distributions. As an example, Aas and Haff (2006) presented an extension for the Student’s t-distribution.

Using the two-piece method with a view to finance applications, Zhu and Galbraith (2010) argued that, in addition to Student’s t parameters, three shape parameters are required: one parameter to control asymmetry in the center of a distribution and two parameters to control the left and right tail behavior.

This paper addresses similar issues to Zhu and Galbraith using a different approach. We consider the generalized beta generated (GBG) family of distributions pioneered by Alexander et al. (2012), which has three shape parameters.

The Lindley (L) distribution was first used by Lindley (1958) to measure the difference between fiducial and posterior distributions in Bayesian analysis. Its probability density function (pdf) (for $z > 0$ ) with parameter $λ > 0$ , say L $(λ)$ , is given by

g (z; λ) = \frac{λ^{2}}{1 + λ} (1 + z) e^{- λ z},

(1)

where $λ > 0$ is a scale parameter. Its cumulative distribution function (cdf) is given by

G (z; λ) = 1 - e^{- λ z} (1 + \frac{λ z}{1 + λ}) .

(2)

Ghitany et al. (2008) discussed and studied various properties of the pdf (1). The L distribution has an important role in stress-strength reliability modeling and describes some types of data sets well, but it has limited flexibility for modeling asymmetric and/or heavy-tailed data. Further, it can accommodate hazard rate functions (hrfs) that are increasing, decreasing or constant, but not unimodal, bathtub and other shapes that are desirable in lifetime data analysis. To overcome these limitations, several works proposed new distributions by adding parameters to the Lindley distribution. For example, Sankaran (2015) used this law as the mixing distribution of a Poisson parameter to generate a discrete model called the Poisson-Lindley distribution. Pararai et al. (2015) defined the Kumaraswamy Lindley-Poisson distribution and explored some of its properties. Another extension, named the generalized Lindley distribution, was studied by Ashour and Eltehiwy (2015).

A profusion of new classes of distributions has recently proven useful to applied statisticians working in various areas of scientific investigation. Generalizing existing distributions by adding shape parameters leads to more flexible models. Let $g (x; 𝝉)$ and $G (x; 𝝉)$ be the pdf and cdf of a baseline distribution having parameter vector $𝝉$ . Alexander et al. (2012) defined the pdf and cdf of the GBG-G distribution (for $x \in 𝒳 \subseteq ℝ$ ) using three additional positive shape parameters $a$ , $b$ and $c$ by

f_{𝒢 ℬ 𝒢} (x; 𝝉, a, b, c) = c B {(a, b)}^{- 1} g (x; 𝝉) G {(x; 𝝉)}^{a c - 1} {[1 - G {(x; 𝝉)}^{c}]}^{b - 1}

(3)

and

F_{GBG} (x; 𝝉, a, b, c) = I (G {(x; 𝝉)}^{c}; a, b) = B {(a, b)}^{- 1} \int_{0}^{G {(x; 𝝉)}^{c}} ω^{a - 1} {(1 - ω)}^{b - 1} d ω,

(4)

respectively, where $I (x; a, b)$ denotes the incomplete beta function ratio and $B (a, b)$ is the complete beta function.

In this paper, we propose a new lifetime model called the GBG-Lindley (GBGL) distribution. We also study some of its structural properties and present the maximum likelihood estimation of the parameters. A Monte Carlo study is performed in order to assess the proposed estimation procedure.

Further, we present evidence that the new model can (i) compensate for the limited flexibility of the Lindley distribution and (ii) produce better fits than the following distributions:

the Lindley-exponential (LE) model (Bhati and Malik 2015), whose pdf and cdf are, respectively, given by

$f_{ℒ ℰ} (x; α, λ) = \frac{α^{2} λ e^{- λ x} {(1 - e^{- λ x})}^{α - 1} [1 - \log (1 - e^{- λ x})]}{1 + α}$

and

$F_{ℒ ℰ} (x; α, λ) = \frac{{(1 - e^{- λ x})}^{α} [1 + α - α \log (1 - e^{- λ x})]}{1 + α};$

the generalized L (GL) model (Nadarajah et al. 2012), whose pdf and cdf are, respectively, given by

$f_{𝒢 ℒ} (x; λ, c) = \frac{c λ^{2}}{1 + λ} (1 + x) e^{- λ x} {(1 - \frac{1 + λ + λ x}{1 + λ} e^{- λ x})}^{c - 1}$ (5)

and

$F_{𝒢 ℒ} (x; λ, c) = {(1 - \frac{1 + λ + λ x}{1 + λ} e^{- λ x})}^{c};$ (6)

the transmuted Lindley (TL) model (Mansour and Mohamed 2015), whose pdf and cdf are, respectively, given by

$f_{𝒯 ℒ} (x; λ, θ, δ, α) = \frac{θ^{2}}{θ + 1} (1 + x) e^{- θ x} \{(1 + λ) δ {[1 - \frac{θ + 1 + θ x}{θ + 1} e^{- θ x}]}^{δ - 1} - λ α {[1 - \frac{θ + 1 + θ x}{θ + 1} e^{- θ x}]}^{α - 1}\}$

and

$F_{𝒯 ℒ} (x; λ, θ, δ, α) = (1 + λ) {[1 - \frac{θ + 1 + θ x}{θ + 1} e^{- θ x}]}^{δ} - λ {[1 - \frac{θ + 1 + θ x}{θ + 1} e^{- θ x}]}^{α} .$

This comparison is performed on two data sets: lifetimes of items subjected to a change in stress and remission times (in months) of cancer patients.

This paper is organized as follows. In Section 2, we introduce the GBGL distribution and provide plots of its density function and hrf. We derive linear representations for the pdf and cdf (Section 3), explicit expressions for the quantile function (qf) (Section 4), ordinary and incomplete moments, mean deviations, Bonferroni and Lorenz curves (Section 5) and generating function (Section 6). A procedure for determining the maximum likelihood estimates (MLEs) of the model parameters is addressed in Section 7. Section 8 presents empirical results for the proposed model. Concluding remarks are offered in Section 9.

THE GBGL DISTRIBUTION

Inserting (1) and (2) in equations (3) and (4), the pdf and cdf of the GBGL distribution (for $x > 0$ ) are, respectively, given by

f_{GBGL} (x; λ, a, b, c) = \frac{c λ^{2} (1 + x)}{(1 + λ) B (a, b)} e^{- λ x} {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{a c - 1} {\{1 - {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{c}\}}^{b - 1}

(7)

and

F_{GBGL} (x; λ, a, b, c) = I ({[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{c}; a, b) .

(8)

For simplicity, we denote $f_{GBGL} (x; λ, a, b, c)$ and $F_{GBGL} (x; λ, a, b, c)$ by $f (x)$ and $F (x)$ . Hereafter, a random variable $X$ having density (7) is denoted by $X \sim$ GBGL $(λ, a, b, c)$ .
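As a minimal sketch of equations (7) and (8), the pdf and cdf can be evaluated numerically as follows. The function names are illustrative choices, not part of the paper, and the code assumes NumPy and SciPy are available; `scipy.special.betainc` is the incomplete beta function ratio $I (x; a, b)$ .

```python
import numpy as np
from scipy.special import betainc, betaln


def lindley_pdf(x, lam):
    """Lindley pdf, equation (1)."""
    x = np.asarray(x, dtype=float)
    return lam**2 / (1.0 + lam) * (1.0 + x) * np.exp(-lam * x)


def lindley_cdf(x, lam):
    """Lindley cdf, equation (2)."""
    x = np.asarray(x, dtype=float)
    return 1.0 - np.exp(-lam * x) * (1.0 + lam * x / (1.0 + lam))


def gbgl_pdf(x, lam, a, b, c):
    """GBGL pdf, equation (7), assembled on the log scale for stability."""
    G = lindley_cdf(x, lam)
    log_f = (np.log(c) - betaln(a, b) + np.log(lindley_pdf(x, lam))
             + (a * c - 1.0) * np.log(G)
             + (b - 1.0) * np.log1p(-G**c))
    return np.exp(log_f)


def gbgl_cdf(x, lam, a, b, c):
    """GBGL cdf, equation (8): incomplete beta ratio evaluated at G(x)^c."""
    return betainc(a, b, lindley_cdf(x, lam) ** c)
```

Setting `a = b = c = 1` in `gbgl_pdf` should reproduce `lindley_pdf`, which gives a quick sanity check of the implementation.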

Clearly, the L distribution arises as the basic exemplar by taking $a = b = c = 1$ in (7). As mentioned in the introduction, we motivate the paper by comparing the performance of the new distribution with those of the L, LE and GL models fitted to a real data set.

The qf is useful for determining various mathematical properties of a distribution. For a positive random variable $X \sim F$ , the qf of $X$ is defined from the generalized inverse of its cdf for a fixed probability $u$ , namely

Q_{X} (u) = \inf \{x \in ℝ^{+} : u \leq F (x)\}, \quad u \in (0, 1) .

Then, the qf of the GBGL model can be determined by inverting (8) as

Q_{GBGL} (u) = Q_{L} ({[Q_{β (a, b)} (u)]}^{1 / c}),

(9)

where $Q_{β (a, b)} (u) = I^{- 1} (u; a, b)$ is the beta qf and $Q_{L} (u)$ is the qf of the L distribution with parameter $λ$ .

Consider the Lambert W-function, defined as the solution $w = W (z)$ of $z = w e^{w}$ . Its principal branch (ProductLog[z] in the software Mathematica) has the power series expansion

W (z) = \sum_{i = 1}^{\infty} \frac{{(- 1)}^{i + 1} i^{i - 2} z^{i}}{(i - 1)!} .

Then, we obtain

W (z) = z - z^{2} + \frac{3 z^{3}}{2} - \frac{8 z^{4}}{3} + \frac{125 z^{5}}{24} - \frac{54 z^{6}}{5} + \frac{16807 z^{7}}{720} + O (z^{8}) .
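The truncated expansion can be checked with exact rational arithmetic. The sketch below uses only the Python standard library; the helper name `lambert_series_coeffs` is hypothetical. It relies on the identity $i^{i - 2} / (i - 1)! = i^{i - 1} / i!$ to avoid a negative exponent at $i = 1$ . The series represents the principal branch and converges for $| z | < 1 / e$ .

```python
from fractions import Fraction
from math import factorial


def lambert_series_coeffs(n_terms):
    """Exact coefficients (-1)^(i+1) * i^(i-2) / (i-1)! for i = 1..n_terms."""
    coeffs = []
    for i in range(1, n_terms + 1):
        sign = 1 if i % 2 == 1 else -1
        # i^(i-2)/(i-1)! rewritten as i^(i-1)/i! so the power is never negative
        coeffs.append(Fraction(sign * i ** (i - 1), factorial(i)))
    return coeffs


for i, coef in enumerate(lambert_series_coeffs(7), start=1):
    print(f"z^{i}: {coef}")
```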

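The qf of $X$ can be expressed in terms of the Lambert function as

Q_{GBGL} (u) = Q_{L} (Q_{β (a, b)} {(u)}^{1 / c}) = - 1 - \frac{1}{λ} - \frac{1}{λ} W_{- 1} (\frac{1 + λ}{e^{1 + λ}} [Q_{β (a, b)} {(u)}^{1 / c} - 1]),

where $W_{- 1}$ denotes the negative real branch of the Lambert function, which is the branch required here because the argument lies in $(- 1 / e, 0)$ and $X > 0$ , and the last identity holds based on a result given by Jodrá (2010).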
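A minimal sketch of the quantile function and of inverse-transform sampling follows, assuming SciPy is available. The function names are illustrative. Since the argument passed to the Lambert function lies in $(- 1 / e, 0)$ , the branch that returns positive quantiles is the lower real branch, selected with `k=-1` in `scipy.special.lambertw`.

```python
import numpy as np
from scipy.special import betaincinv, lambertw


def lindley_qf(u, lam):
    """Lindley quantile function via the lower real branch of Lambert W."""
    u = np.asarray(u, dtype=float)
    z = (1.0 + lam) * (u - 1.0) * np.exp(-(1.0 + lam))  # lies in (-1/e, 0)
    w = lambertw(z, k=-1).real
    return -1.0 - 1.0 / lam - w / lam


def gbgl_qf(u, lam, a, b, c):
    """GBGL quantile function, equation (9): Q_L(Q_beta(u)^(1/c))."""
    return lindley_qf(betaincinv(a, b, u) ** (1.0 / c), lam)


def gbgl_rvs(n, lam, a, b, c, seed=None):
    """Draw n GBGL variates by evaluating the qf at uniform outcomes."""
    rng = np.random.default_rng(seed)
    return gbgl_qf(rng.uniform(size=n), lam, a, b, c)
```

For example, `gbgl_rvs(1000, 2, 2, 2, 2, seed=1)` would generate a sample at the parameter point $(λ, a, b, c) = (2, 2, 2, 2)$ .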

Figure 1 displays possible shapes of the pdf and hrf of the GBGL model for some parameter values. The hrf can take the four most common forms for applications to real data: increasing, decreasing, bathtub and unimodal shapes, which is an important characteristic of the new lifetime model. Figure 1(c) illustrates random number generation at $(λ, a, b, c) = (2, 2, 2, 2)$ , obtained by evaluating $Q_{GBGL} (u)$ at outcomes $u$ of the uniform distribution on $(0, 1)$ .

Figure 1: (a) Pdfs; (b) Hrfs; (c) Illustration of random number generator.

The skewness (B) and kurtosis (K) coefficients are two important tools to understand a distribution. Simple quantile-based procedures to quantify $B$ and $K$ were proposed by Bowley (1920) and Moors (1984), respectively. For our proposal, they are given by

B = \frac{Q_{GBGL} (3 / 4) + Q_{GBGL} (1 / 4) - 2 Q_{GBGL} (1 / 2)}{Q_{GBGL} (3 / 4) - Q_{GBGL} (1 / 4)}

and

K = \frac{[Q_{GBGL} (7 / 8) - Q_{GBGL} (5 / 8)] + [Q_{GBGL} (3 / 8) - Q_{GBGL} (1 / 8)]}{Q_{GBGL} (6 / 8) - Q_{GBGL} (2 / 8)} .

Figures 2(a)-2(c) and 2(d)-2(f) display the GBGL skewness and kurtosis measures for some parameter values, respectively. The former quantity indicates how symmetric the model is, while the latter measures how the shape of the model compares with that of the Gaussian law. These plots indicate that both symmetric and asymmetric laws can be obtained from our model, that it is easier to specify curves with a long right tail, and that density curves distinct from the Gaussian one are obtained.
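Both measures depend on the distribution only through its quantile function, so they can be sketched for any qf passed as a callable. The helper names below are hypothetical, and the exponential qf is used purely as an illustration. Bowley's measure uses the quartiles and the median $Q (1 / 2)$ ; Moors' measure uses the octiles.

```python
import math


def bowley_skewness(qf):
    """Bowley's quartile skewness: (Q3 + Q1 - 2*median) / (Q3 - Q1)."""
    q1, q2, q3 = qf(0.25), qf(0.5), qf(0.75)
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)


def moors_kurtosis(qf):
    """Moors' octile kurtosis: ((E7 - E5) + (E3 - E1)) / (E6 - E2)."""
    e = [qf(k / 8.0) for k in range(1, 8)]  # e[k - 1] holds Q(k/8)
    return ((e[6] - e[4]) + (e[2] - e[0])) / (e[5] - e[1])


def exponential_qf(u):
    """Quantile function of the unit exponential law (illustration only)."""
    return -math.log(1.0 - u)


print(bowley_skewness(exponential_qf), moors_kurtosis(exponential_qf))
```

Passing a GBGL quantile function with fixed parameters, for instance through a `lambda`, would give the quantities plotted in Figure 2.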

Figure 2: (a) GBGL skewness × a; (b) GBGL skewness × b; (c) GBGL skewness × c; (d) GBGL kurtosis × a; (e) GBGL kurtosis × b; (f) GBGL kurtosis × c.

LINEAR REPRESENTATIONS

In this section, we present linear representations for (7) and (8) in order to obtain explicit expressions for some moment-type quantities of the GBGL model. We prove that the expansions, in the form of Theorem 1 and Corollary 1, depend only on the GL distribution (Nadarajah et al. 2012).

Theorem 1. The pdf of $X \sim GBGL (λ, a, b, c)$ can be expressed by the linear combination

f (x) = \sum_{l = 0}^{\infty} ζ_{l} g_{l} (x),

where $g_{l} (x)$ denotes the GL density with scale and shape parameters $λ$ and $(a + l) c$ , respectively, and

ζ_{l} = \frac{{(- 1)}^{l}}{(a + l) B (a, b)} \binom{b - 1}{l} .

The proof of this theorem is given in Appendix A.

Corollary 1. The cdf of $X$ is given by

F (x) = \sum_{l = 0}^{\infty} ζ_{l} G_{l} (x),

where $G_{l} (x)$ denotes the GL cdf with parameters $λ$ and $(a + l) c$ .
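The linear representation of Theorem 1 can be checked numerically by comparing a truncated sum with the closed-form density (7). This is a sketch under the assumption that SciPy is available; the function names and the truncation point `n_terms` are illustrative. When $b$ is a positive integer, the generalized binomial coefficient vanishes for $l \geq b$ and the sum is finite.

```python
import numpy as np
from scipy.special import beta as beta_fn
from scipy.special import binom


def lindley_pdf(x, lam):
    return lam**2 / (1.0 + lam) * (1.0 + x) * np.exp(-lam * x)


def lindley_cdf(x, lam):
    return 1.0 - np.exp(-lam * x) * (1.0 + lam * x / (1.0 + lam))


def gl_pdf(x, lam, shape):
    """GL density: shape * g(x) * G(x)^(shape - 1)."""
    return shape * lindley_pdf(x, lam) * lindley_cdf(x, lam) ** (shape - 1.0)


def gbgl_pdf(x, lam, a, b, c):
    """Closed-form GBGL density, equation (7)."""
    G = lindley_cdf(x, lam)
    return (c / beta_fn(a, b) * lindley_pdf(x, lam)
            * G ** (a * c - 1.0) * (1.0 - G**c) ** (b - 1.0))


def gbgl_pdf_series(x, lam, a, b, c, n_terms=60):
    """Truncated linear combination of GL densities from Theorem 1."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for l in range(n_terms):
        zeta = (-1.0) ** l * binom(b - 1.0, l) / ((a + l) * beta_fn(a, b))
        total = total + zeta * gl_pdf(x, lam, (a + l) * c)
    return total
```

The number of terms needed depends on how close $G {(x)}^{c}$ is to one, so the truncation should be increased in the right tail.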

The following results indicate that moment-type quantities of the GL model can be obtained from the corresponding quantities of the gamma distribution.

Theorem 2. The cdf of $Z \sim GL (λ, c)$ can be expressed as

G (z) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} w_{i, k} H_{i, k} (z),

(10)

where $H_{i, k} (z)$ denotes the gamma cdf with shape parameter $(k + 1)$ and scale parameter $(i + 1) λ$ , respectively,

w_{i, k} = \frac{{(- 1)}^{i} v_{i, k}}{{(1 + λ)}^{i + 2} {(1 + i)}^{k + 1}} \binom{c - 1}{i}

(11)

and

v_{i, k} = \sum_{j = δ_{k}}^{i} λ^{j - k + 1} k! \binom{i}{j} \binom{j + 1}{k} .

The proof of this theorem is given in Appendix B.

Corollary 2. The pdf of $Z \sim GL (λ, c)$ is given by

f (z) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} w_{i, k} h_{i, k} (z),

(12)

where $h_{i, k} (z)$ denotes the gamma density with shape parameter $(k + 1)$ and scale parameter $(i + 1) λ$ .

Finally, the main result of this section provides a simple way for obtaining the properties of the new model by means of the classical gamma model.

Theorem 3. As consequences of Theorem 1 and Corollary 2, we can write the density of $X$ as

f (x) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} τ_{i, k} h_{i, k} (x),

where

τ_{i, k} = \sum_{l = 0}^{\infty} ζ_{l} w_{i, k} (l), \quad w_{i, k} (l) = \frac{{(- 1)}^{i} v_{i, k}}{{(1 + λ)}^{i + 2} {(1 + i)}^{k + 1}} \binom{(a + l) c - 1}{i},

and $v_{i, k}$ and $h_{i, k} (x)$ are defined in Theorem 2 and Corollary 2, respectively.

The proof of this theorem is given in Appendix C.

QUANTILE FUNCTION

For some models, it is possible to invert the cdf in closed form. However, for other distributions, the inverse of the cdf cannot be obtained in closed form. We shall resort to power series methods for the GBGL model, which are at the heart of many solutions in applied mathematics and statistics. First, based on equation (2), we have the following theorem for the qf of the L model.

Theorem 4. The L qf can be expressed as a power series

Q_{L} (u) = \sum_{n = 0}^{\infty} t_{n} u^{n},

where $t_{n} = \sum_{k = n + 1}^{\infty} {(- 1)}^{k - n} \binom{k}{n} π_{k}$ . The quantity $π_{k}$ and the proof of this theorem are given in Appendix D.

In the following, we use an equation of Gradshteyn and Ryzhik (2000) for a power series raised to a positive integer $j$

{(\sum_{i = 0}^{\infty} a_{i} x^{i})}^{j} = \sum_{i = 0}^{\infty} c_{j, i} x^{i},

(13)

where the coefficients $c_{j, i}$ (for $i = 1, 2, \dots$ ) are determined from the recurrence equation (for $i \geq 1$ )

c_{j, i} = {(i a_{0})}^{- 1} \sum_{m = 1}^{i} [m (j + 1) - i] a_{m} c_{j, i - m}

(14)

and $c_{j, 0} = a_{0}^{j}$ . The coefficient $c_{j, i}$ follows from $c_{j, 0}, \dots, c_{j, i - 1}$ and then from the quantities $a_{0}, \dots, a_{i}$ .
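The recurrence (14) is straightforward to implement. The sketch below is illustrative, with a hypothetical function name; it assumes $a_{0} \neq 0$ and treats coefficients beyond the end of the input list as zero.

```python
def power_series_power(a, j, n_terms):
    """Coefficients c_{j,0}, ..., c_{j,n_terms-1} of (sum_i a[i] x^i)^j.

    Uses c_{j,0} = a[0]^j and the recurrence (14); requires a[0] != 0.
    """
    if a[0] == 0:
        raise ValueError("the recurrence requires a nonzero constant term")
    c = [a[0] ** j]
    for i in range(1, n_terms):
        s = 0.0
        for m in range(1, i + 1):
            a_m = a[m] if m < len(a) else 0.0  # missing coefficients are zero
            s += (m * (j + 1) - i) * a_m * c[i - m]
        c.append(s / (i * a[0]))
    return c


# (1 + x)^3 = 1 + 3x + 3x^2 + x^3
print(power_series_power([1.0, 1.0], 3, 5))
```

Each coefficient costs a sum over the previously computed ones, so obtaining the first $N$ coefficients takes on the order of $N^{2}$ operations.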

Corollary 3. The GBGL qf can be expanded as

Q_{GBGL} (u) = \sum_{j = 0}^{\infty} e_{j} u^{j / a},

(15)

where $e_{j} = \sum_{i, r = 0}^{\infty} t_{i} s_{r} (i / c) η_{r, j}$ , $η_{r, j} = {(j {\bar{θ}}_{0})}^{- 1} \sum_{m = 1}^{j} [m (r + 1) - j] {\bar{θ}}_{m} η_{r, j - m}$ and ${\bar{θ}}_{i}$ is given in Appendix D.

MOMENTS

Henceforth, let $Y_{i, k} \sim Gamma (k + 1, (i + 1) λ)$ . Next, we obtain the ordinary and incomplete moments of $X$ from the corresponding moments of $Y_{i, k}$ . Based on Theorem 3, we can write

μ_{n}^{'} = E (X^{n}) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} τ_{i, k} E (Y_{i, k}^{n}) .

We have the following corollary from the moments of $Y_{i, k}$ .

Corollary 4. Suppose that $μ_{n}^{'} = E (X^{n})$ exists. Then,

μ_{n}^{'} = E (X^{n}) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} τ_{i, k} {[(i + 1) λ]}^{n} {(k + 1)}^{[n]},

(16)

where $k^{[n]} = k (k + 1) \dots (k + n - 1), n \in N$ .

Further, we can express $μ_{n}^{'}$ in terms of $Q_{L} (u)$ as

μ_{n}^{'} = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} τ_{i, k} \int_{0}^{1} Q_{L} {(u)}^{n} u^{a (i + c) - 1} 𝑑 u .

Thus, an alternative expansion for $μ_{n}^{'}$ can be obtained from Theorem 4 in the following corollary.

Corollary 5. Suppose that $μ_{n}^{'} = E (X^{n})$ exists. Then,

μ_{n}^{'} = \sum_{i, j = 0}^{\infty} \sum_{k = 0}^{i + 1} \frac{τ_{i, k} f_{n, j}}{[a (i + c) + j]},

(17)

where the quantities $f_{n, j}$ are determined from (13)-(14) as

f_{n, j} = {(j t_{0})}^{- 1} \sum_{m = 1}^{j} [m (n + 1) - j] t_{m} f_{n, j - m}

for $j \geq 1$ , $f_{n, 0} = t_{0}^{n}$ , $t_{m} = \sum_{l = m + 1}^{\infty} {(- 1)}^{l - m} \binom{l}{m} π_{l}$ and the quantity $π_{l}$ is defined in Appendix D.

Next, we obtain the incomplete moments of $X$ .

Corollary 6. Suppose that the $n$ th incomplete moment of $X$ , say $T_{n} (y) = \int_{0}^{y} x^{n} f (x) dx$ , exists. Then,

T_{n} (y) = \int_{0}^{y} x^{n} \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} w_{i, k} \frac{(λ + λ i)}{k!} {[(k + 1) x]}^{λ + λ i - 1} \exp [- (λ + λ i) x] d x = \sum_{i = 0}^{\infty} ϵ_{i} y^{λ + λ i + n} {[(λ + λ i) y]}^{- (λ + λ i + n)} [Γ (λ + λ i + n) - Γ (λ + λ i + n, (λ + λ i) y)],

(18)

where the quantity $w_{i, k}$ is defined in (11) and $ϵ_{i} = \sum_{k = 0}^{i + 1} w_{i, k} \frac{λ (1 + i)}{k!} {(k + 1)}^{λ (1 + i) - 1}$ .

Equations (16), (17) and (18) are the main results of this section.

The amount of scatter in a population is evidently measured to some extent by the totality of deviations from the mean and median given by $δ_{1} = 𝔼 (| X - μ_{1}^{'} |)$ and $δ_{2} = 𝔼 (| X - M |)$ , respectively. They can be expressed in terms of the first incomplete moment by $δ_{1} = 2 μ_{1}^{'} F (μ_{1}^{'}) - 2 T_{1} (μ_{1}^{'})$ and $δ_{2} = μ_{1}^{'} - 2 T_{1} (M)$ , respectively, where $F (μ_{1}^{'})$ follows from (8) and $T_{1} (\cdot)$ is the first incomplete moment given by (18) with $n = 1$ .

Another important application of the first incomplete moment refers to the Bonferroni and Lorenz curves defined (for a given probability $p$ ) by $B (p) = T_{1} (x_{p}) / (p μ_{1}^{'})$ and $L (p) = T_{1} (x_{p}) / μ_{1}^{'}$ , respectively, where $x_{p}$ can be evaluated numerically by (9) with $u = p$ . These curves are very useful in economics, demography, insurance, engineering and medicine.
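The mean deviations and the two curves can also be approximated without the series (18), by computing the first incomplete moment with numerical quadrature. The sketch below takes any pdf, cdf and qf as callables; the helper names are hypothetical, and the unit exponential law is used only because its values are known in closed form ( $δ_{1} = 2 / e$ and $δ_{2} = \log 2$ ).

```python
import math
from scipy.integrate import quad


def first_incomplete_moment(pdf, y):
    """T_1(y): integral of x * pdf(x) over (0, y), by numerical quadrature."""
    value, _ = quad(lambda x: x * pdf(x), 0.0, y)
    return value


def mean_deviations(pdf, cdf, qf):
    """Return (delta_1, delta_2), the mean deviations about mean and median."""
    mu = first_incomplete_moment(pdf, math.inf)
    median = qf(0.5)
    delta1 = 2.0 * mu * cdf(mu) - 2.0 * first_incomplete_moment(pdf, mu)
    delta2 = mu - 2.0 * first_incomplete_moment(pdf, median)
    return delta1, delta2


def bonferroni_lorenz(pdf, qf, p):
    """Return (B(p), L(p)) with B(p) = T_1(x_p)/(p*mu) and L(p) = T_1(x_p)/mu."""
    mu = first_incomplete_moment(pdf, math.inf)
    t1 = first_incomplete_moment(pdf, qf(p))
    return t1 / (p * mu), t1 / mu


# Illustration with the unit exponential distribution
exp_pdf = lambda x: math.exp(-x)
exp_cdf = lambda x: 1.0 - math.exp(-x)
exp_qf = lambda u: -math.log(1.0 - u)
print(mean_deviations(exp_pdf, exp_cdf, exp_qf))
```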

Figure 3 displays plots of the Bonferroni and Lorenz curves for selected parameter values.

Figure 3: (a) Bonferroni curves; (b) Lorenz curves.

The $n$ th moment of the residual life, say $v_{n} (t) = E [{(X - t)}^{n} ∣ X > t]$ (for $n = 1, 2, \dots$ ) uniquely determines $F (x)$ . It is given by $v_{n} (t) = \frac{1}{R (t)} \int_{t}^{\infty} {(x - t)}^{n} f (x) dx$ , which is easily obtained from (18). A special case is the mean residual life (MRL) function at age $t$ given by $v_{1} (t) = E [(X - t) ∣ X > t]$ , which represents the expected additional life length for a unit which is alive at age $t$ .

The $n$ th moment of the reversed residual life given by $M_{n} (t) = \frac{1}{F (t)} \int_{0}^{t} {(t - x)}^{n} f (x) dx$ , (for $t > 0$ and $n = 1, 2, \dots$ ) uniquely determines $F (x)$ and follows from $v_{n} (t)$ .

GENERATING FUNCTION

A first representation for the moment generating function (mgf) $M (s)$ of $X$ can be based on the L qf. We can write

M (s) = \int_{0}^{1} \exp [s Q_{GBGL} (u)] d u .

Expanding the exponential function, and after some algebra using (15), we have the following corollary.

Corollary 7. The mgf of $X$ can be expressed as

M (s) = c B {(a, b + 1)}^{- 1} \sum_{i = 0}^{\infty} {(- 1)}^{i} \binom{b}{i} ρ (s, a [i + c] - 1),

(19)

where

ρ (s, a [i + c] - 1) = \int_{0}^{1} \exp [s Q_{L} (u)] u^{a (i + c) - 1} d u = \sum_{j, k = 0}^{\infty} \frac{s^{k} d_{k, j}}{[a (i + c) + j] k!},

$d_{k, j} = {(j t_{0})}^{- 1} \sum_{m = 1}^{j} [m (k + 1) - j] t_{m} d_{k, j - m}$ (for $j \geq 1$ ), $d_{k, 0} = t_{0}^{k}$ and the coefficients $t_{j}$ are defined in Theorem 4.

A second representation for $M (s)$ comes from the gamma generating function. We can write

M (s) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} w_{i, k} M_{i, k} (s),

where $w_{i, k}$ is defined by (11) and $M_{i, k} (s)$ is the mgf of $Y_{i, k}$ given by

M_{i, k} (s) = \frac{1}{{[1 - λ s (1 + i)]}^{k + 1}}, s < λ^{- 1} .

(20)

Equations (19) and (20) are the main results of this section.

ESTIMATION

Several approaches for parameter estimation have been proposed in the statistical literature, but the maximum likelihood method is the most commonly employed. The MLEs enjoy desirable properties for constructing confidence intervals. In this section, we investigate the estimation of the parameters of the GBGL distribution by maximum likelihood for complete data sets. Alternatively, we propose another estimation procedure that relies on the squared distance between theoretical and empirical GBGL quantiles. Both estimation methods will be compared in the next section of numerical results.

MAXIMUM LIKELIHOOD ESTIMATION

Consider a random variable $X \sim$ GBGL $(a, b, c, λ)$ and let $𝜽 = {(a, b, c, λ)}^{T}$ be the parameter vector. Thus, the associated log-likelihood function for one observation $x$ is

ℓ (𝜽; x) = \log (c) + 2 \log (λ) + \log (1 + x) - \log (1 + λ) - \log [B (a, b)] - λ x + (a c - 1) \log [1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})] + (b - 1) \log \{1 - {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{c}\} .

(21)

The MLE of $𝜽$ is determined by maximizing $l_{n} (𝜽) = \sum_{i = 1}^{n} ℓ (𝜽; x_{i})$ for a given data set $x_{1}, \dots, x_{n}$ . This function can be maximized either directly, by using R (optim function), SAS (PROC NLMIXED) or the Ox program (sub-routine MaxBFGS), or by solving the nonlinear likelihood equations obtained by differentiating it.
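As an illustration of direct maximization, the sketch below uses Python with SciPy instead of the packages named above; this is a substitution, not the authors' implementation. The log-likelihood is the logarithm of the density (7) summed over the sample. The function names, the starting point, the Nelder-Mead method and the log-parametrization that keeps all four parameters positive are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln


def gbgl_loglik(theta, x):
    """Total log-likelihood: sum over the sample of the log of density (7)."""
    a, b, c, lam = theta
    x = np.asarray(x, dtype=float)
    G = 1.0 - np.exp(-lam * x) * (1.0 + lam * x / (1.0 + lam))  # Lindley cdf
    terms = (np.log(c) + 2.0 * np.log(lam) + np.log1p(x) - np.log1p(lam)
             - betaln(a, b) - lam * x
             + (a * c - 1.0) * np.log(G)
             + (b - 1.0) * np.log1p(-G**c))
    return float(np.sum(terms))


def fit_gbgl_mle(x, start=(1.0, 1.0, 1.0, 1.0)):
    """Maximize the log-likelihood over log-parameters (keeps them positive)."""
    def negative_loglik(log_theta):
        value = -gbgl_loglik(np.exp(log_theta), x)
        return value if np.isfinite(value) else 1e300  # guard against nan/inf

    result = minimize(negative_loglik, np.log(start), method="Nelder-Mead",
                      options={"maxiter": 2000, "xatol": 1e-6, "fatol": 1e-8})
    return np.exp(result.x), -result.fun
```

Because the likelihood surface of a four-parameter model can be flat, trying several starting points and comparing the attained log-likelihoods is a sensible precaution.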

Based on equation (21), the components of the unit score function

𝕌 (𝜽) = (U_{a}, U_{b}, U_{c}, U_{λ}) = (\frac{\partial ℓ (𝜽; x)}{\partial a}, \frac{\partial ℓ (𝜽; x)}{\partial b}, \frac{\partial ℓ (𝜽; x)}{\partial c}, \frac{\partial ℓ (𝜽; x)}{\partial λ})

are given by

U_{a} = U_{a} (𝜽) = - ψ (a) + ψ (a + b) + c \log \{1 - e^{- λ x} [1 + \frac{λ x}{1 + λ}]\},

U_{b} = U_{b} (𝜽) = - ψ (b) + ψ (a + b) + \log \{1 - {\{1 - e^{- λ x} [1 + \frac{λ x}{1 + λ}]\}}^{c}\},

U_{c} = U_{c} (𝜽) = \frac{1}{c} + a \log [1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})] - \frac{(b - 1) {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{c} \log [1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}{1 - {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{c}}

and

U_{λ} = U_{λ} (𝜽) = \frac{2}{λ} - \frac{1}{1 + λ} - x + \frac{(a c - 1) \{x e^{- λ x} (1 + \frac{λ x}{1 + λ}) - e^{- λ x} [\frac{x}{1 + λ} - \frac{λ x}{{(1 + λ)}^{2}}]\}}{1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})} - \frac{c (b - 1) \{x e^{- λ x} (1 + \frac{λ x}{1 + λ}) - e^{- λ x} [\frac{x}{1 + λ} - \frac{λ x}{{(1 + λ)}^{2}}]\}}{1 - {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{c}} \times {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{c - 1},

where $ψ (\cdot)$ is the digamma function.

Although these equations cannot be solved analytically, a numerical solution can be determined by using computing packages. Iterative techniques such as Newton-Raphson type algorithms can be adopted to obtain the MLEs.

For interval estimation and hypothesis tests on the model parameters, we require the observed information matrix. The $4 \times 4$ unit observed information matrix,

J = J (𝜽) \equiv (\begin{matrix} J_{a a} & J_{a b} & J_{a c} & J_{a λ} \\ J_{b a} & J_{b b} & J_{b c} & J_{b λ} \\ J_{c a} & J_{c b} & J_{c c} & J_{c λ} \\ J_{λ a} & J_{λ b} & J_{λ c} & J_{λ λ} \end{matrix}),

where $J_{r s} = - \partial^{2} ℓ (𝜽; x) / \partial θ_{r} \partial θ_{s}$ , is given in Appendix E. Likelihood ratio tests can be performed for the new distribution in the usual way.

LEAST SQUARE ESTIMATION

An alternative estimation to the maximum likelihood method is the least square estimation discussed by Ashour and Eltehiwy (2015). For the GBGL model, the least square estimates (LSEs), $\hat{a}$ , $\hat{b}$ , $\hat{c}$ and $\hat{λ}$ of $a, b, c$ and $λ$ are defined as those arguments that minimize the objective function:

Q (a, b, c, λ) = \sum_{i = 1}^{n} {\{I ({[1 - e^{- λ x_{(i)}} (1 + \frac{λ x_{(i)}}{1 + λ})]}^{c}; a, b) - \frac{i}{n + 1}\}}^{2},

where $x_{(i)}$ is an observed value of the $i$ th order statistic of a random sample of size $n$ from $X \sim GBGL (a, b, c, λ)$ .
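The objective function can be sketched in a few lines: the fitted cdf is evaluated at the sorted sample and compared with the plotting positions $i / (n + 1)$ . The function names are illustrative and SciPy is assumed to be available.

```python
import numpy as np
from scipy.special import betainc


def gbgl_cdf(x, a, b, c, lam):
    """GBGL cdf, equation (8)."""
    G = 1.0 - np.exp(-lam * x) * (1.0 + lam * x / (1.0 + lam))
    return betainc(a, b, G**c)


def lse_objective(theta, x):
    """Sum of squared differences between F(x_(i)) and i/(n + 1)."""
    a, b, c, lam = theta
    xs = np.sort(np.asarray(x, dtype=float))  # order statistics
    n = xs.size
    positions = np.arange(1, n + 1) / (n + 1.0)
    return float(np.sum((gbgl_cdf(xs, a, b, c, lam) - positions) ** 2))
```

The LSEs could then be obtained by passing `lse_objective` to a general-purpose minimizer such as `scipy.optimize.minimize`, with the parameters constrained to be positive.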

The minimum point $(\hat{a}, \hat{b}, \hat{c}, \hat{λ})$ can also be given as a solution of the following system of non-linear equations:

\frac{\partial Q (a, b, c, λ)}{\partial a} = \frac{\partial Q (a, b, c, λ)}{\partial b} = \frac{\partial Q (a, b, c, λ)}{\partial c} = \frac{\partial Q (a, b, c, λ)}{\partial λ} = 0,

where, writing $G_{(i)} = G (x_{(i)}; λ) = 1 - e^{- λ x_{(i)}} (1 + \frac{λ x_{(i)}}{1 + λ})$ for the Lindley cdf (2) evaluated at $x_{(i)}$ , the $i$ th components in the sums are

\frac{\partial Q_{i} (a, b, c, λ)}{\partial a} = 2 \{I (G_{(i)}^{c}; a, b) - \frac{i}{n + 1}\} \times \{- \frac{B^{(a)} (a, b)}{B (a, b)} I (G_{(i)}^{c}; a, b) + \frac{1}{B (a, b)} \int_{0}^{G_{(i)}^{c}} \log (ω) ω^{a - 1} {(1 - ω)}^{b - 1} d ω\},

\frac{\partial Q_{i} (a, b, c, λ)}{\partial b} = 2 \{I (G_{(i)}^{c}; a, b) - \frac{i}{n + 1}\} \times \{- \frac{B^{(b)} (a, b)}{B (a, b)} I (G_{(i)}^{c}; a, b) + \frac{1}{B (a, b)} \int_{0}^{G_{(i)}^{c}} \log (1 - ω) ω^{a - 1} {(1 - ω)}^{b - 1} d ω\},

\frac{\partial Q_{i} (a, b, c, λ)}{\partial c} = 2 \{I (G_{(i)}^{c}; a, b) - \frac{i}{n + 1}\} \times \frac{1}{B (a, b)} G_{(i)}^{c (a - 1)} {(1 - G_{(i)}^{c})}^{b - 1} G_{(i)}^{c} \log G_{(i)}

and

\frac{\partial Q_{i} (a, b, c, λ)}{\partial λ} = 2 \{I (G_{(i)}^{c}; a, b) - \frac{i}{n + 1}\} \times \frac{c}{B (a, b)} G_{(i)}^{a c - 1} {(1 - G_{(i)}^{c})}^{b - 1} \{x_{(i)} e^{- λ x_{(i)}} [1 + \frac{λ x_{(i)}}{1 + λ} - \frac{1}{1 + λ}] + \frac{λ x_{(i)} e^{- λ x_{(i)}}}{{(1 + λ)}^{2}}\} .

Here, $B^{(a)} (a, b) = \partial B (a, b) / \partial a = B (a, b) [ψ (a) - ψ (a + b)]$ , $B^{(b)} (a, b) = \partial B (a, b) / \partial b = B (a, b) [ψ (b) - ψ (a + b)]$ and (obtained by Mathematica)

\int_{0}^{G_{(i)}^{c}} \log (ω) ω^{a - 1} {(1 - ω)}^{b - 1} d ω = - G_{(i)}^{a c} Γ {(a)}^{2} H_{q}^{p} (\{a, a, 1 - b\}, \{1 + a, 1 + a\}, G_{(i)}^{c}) + c B (a, b) I (G_{(i)}^{c}; a, b) \log G_{(i)},

where $H_{q}^{p} (\{\cdot, \cdot, \cdot\}, \{\cdot, \cdot\}, \cdot)$ represents the regularized generalized hypergeometric function.

NUMERICAL RESULTS

SIMULATION STUDY

We perform a Monte Carlo simulation study (with 1,000 replications) to quantify some asymptotic properties of both MLEs and LSEs of GBGL parameters. We also measure both the effects of the MLEs and LSEs for the additional parameters, ${(\hat{a}, \hat{b}, \hat{c})}^{⊤}$ , over the corresponding estimators of the baseline parameter, $\hat{λ}$ , and reciprocally.

To that end, we consider $λ \in \{0.5, 1, 2, 3, 4\}$ , $a = b = c \in \{2, 5\}$ and sample sizes $n \in \{50, 100, 150\}$ . As figures of merit, we consider the average estimates obtained from the MLEs and LSEs and their mean square errors (MSEs). The simulation results are given in Tables I and II.

As expected, the MSEs and biases for the two proposed procedures tend to decrease when the sample size increases. Additionally, increasing the additional parameters implies that the MLE and LSE of $λ$ will have smaller MSEs and biases. Real scenarios having higher additional parameters will lead to more biased MLEs. Moreover, for approximately 83% of cases, MLEs outperform LSEs in terms of MSEs.

TABLE I
Simulation results for MLEs.

TABLE II
Simulation results for LSEs.

APPLICATIONS TO REAL DATA

In this section, we perform two applications to real data sets. First, we consider data obtained from accelerated life testing of 40 items with a change in stress from 100 to 150 at a given time instant (Murthy et al. 2004, p. 236, Dataset 12.2). In this first study, we aim to compare the Lindley and GBGL models and, to that end, we use the likelihood ratio statistic to test the hypothesis $H_{0} : a = b = c = 1$ , under which the GBGL model reduces to the Lindley model. Table III and Figure 4 display the main results. The baseline and proposed models are statistically distinct at any nominal level higher than $4 \%$ . The fits to both the empirical density and the empirical cumulative distribution function confirm that our model describes the data better than the Lindley model.
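The likelihood ratio statistic compares the maximized log-likelihoods of the two nested models. The sketch below assumes the usual chi-square approximation with three degrees of freedom, one for each restriction in $a = b = c = 1$ , and uses `scipy.stats.chi2`. The function name is hypothetical, and the log-likelihood values are inputs supplied by the user, not figures taken from Table III.

```python
from scipy.stats import chi2


def likelihood_ratio_test(loglik_null, loglik_alt, df):
    """Return the LR statistic 2*(l_alt - l_null) and its chi-square p-value."""
    statistic = 2.0 * (loglik_alt - loglik_null)
    p_value = chi2.sf(statistic, df)  # upper tail probability
    return statistic, p_value
```

A call such as `likelihood_ratio_test(ll_lindley, ll_gbgl, df=3)` would return the statistic and the p-value; the null model is rejected at a given nominal level when the p-value falls below it.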

TABLE III
MLEs of fitted models to Stress data and likelihood ratio statistics.

Second, we aim to explain remission times (in months) of a random sample of 128 bladder cancer patients (Lee and Wang 2003). To that end, we consider the GBGL distribution, the Lindley baseline, and three other extended Lindley models, namely the LE, GL and TL distributions described in Section 1. Table IV lists the MLEs and their standard errors (SEs) for each fitted model. One can note that all estimates are statistically significant. The plots in Figure 5 display the empirical pdf and cdf and the fitted versions for the three best models according to the subsequent discussion.

Both GBGL and LE models describe well the empirical density of the remission times, but only our proposed model fits well the empirical cdf.

Figure 4: (a) Pdf: GBGL × L; (b) Cdf: GBGL × L.

Figure 5: (a) For pdfs; (b) For cdfs.

TABLE IV
MLEs of the fitted models to the current data.

In order to compare quantitatively the competitive models, we adopt two criteria: the Akaike Information Criterion (AIC) and Kolmogorov-Smirnov (KS) statistic. These statistics are widely used to determine how closely a specific cdf fits the associated empirical distribution for a given data set. The smaller these statistics are, the better the fit is.
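Both criteria are simple to compute once a model has been fitted. The sketch below is generic: `aic` takes a maximized log-likelihood and the number of estimated parameters, and `ks_statistic` takes the data and any fitted cdf as a callable. The function names are illustrative.

```python
import numpy as np


def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2*loglik."""
    return 2.0 * n_params - 2.0 * loglik


def ks_statistic(x, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    F = cdf(xs)
    d_plus = np.max(np.arange(1, n + 1) / n - F)   # empirical cdf above the fit
    d_minus = np.max(F - np.arange(0, n) / n)      # empirical cdf below the fit
    return float(max(d_plus, d_minus))
```

For the GBGL model the number of parameters is four, so the penalty term in the AIC equals eight.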

Table V presents the values of these statistics for the fitted models. The GBGL model provides the best fit to these data among the current models. Thus, our proposal can be a competitive distribution compared with other Lindley-type models: L, L exponential (Bhati and Malik 2015) and GL.

TABLE V
Goodness-of-fit measures.

CONCLUSIONS

In this paper, we propose a new four-parameter distribution called the generalized beta-generated Lindley (GBGL) model. Some of its structural properties (such as the moments and generating function) have been derived from a linear representation for the GBGL density function. We propose a procedure for determining the maximum likelihood estimates (MLEs) of the model parameters. A simulation study is performed to validate the MLEs. We have also indicated an additional estimation process based on the least square method between percentiles. Finally, two applications to real data sets provide evidence that the proposed model can be better than the Lindley model and some of its extensions, namely the exponentiated Lindley and generalized Lindley distributions.

ACKNOWLEDGMENTS

The authors also acknowledge partial support from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil.

REFERENCES

AAS K AND HAFF IH. 2006. The generalized hyperbolic skew Student’s t-distribution. J Financial Econom 4: 275-309.
ALEXANDER C, CORDEIRO GM, ORTEGA EMM AND SARABIA JM. 2012. Generalized beta-generated distributions. Comput Stat Data Anal 56: 1880-1897.
ASHOUR SK AND ELTEHIWY MA. 2015. Exponentiated Power Lindley distribution. J Adv Res 6: 895-905.
BHATI D AND MALIK MA. 2015. On Lindley-exponential distribution: properties and application. Metron 73: 335-357.
BOWLEY AL. 1920. Elements of statistics. Scribner’s sons, New York.
CORDEIRO GM AND LEMONTE AJ. 2011. The β-Birnbaum–Saunders distribution: An improved distribution for fatigue life modeling. Comput Stat Data Anal 55: 1445-1461.
EUGENE N, LEE C AND FAMOYE F. 2002. Beta-normal distribution and its applications. Commun Stat Theory Methods 31: 497-512.
GHITANY M, ATIEH B AND NADARAJAH S. 2008. Lindley distribution and its applications. Math Comput Simul 78: 493-506.
GRADSHTEYN IS AND RYZHIK IM. 2000. Table of Integrals, Series, and Products. Academic Press, San Diego.
HANSEN BE. 1994. Autoregressive conditional density estimation. Int Econom Rev 35: 705-730.
JODRÁ P. 2010. Computer generation of random variables with Lindley or Poisson-Lindley distribution via the Lambert W function. Math Comput Simul 81: 851-859.
JOHNSON NL, KOTZ S AND BALAKRISHNAN N. 1994. Continuous Univariate Distributions I. Wiley, New York.
JONES MC. 2004. Families of distributions arising from distributions of order statistics. Test 13: 1-43.
LEE ET AND WANG JW. 2003. Statistical methods for survival data analysis. J Wiley & Sons, New Jersey.
LINDLEY D. 1958. Fiducial distributions and Bayes’ theorem. J R Stat Soc Series B Stat Methodol 20: 102-107.
MANSOUR M AND MOHAMED S. 2015. A New Generalized of Transmuted Lindley Distribution. Appl Math Sci 55: 2729-2748.
MOORS JJA. 1984. A Quantile Alternative for Kurtosis. J R Stat Soc Ser D 37: 25-32.
MURTHY DNP, XIE M AND JIANG R. 2004. Weibull Models. J Wiley & Sons, New Jersey.
NADARAJAH S, BAKOUCH HS AND TAHMASBI R. 2012. A generalized Lindley distribution. Sankhya Ser B 73: 331-359.
PARARAI M, OLUYEDE BO AND WARAHENA-LIYANAGE G. 2015. Kumaraswamy Lindley-Poisson distribution: theory and applications. Asian J Math Appl 2015: 1-30.
SANKARAN M. 2015. The discrete Poisson-Lindley distribution. Biometrics 26: 145-149.
ZHU D AND GALBRAITH JW. 2010. A generalized asymmetric Student-t distribution with application to financial econometrics. J Econom 157: 297-305.

Appendix A: Proof of Theorem 1

In this section, we prove that both the GBGL density and cdf, say $f (x)$ and $F (x)$ , respectively, can be represented as linear combinations of GL densities and cdfs.

From equation (7), we have

f (x) = \frac{c λ^{2} (1 + x)}{(1 + λ) B (a, b)} e^{- λ x} {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{ac - 1} A (x; λ, b, c),

where

A (x; λ, b, c) = {\{1 - {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{c}\}}^{b - 1} .

Using the binomial power series, we obtain

A (x; λ, b, c) = \sum_{n = 0}^{\infty} {(- 1)}^{n} \binom{b - 1}{n} {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{n c} .

Then, we can write

f (x; λ, a, b, c) = \frac{c λ^{2} (1 + x)}{(1 + λ) B (a, b)} e^{- λ x} {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{a c - 1} \sum_{n = 0}^{\infty} {(- 1)}^{n} \binom{b - 1}{n} {[1 - e^{- λ x} (1 + \frac{λ x}{1 + λ})]}^{n c}

= \sum_{n = 0}^{\infty} ζ_{n} g_{n} (x),

where

ζ_{n} = \frac{{(- 1)}^{n}}{(a + n) B (a, b)} \binom{b - 1}{n}

and $g_{n} (x)$ denotes the GL density with parameters $λ$ and $(a + n) c$ .

Thus, the corresponding cdf is given by

F (x; λ, a, b, c) = \sum_{n = 0}^{\infty} ζ_{n} G_{n} (x) .
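The linear representation above can be checked numerically. The following is a minimal sketch, assuming a positive integer $b$ so that the binomial coefficient vanishes for $n > b - 1$ and the series is finite; the function names are illustrative and do not come from the paper.

```python
import math

def lindley_pdf(x, lam):
    # Lindley density: lam^2 (1 + x) exp(-lam x) / (1 + lam)
    return lam**2 / (1.0 + lam) * (1.0 + x) * math.exp(-lam * x)

def lindley_cdf(x, lam):
    # Lindley cdf: 1 - exp(-lam x) (1 + lam x / (1 + lam))
    return 1.0 - math.exp(-lam * x) * (1.0 + lam * x / (1.0 + lam))

def beta_fn(a, b):
    # Beta function B(a, b) via log-gamma
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def gbgl_pdf(x, lam, a, b, c):
    # GBGL density written directly from the closed form in Appendix A
    G = lindley_cdf(x, lam)
    return c * lindley_pdf(x, lam) * G**(a * c - 1) * (1 - G**c)**(b - 1) / beta_fn(a, b)

def gl_pdf(x, lam, alpha):
    # GL density with power parameter alpha
    return alpha * lindley_pdf(x, lam) * lindley_cdf(x, lam)**(alpha - 1)

def gbgl_pdf_series(x, lam, a, b, c):
    # Sum of zeta_n * g_n(x); b is assumed to be a positive integer here
    total = 0.0
    for n in range(b):
        zeta = (-1)**n * math.comb(b - 1, n) / ((a + n) * beta_fn(a, b))
        total += zeta * gl_pdf(x, lam, (a + n) * c)
    return total
```

For instance, `gbgl_pdf(1.3, 0.8, 1.5, 4, 2.2)` and `gbgl_pdf_series(1.3, 0.8, 1.5, 4, 2.2)` should agree up to rounding error. For non-integer $b$ the series is infinite and would need truncation.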

Appendix B: Proof of Theorem 2

Let

J (x; c, λ) = \int_{0}^{x} (1 + s) e^{- λ s} {[1 - \frac{1 + λ + λ s}{1 + λ} e^{- λ s}]}^{c - 1} ds .

Using the power series expansion, we obtain

J (x; c, λ) = \sum_{i = 0}^{\infty} \frac{{(- 1)}^{i} \binom{c - 1}{i}}{{(1 + λ)}^{i}} \int_{0}^{x} (1 + s) {(1 + λ + λ s)}^{i} \exp (- λ i s - λ s) ds

= \sum_{i = 0}^{\infty} \frac{{(- 1)}^{i} \binom{c - 1}{i}}{{(1 + λ)}^{i}} \sum_{j = 0}^{i} \binom{i}{j} λ^{j} \int_{0}^{x} {(1 + s)}^{j + 1} \exp [- λ s (1 + i)] ds

= \sum_{i = 0}^{\infty} \sum_{j = 0}^{i} \sum_{k = 0}^{j + 1} \frac{{(- 1)}^{i} λ^{j} \binom{c - 1}{i} \binom{i}{j} \binom{j + 1}{k}}{{(1 + λ)}^{i}} \int_{0}^{x} s^{(k + 1) - 1} \exp [- (1 + i) λ s] ds .

Further, since $\int_{0}^{x} s^{k} \exp [- (1 + i) λ s] ds = k! {[(1 + i) λ]}^{- (k + 1)} H_{i, k} (x)$ , we have

F (x) = \frac{c λ^{2}}{1 + λ} J (x; c, λ) = \sum_{i = 0}^{\infty} \sum_{j = 0}^{i} \sum_{k = 0}^{j + 1} \frac{c {(- 1)}^{i} λ^{j - k + 1} k!}{{(1 + λ)}^{i + 1} {(1 + i)}^{k + 1}} \binom{c - 1}{i} \binom{i}{j} \binom{j + 1}{k} H_{i, k} (x),

where $H_{i, k} (x)$ denotes the gamma cdf with shape parameter $(k + 1)$ and rate parameter $(i + 1) λ$ .

We can replace $\sum_{j = 0}^{i} \sum_{k = 0}^{j + 1}$ by $\sum_{k = 0}^{i + 1} \sum_{j = δ_{k}}^{i}$ , where $δ_{k} = 0$ for $k = 0, 1$ and $δ_{k} = k - 1$ for $k \geq 2$ , which is easily verified by plotting the index pairs $(k, j)$ in the plane. Then, we have

F (x) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} \sum_{j = δ_{k}}^{i} \frac{c {(- 1)}^{i} λ^{j - k + 1} k!}{{(1 + λ)}^{i + 1} {(1 + i)}^{k + 1}} \binom{c - 1}{i} \binom{i}{j} \binom{j + 1}{k} H_{i, k} (x)

and rearranging terms, we obtain

F (x) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} \frac{c {(- 1)}^{i} v_{i, k}}{{(1 + λ)}^{i + 1} {(1 + i)}^{k + 1}} \binom{c - 1}{i} H_{i, k} (x),

where

v_{i, k} = \sum_{j = δ_{k}}^{i} λ^{j - k + 1} k! \binom{i}{j} \binom{j + 1}{k} .

Setting

w_{i, k} = \frac{c {(- 1)}^{i} v_{i, k}}{{(1 + λ)}^{i + 1} {(1 + i)}^{k + 1}} \binom{c - 1}{i},

the new cdf follows as a double linear combination of gamma cdfs

F (x) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} w_{i, k} H_{i, k} (x) .

By differentiating the last equation, we obtain

f (x) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} w_{i, k} h_{i, k} (x),

where $h_{i, k} (x)$ denotes the gamma density with parameters $(k + 1)$ and $(i + 1) λ$ .
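The gamma-mixture form of the GL cdf can be verified numerically. The sketch below assumes a positive integer $c$, so that the sum over $i$ stops at $c - 1$, and takes the weights obtained by carrying the prefactor $c λ^{2} / (1 + λ)$ through the expansion of $J (x; c, λ)$, that is, $w_{i, k} = c {(- 1)}^{i} v_{i, k} \binom{c - 1}{i} / [{(1 + λ)}^{i + 1} {(1 + i)}^{k + 1}]$ . The function names are illustrative only.

```python
import math

def erlang_cdf(x, shape, rate):
    # Gamma cdf for integer shape: 1 - exp(-rate x) * sum_{m < shape} (rate x)^m / m!
    partial = sum((rate * x)**m / math.factorial(m) for m in range(shape))
    return 1.0 - math.exp(-rate * x) * partial

def v_coef(i, k, lam):
    # v_{i,k} = sum_{j = delta_k}^{i} lam^(j-k+1) k! C(i, j) C(j+1, k), delta_k = max(0, k-1)
    start = max(0, k - 1)
    return sum(lam**(j - k + 1) * math.factorial(k) * math.comb(i, j) * math.comb(j + 1, k)
               for j in range(start, i + 1))

def w_coef(i, k, lam, c):
    # Mixture weight under the assumption stated in the lead-in (c a positive integer)
    return (c * (-1)**i * v_coef(i, k, lam) * math.comb(c - 1, i)
            / ((1 + lam)**(i + 1) * (1 + i)**(k + 1)))

def gl_cdf_mixture(x, lam, c):
    # Double linear combination of gamma cdfs with shape k+1 and rate (i+1) lam
    return sum(w_coef(i, k, lam, c) * erlang_cdf(x, k + 1, (i + 1) * lam)
               for i in range(c) for k in range(i + 2))

def gl_cdf(x, lam, c):
    # Closed-form GL cdf: Lindley cdf raised to the power c
    G = 1.0 - math.exp(-lam * x) * (1.0 + lam * x / (1.0 + lam))
    return G**c
```

With $c = 1$ only the terms $i = 0$, $k \in \{0, 1\}$ survive and the weights reduce to $λ / (1 + λ)$ and $1 / (1 + λ)$ , the familiar exponential and gamma mixture weights of the Lindley distribution.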

Appendix C: Proof of Theorem 3

We can write from equations (10) and (12)

f (x) = \sum_{l, i = 0}^{\infty} \sum_{k = 0}^{i + 1} ζ_{l} w_{i, k} (l) h_{i, k} (x) = \sum_{i = 0}^{\infty} \sum_{k = 0}^{i + 1} τ_{i, k} h_{i, k} (x),

where

τ_{i, k} = \sum_{l = 0}^{\infty} ζ_{l} w_{i, k} (l), w_{i, k} (l) = \frac{(a + l) c {(- 1)}^{i} v_{i, k}}{{(1 + λ)}^{i + 1} {(1 + i)}^{k + 1}} \binom{(a + l) c - 1}{i}

and $v_{i, k}$ is defined in Appendix B.

Appendix D: Quantile function

We derive a power series for $Q_{G L} (u)$ in three steps. First, we use a power series for $Q^{- 1} (a, 1 - u)$ . Second, we obtain a power series for the argument $1 - \exp [- Q^{- 1} (a, 1 - u)]$ . Third, we derive a power series for the L qf using the Lagrange theorem in order to obtain a power series for $Q_{G L} (u)$ .

We introduce the following quantities defined by Cordeiro and Lemonte (2011). Let $Q^{- 1} (a, z)$ be the inverse function of

Q (a, z) = 1 - \frac{γ (a, z)}{Γ (a)} = \frac{Γ (a, z)}{Γ (a)} = u .

A power series for $Q^{- 1} (a, 1 - u)$ is given on the Wolfram Functions website (http://functions.wolfram.com/GammaBetaErf/InverseGammaRegularized/06/01/03/) as

Q^{- 1} (a, 1 - u) = w + \frac{w^{2}}{a + 1} + \frac{(3 a + 5) w^{3}}{2 {(a + 1)}^{2} (a + 2)} + \frac{[a (8 a + 33) + 31] w^{4}}{3 {(a + 1)}^{3} (a + 2) (a + 3)}

+ \frac{\{a (a [a (125 a + 1179) + 3971] + 5661) + 2888\} w^{5}}{24 {(a + 1)}^{4} {(a + 2)}^{2} (a + 3) (a + 4)} + O (w^{6}),

where $w = {[u Γ (a + 1)]}^{1 / a}$ . We can write the last equation as

z = Q^{- 1} (a, 1 - u) = \sum_{r = 0}^{\infty} δ_{r} u^{r / a},

where $δ_{i} = {\bar{b}}_{i} Γ {(a + 1)}^{i / a}$ . Here, ${\bar{b}}_{0} = 0$ , ${\bar{b}}_{1} = 1$ and any coefficient ${\bar{b}}_{i + 1}$ (for $i \geq 1$ ) can be obtained from the cubic recurrence equation

{\bar{b}}_{i + 1} = \frac{1}{i (a + i)} \{\sum_{r = 1}^{i} \sum_{s = 1}^{i - r + 1} {\bar{b}}_{r} {\bar{b}}_{s} {\bar{b}}_{i - r - s + 2} s (i - r - s + 2)

- \sum_{r = 2}^{i} {\bar{b}}_{r} {\bar{b}}_{i - r + 2} r [r - a - (1 - a) (i + 2 - r)]\} .
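The recurrence can be evaluated in exact rational arithmetic to confirm the leading coefficients. The sketch below assumes the recurrence is a double sum over $r$ and $s$ (with $s$ running up to $i - r + 1$ ) minus a single sum over $r$ ; this reading reproduces the series coefficients $1 / (a + 1)$ and $(3 a + 5) / [2 {(a + 1)}^{2} (a + 2)]$ quoted above. The function name is illustrative.

```python
from fractions import Fraction

def b_bar(a, n_max):
    """Return [b_0, ..., b_{n_max}] from the cubic recurrence, as exact fractions."""
    a = Fraction(a)
    b = {0: Fraction(0), 1: Fraction(1)}
    for i in range(1, n_max):
        # Double sum: r = 1..i, s = 1..i-r+1
        double = sum(b[r] * b[s] * b[i - r - s + 2] * s * (i - r - s + 2)
                     for r in range(1, i + 1)
                     for s in range(1, i - r + 2))
        # Single sum: r = 2..i (empty when i = 1)
        single = sum(b[r] * b[i - r + 2] * r * (r - a - (1 - a) * (i + 2 - r))
                     for r in range(2, i + 1))
        b[i + 1] = (double - single) / (i * (a + i))
    return [b[k] for k in range(n_max + 1)]
```

For $a = 1$ the function $Q^{- 1} (1, 1 - u)$ equals $- \log (1 - u)$ , so the coefficients should be $1 / k$ , which gives a convenient sanity check.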

We have ${\bar{b}}_{2} = 1 / (a + 1)$ , ${\bar{b}}_{3} = (3 a + 5) / [2 {(a + 1)}^{2} (a + 2)]$ , etc. Next, we present some algebraic details for the GL qf, say $Q_{G L} (u)$ . The cdf of $X$ is given by (8). By inverting $F (x) = u$ , we obtain (9). We can determine the L qf using the Lagrange theorem. We consider that the power series expansion holds

x = G (u) = x_{0} + \sum_{k = 1}^{\infty} f_{k} {(u - u_{0})}^{k}, f_{1} = G^{'} (u) \neq 0,

where $G (u)$ is analytic at a point $u_{0}$ that gives a simple $x_{0}$ -point.

Then, the inverse function $G^{- 1} (x)$ exists and is single-valued in the neighborhood of the point $x = x_{0}$ . The inverse power series $x = Q_{L} (u)$ is given by

x = Q_{L} (u) = u_{0} + \sum_{k = 1}^{\infty} π_{k} {(u - u_{0})}^{k},

where

π_{k} = \frac{1}{k!} \left. \frac{d^{k - 1}}{{dx}^{k - 1}} {[ψ (x)]}^{k} \right|_{x = x_{0}} \quad \text{and} \quad ψ (x) = \frac{x - x_{0}}{G (x) - u_{0}} .

Then, we can expand the Lindley cdf as

G (x) = 1 - (1 + \frac{λ x}{1 + λ}) e^{- λ x} = u_{0} + x \sum_{i = 0}^{\infty} f_{i} x^{i},

where $u_{0} = 1$ and $f_{i} = {(- λ)}^{i + 1} [\frac{1}{(i + 1)!} - \frac{1}{(1 + λ) i!}]$ for $i \geq 0$ .

Further, we have

ψ (x) = \frac{x - x_{0}}{G (x) - u_{0}} = \frac{1}{\sum_{i = 0}^{\infty} f_{i} x^{i}} = \frac{1}{λ (- 1 + \frac{1}{1 + λ})} \sum_{i = 0}^{\infty} ϱ_{i} x^{i} = (\frac{1 + λ}{λ^{2}}) \sum_{i = 0}^{\infty} {\bar{ϱ}}_{i} x^{i},

(22)

where ${\bar{ϱ}}_{0} = - 1$ , ${\bar{ϱ}}_{i} = - ϱ_{i}$ , $ϱ_{0} = 1$ and $ϱ_{i} = - \frac{1}{f_{0}} \sum_{j = 1}^{i} f_{j} ϱ_{i - j}$ for $i \geq 1$ .

Thus, we obtain from equation (22)

{\frac{d^{k - 1}}{{dx}^{k - 1}} {{[ψ (x)]}^{k}} |}_{x = x_{0}} = \frac{ν_{k, k - 1} {(1 + λ)}^{k} (k - 1)!}{λ^{2 k}},

(23)

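where $ν_{k, i}$ is the coefficient of $x^{i}$ in ${(\sum_{m = 0}^{\infty} {\bar{ϱ}}_{m} x^{m})}^{k}$ , given by $ν_{k, i} = {(i {\bar{ϱ}}_{0})}^{- 1} \sum_{m = 1}^{i} [m (k + 1) - i] {\bar{ϱ}}_{m} ν_{k, i - m}$ for $i \geq 1$ and $ν_{k, 0} = {\bar{ϱ}}_{0}^{k} = {(- 1)}^{k}$ (Gradshteyn and Ryzhik 2000).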

From equations (22) and (23), the quantity $π_{k}$ is given by

π_{k} = {\frac{1}{k!} \frac{d^{k - 1}}{{dx}^{k - 1}} {{[ψ (x)]}^{k}} |}_{x = x_{0}} = \frac{ν_{k, k - 1} {(1 + λ)}^{k}}{k λ^{2 k}} .

Hence, the Lindley qf reduces to

Q_{L} (u) = \sum_{k = 1}^{\infty} \frac{ν_{k, k - 1} {(1 + λ)}^{k}}{k λ^{2 k}} {(u - 1)}^{k} .

An alternative expression for $Q_{L} (u)$ , obtained by expanding ${(u - 1)}^{k}$ with the binomial theorem, is given by

Q_{L} (u) = \sum_{n = 0}^{\infty} t_{n} u^{n},

where $t_{n} = \sum_{k = \max (n, 1)}^{\infty} {(- 1)}^{k - n} \binom{k}{n} π_{k}$ .
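Any truncated power series for the L qf can be cross-checked against a direct numerical inversion of the Lindley cdf. The sketch below uses plain bisection, which is an illustrative substitute and not the series method of this appendix; the function names and the tolerance are assumptions.

```python
import math

def lindley_cdf(x, lam):
    # Lindley cdf: 1 - exp(-lam x) (1 + lam x / (1 + lam))
    return 1.0 - math.exp(-lam * x) * (1.0 + lam * x / (1.0 + lam))

def lindley_quantile(u, lam, tol=1e-12):
    """Solve lindley_cdf(x, lam) = u for x >= 0 by bisection."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    lo, hi = 0.0, 1.0
    # Expand the bracket until the cdf at hi reaches u
    while lindley_cdf(hi, lam) < u:
        hi *= 2.0
    # Halve the bracket until it is narrower than the tolerance
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if lindley_cdf(mid, lam) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A round trip such as `lindley_cdf(lindley_quantile(0.5, 1.5), 1.5)` should return a value very close to 0.5.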

Thus, since the GBGL qf is the L qf evaluated at ${[Q_{β (a, b)} (u)]}^{1 / c}$ , we obtain

Q_{GBGL} (u) = \sum_{i = 0}^{\infty} t_{i} {[Q_{β (a, b)} (u)]}^{i / c},

where $Q_{β (a, b)} (u)$ denotes the beta qf with parameters $a$ and $b$ , and the coefficients $t_{i}$ are defined above in terms of $π_{k}$ , $ν_{k, i}$ , ${\bar{ϱ}}_{i}$ and $f_{i}$ .

The beta qf reduces to

Q_{β (a, b)} (u) = \sum_{j = 0}^{\infty} {\bar{θ}}_{j} u^{j / a},

where ${\bar{θ}}_{j} = θ_{j} {[a B (a, b)]}^{j / a}$ , so that $Q_{β (a, b)} (u) = \sum_{j = 0}^{\infty} θ_{j} v^{j}$ in the transformed variable $v = {[a B (a, b) u]}^{1 / a}$ ,

θ_{j} = \begin{cases} 0, & \text{if } j = 0, \\ 1, & \text{if } j = 1, \\ γ_{j}, & \text{if } j \geq 2, \end{cases}

and

γ_{j} = \frac{1}{[j^{2} + (a - 2) j + 1 - a]} \{(1 - δ_{j, 2}) \sum_{r = 2}^{j - 1} γ_{r} γ_{j + 1 - r} [r (1 - a) (j - r) - r (r - 1)]

+ \sum_{r = 1}^{j - 1} \sum_{s = 1}^{j - r} γ_{r} γ_{s} γ_{j + 1 - r - s} [r (r - a) + s (a + b - 2) (j + 1 - r - s)]\},

where $γ_{1} = 1$ , $δ_{j, 2} = 1$ if $j = 2$ and $δ_{j, 2} = 0$ if $j \neq 2$ . The first quantities are $γ_{2} = \frac{b - 1}{a + 1}$ , $γ_{3} = \frac{(b - 1) (a^{2} + 3 a b - a + 5 b - 4)}{2 {(a + 1)}^{2} (a + 2)}$ , $γ_{4} = (b - 1) [a^{4} + (6 b - 1) a^{3} + (b + 2) (8 b - 5) a^{2} + (33 b^{2} - 30 b + 4) a + b (31 b - 47) + 18] / [3 {(a + 1)}^{3} (a + 2) (a + 3)], \dots$
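The coefficients $γ_{j}$ can be generated in exact arithmetic from the recurrence. The sketch below assumes $γ_{1} = 1$ as the starting value and reads every index in the recurrence as $j$ ; the function name is illustrative.

```python
from fractions import Fraction

def gamma_coeffs(a, b, n_max):
    """Return {j: gamma_j} for j = 1..n_max from the beta-quantile recurrence."""
    a, b = Fraction(a), Fraction(b)
    g = {1: Fraction(1)}
    for j in range(2, n_max + 1):
        # First sum is switched off for j = 2 by the factor (1 - delta_{j,2})
        first = Fraction(0)
        if j != 2:
            first = sum(g[r] * g[j + 1 - r] * (r * (1 - a) * (j - r) - r * (r - 1))
                        for r in range(2, j))
        # Double sum: r = 1..j-1, s = 1..j-r
        second = sum(g[r] * g[s] * g[j + 1 - r - s]
                     * (r * (r - a) + s * (a + b - 2) * (j + 1 - r - s))
                     for r in range(1, j)
                     for s in range(1, j - r + 1))
        g[j] = (first + second) / (j * j + (a - 2) * j + 1 - a)
    return g
```

For $a = 1$ the beta cdf is $1 - {(1 - x)}^{b}$ , whose inverse has an explicit binomial series, so the output can be compared with $γ_{2} = (b - 1) / 2$ , $γ_{3} = (b - 1) (2 b - 1) / 6$ and $γ_{4} = (b - 1) (2 b - 1) (3 b - 1) / 24$ .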

For $z \in (0, 1)$ and any real non-integer $α$ , we have

z^{α} = \sum_{r = 0}^{\infty} s_{r} (α) z^{r},

where

s_{r} (α) = \sum_{l = r}^{\infty} {(- 1)}^{r + l} \binom{α}{l} \binom{l}{r} .

Finally, using (13), we obtain

Q_{GBGL} (u) = \sum_{j = 0}^{\infty} e_{j} u^{j / a},

where $e_{j} = \sum_{i, r = 0}^{\infty} t_{i} s_{r} (i / c) η_{r, j}$ , $η_{r, j} = {(j {\bar{θ}}_{0})}^{- 1} \sum_{m = 1}^{j} [m (r + 1) - j] {\bar{θ}}_{m} η_{r, j - m}$ and ${\bar{θ}}_{i}$ is given before.

Appendix E: Information Matrix

The elements of the unit observed information matrix $J (𝜽)$ for the parameters $(a, b, c, λ)$ are obtained as second-order partial derivatives of the log-likelihood of a single observation $x$ . To shorten the expressions, let

G = 1 - e^{- λ x} (1 + \frac{λ x}{1 + λ}) \quad \text{and} \quad G_{λ} = \frac{\partial G}{\partial λ} = x e^{- λ x} (1 + \frac{λ x}{1 + λ}) - e^{- λ x} [\frac{x}{1 + λ} - \frac{λ x}{{(1 + λ)}^{2}}],

and let $ψ^{'} (\cdot)$ denote the trigamma function. Then,

J_{a a} = - [ψ^{'} (a) - ψ^{'} (a + b)], J_{a b} = ψ^{'} (a + b),

J_{a c} = \log G,

J_{a λ} = \frac{c G_{λ}}{G},

J_{b b} = - [ψ^{'} (b) - ψ^{'} (a + b)],

J_{b c} = - \frac{G^{c} \log G}{1 - G^{c}},

J_{b λ} = - \frac{c G^{c - 1} G_{λ}}{1 - G^{c}},

J_{c c} = - \frac{1}{c^{2}} - \frac{(b - 1) G^{c} \log^{2} G}{{(1 - G^{c})}^{2}},

J_{c λ} = \frac{a G_{λ}}{G} - \frac{(b - 1) G^{c - 1} G_{λ} (1 - G^{c} + c \log G)}{{(1 - G^{c})}^{2}} .
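Second-derivative expressions of this kind are easy to get wrong, so a finite-difference check is a useful safeguard. The sketch below differentiates the single-observation GBGL log-likelihood numerically and compares the result with closed forms written in terms of the Lindley cdf $G$ and its derivative $G_{λ}$ with respect to $λ$ . The closed forms in the code were derived here by direct differentiation and are assumptions to be tested, the entries involving the trigamma function are omitted because the standard library has no trigamma, and all names and the step size are illustrative.

```python
import math

def G(x, lam):
    # Lindley cdf
    return 1.0 - math.exp(-lam * x) * (1.0 + lam * x / (1.0 + lam))

def G_lam(x, lam):
    # Partial derivative of the Lindley cdf with respect to lam
    e = math.exp(-lam * x)
    return x * e * (1.0 + lam * x / (1.0 + lam)) - e * (x / (1.0 + lam) - lam * x / (1.0 + lam)**2)

def loglik(theta, x):
    # Log-density of one GBGL observation; theta = (a, b, c, lam)
    a, b, c, lam = theta
    g = G(x, lam)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (math.log(c) + 2.0 * math.log(lam) - math.log(1.0 + lam) + math.log(1.0 + x)
            - lam * x - log_beta + (a * c - 1.0) * math.log(g) + (b - 1.0) * math.log(1.0 - g**c))

def hessian_entries(theta, x):
    # Closed-form second derivatives, keyed by parameter index pairs (0=a, 1=b, 2=c, 3=lam)
    a, b, c, lam = theta
    g, gl = G(x, lam), G_lam(x, lam)
    gc, lg = g**c, math.log(g)
    return {
        (0, 2): lg,
        (0, 3): c * gl / g,
        (1, 2): -gc * lg / (1.0 - gc),
        (1, 3): -c * (gc / g) * gl / (1.0 - gc),
        (2, 2): -1.0 / c**2 - (b - 1.0) * gc * lg**2 / (1.0 - gc)**2,
        (2, 3): a * gl / g - (b - 1.0) * (gc / g) * gl * (1.0 - gc + c * lg) / (1.0 - gc)**2,
    }

def fd_second(theta, x, i, j, h=1e-4):
    # Central finite-difference approximation of the (i, j) second derivative
    def shifted(si, sj):
        t = list(theta)
        t[i] += si * h
        t[j] += sj * h
        return loglik(t, x)
    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
```

Comparing `hessian_entries(theta, x)` with `fd_second(theta, x, i, j)` at a few parameter values, for example `theta = (1.5, 2.0, 1.2, 0.8)` and `x = 1.7`, gives a quick consistency check before the expressions are used to compute standard errors.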

Publication Dates

Publication in this collection
Jul-Sep 2017

History

Received
25 July 2016
Accepted
3 Apr 2017

All the contents of this journal, except where otherwise noted, is licensed under a Creative Commons Attribution License.

$𝜽$	$n$	MLE a	MLE b	MLE c	MLE $λ$	MSE a	MSE b	MSE c	MSE $λ$
(2,2,2,0.5)	50	2.018	2.031	2.014	0.502	0.060	0.061	0.045	0.001
	100	2.023	2.002	2.019	0.500	0.029	0.026	0.021	0.000
	150	2.012	2.002	2.010	0.500	0.019	0.019	0.014	0.000
(5,5,5,0.5)	50	5.011	5.042	5.007	0.501	0.177	0.183	0.107	0.000
	100	5.008	5.014	5.006	0.500	0.078	0.077	0.047	0.000
	150	5.006	5.013	5.004	0.500	0.060	0.059	0.036	0.000
(2,2,2,1)	50	2.030	2.023	2.024	1.002	0.061	0.062	0.046	0.003
	100	2.015	2.008	2.013	1.001	0.027	0.026	0.020	0.001
	150	2.014	2.000	2.012	0.999	0.019	0.019	0.014	0.001
(5,5,5,1)	50	5.011	5.040	5.007	1.001	0.171	0.174	0.103	0.001
	100	5.001	5.024	5.000	1.001	0.085	0.088	0.052	0.000
	150	5.010	5.009	5.007	1.000	0.060	0.058	0.036	0.000
(2,2,2,2)	50	2.040	2.015	2.034	2.001	0.060	0.062	0.045	0.014
	100	2.010	2.017	2.008	2.005	0.030	0.030	0.023	0.007
	150	2.002	2.015	2.001	2.006	0.019	0.020	0.014	0.005
(5,5,5,2)	50	5.016	5.027	5.012	2.001	0.172	0.168	0.104	0.002
	100	5.004	5.017	5.002	2.001	0.086	0.088	0.052	0.001
	150	5.002	5.018	5.001	2.001	0.057	0.057	0.034	0.001
(2,2,2,3)	50	2.018	2.032	2.014	3.017	0.061	0.058	0.045	0.034
	100	2.016	2.008	2.013	3.002	0.030	0.027	0.022	0.016
	150	2.018	1.998	2.015	2.996	0.019	0.018	0.014	0.011
(5,5,5,3)	50	4.986	5.066	4.987	3.009	0.171	0.176	0.103	0.006
	100	5.013	5.019	5.009	3.001	0.087	0.087	0.053	0.003
	150	5.000	5.020	5.000	3.003	0.059	0.058	0.035	0.002
(2,2,2,4)	50	2.020	2.027	2.016	4.017	0.056	0.059	0.042	0.062
	100	2.006	2.018	2.004	4.015	0.027	0.029	0.020	0.031
	150	2.009	2.008	2.008	4.005	0.019	0.019	0.014	0.021
(5,5,5,4)	50	5.016	5.044	5.010	4.006	0.169	0.179	0.102	0.011
	100	5.017	5.016	5.012	4.001	0.094	0.090	0.057	0.006
	150	5.019	4.994	5.014	3.997	0.060	0.059	0.036	0.004

$𝜽$	$n$	LSE a	LSE b	LSE c	LSE $λ$	MSE a	MSE b	MSE c	MSE $λ$
(2,2,2,0.5)	50	1.972	1.996	1.969	0.491	0.049	0.011	0.061	0.003
	100	1.992	1.995	1.991	0.498	0.021	0.009	0.026	0.001
	150	1.987	1.982	1.983	0.498	0.017	0.008	0.020	0.001
(5,5,5,0.5)	50	5.011	5.011	5.013	0.501	0.046	0.032	0.063	0.000
	100	5.041	5.036	5.048	0.502	0.060	0.040	0.083	0.000
	150	4.936	4.945	4.923	0.497	0.030	0.021	0.042	0.000
(2,2,2,1)	50	2.002	1.999	2.002	0.994	0.066	0.032	0.081	0.013
	100	1.998	1.998	1.999	1.000	0.034	0.035	0.041	0.008
	150	1.996	2.004	1.997	0.998	0.021	0.008	0.026	0.004
(5,5,5,1)	50	4.998	5.007	4.998	0.999	0.114	0.077	0.156	0.002
	100	5.037	5.026	5.043	1.003	0.116	0.079	0.157	0.001
	150	4.990	4.993	4.988	0.999	0.023	0.014	0.032	0.000
(2,2,2,2)	50	1.986	2.025	2.000	1.980	0.069	0.061	0.082	0.045
	100	2.001	1.993	2.004	2.004	0.031	0.040	0.038	0.024
	150	1.998	2.002	2.002	1.993	0.021	0.021	0.025	0.016
(5,5,5,2)	50	4.950	4.972	4.940	1.989	0.196	0.120	0.274	0.011
	100	4.942	4.955	4.929	1.986	0.130	0.081	0.181	0.008
	150	4.980	4.986	4.975	1.996	0.053	0.030	0.074	0.003
(2,2,2,3)	50	1.993	2.000	2.003	2.993	0.068	0.087	0.077	0.078
	100	2.005	1.968	2.004	3.063	0.037	0.107	0.036	0.122
	150	1.998	1.994	2.002	3.008	0.019	0.043	0.023	0.031
(5,5,5,3)	50	4.893	4.910	4.867	2.960	0.287	0.190	0.403	0.039
	100	5.010	5.014	5.011	2.998	0.132	0.085	0.183	0.016
	150	4.940	4.951	4.927	2.976	0.074	0.051	0.106	0.009
(2,2,2,4)	50	2.004	1.995	2.007	4.018	0.062	0.120	0.074	0.072
	100	1.983	2.011	2.004	4.001	0.039	0.096	0.037	0.109
	150	1.997	1.998	1.999	4.001	0.019	0.040	0.023	0.024
(5,5,5,4)	50	4.949	4.949	4.933	3.967	0.323	0.257	0.454	0.060
	100	5.004	5.012	5.004	3.992	0.169	0.123	0.234	0.033
	150	4.944	4.943	4.930	3.972	0.118	0.089	0.165	0.025

Model	a (SE)	b (SE)	c (SE)	$λ$ (SE)	$H_{0}$ : GBGL $\times$ L ( $a = b = c = 1$ )
GBGL	0.096 (0.014)	0.065 (0.020)	47.284 (1.383)	10.859 (0.384)	8.311 ( $\hat{α} = 0.040$ )
L	-	-	-	0.177 (0.007)	-

Model	MLEs (SEs)
L	$\hat{λ} = 0.19 (0.0123)$
LE	$\hat{α} = 1.55 (0.1647)$ , $\hat{λ} = 0.11 (0.0136)$
GL	$\hat{α} = 0.73 (0.0917)$ , $\hat{λ} = 0.16 (0.0165)$
GBGL	$\hat{λ} = 0.24 (0.0335)$ , $\hat{a} = 0.03 (0.0077)$ , $\hat{b} = 0.31 (0.0807)$ , $\hat{c} = 34.95 (6.8937)$
TL	$\hat{λ} = - 0.38 (0.1494)$ , $\hat{θ} = 0.03 (0.0028)$ , $\hat{δ} = 0.73 (0.1347)$ , $\hat{α} = 0.73 (0.1833)$

Model	AIC (dependent on the pdf)	KS (dependent on the cdf)	KS p-value
L	814.3574	0.1075	0.1163
LE	802.3400	0.0575	0.8105
GL	809.6681	0.1982	$1.273 \times 10^{- 4}$
GBGL	801.9561	0.0282	0.9999
TL	813.6681	0.088731	0.2875
