Acessibilidade / Reportar erro

The Weibull Burr XII distribution in lifetime and income analysis

Abstract

We study a five-parameter model called the Weibull Burr XII (WBXII) distribution, which extends several models, including new ones. This model is quite flexible in terms of the hazard function, which exhibits increasing, decreasing, upside-down bathtub, and bathtub shapes. Its density function allows different forms such as left-skewed, right-skewed, reversed-J, and bimodal. We aim to provide some general mathematical quantities for the proposed distribution, which can be useful to real data analysis. We develop a shiny application to provide interactive illustrations of the WBXII density and hazard functions. We estimate the model parameters using maximum likelihood and derive a profile log-likelihood for all members of the Weibull-G family. The survival analysis application reveals that the WBXII model is suitable to accommodate left-skewed tails, which are very common when the variable of interest is the time to failure of a product. The income application is related to player salaries within a professional sports league and it is peculiar because the mean of the player’s salaries is much higher than for most professions. Both applications illustrate that the new distribution provides much better fits than other models with the same and less number of parameters.

Key words
Bimodal distribution; Burr XII distribution; profile log-likelihood; Weibull-G family

INTRODUCTION

For the Burr XII (BXII) distribution, also known as the Singh-Maddala distribution (Singh & Maddala 1975SINGH SK & MADDALA GS. 1975. A stochastic process for income distribution and tests for income distribution functions. In: ASA Proceedings of the Business and Economic Statistics Section, p. 551-553., 1976SINGH SK & MADDALA GS. 1976. A function for the size distribution of incomes. Econometrica 44: 963-970.) with shape parameters \(d>0\) and \(c>0\) and scale parameter \(s>0\), the cumulative distribution function (cdf) is

\begin{aligned} G(x;c,d,s)= 1-\left[1+\left(\frac{x}{s}\right)^{c}\right]^{-d},\qquad x>0.\end{aligned}(1)
For \(d=1\) and \(s=m^{-1}\), we have the log-logistic (LL) distribution and, for \(c=1\), it reduces to the Lomax distribution. The probability density function (pdf) corresponding to equation (1) is
\[g(x;c,d,s)= c\,d\,s^{-c}\,x^{c-1}\left[1+\left(\frac{x}{s}\right)^{c}\right]^{-d-1}.\](2)
This distribution is part of the Burr system of distributions (Burr 1942BURR IW. 1942. Cumulative frequency functions. Annals of Mathematical Statistics 13: 215-232.) and has extensive use in the context of income data. For recent examples, see Brzeziński 2013BRZEZIŃSKI M. 2013. Parametric Modelling of Income Distribution in Central and Eastern Europe. Central European Journal of Economic Modelling and Econometrics 5: 207-230., Jäntti & Jenkins 2010JÄNTTI M & JENKINS SP. 2010. The impact of macroeconomic conditions on income inequality. J Econ Inequal 8: 221-240., Tanak et al. 2015TANAK AK, BORZADARAN GM & AHMADI J. 2015. Entropy maximization under the constraints on the generalized Gini index and its application in modeling income distributions. Physica A 438: 657-666.. Cirillo 2010CIRILLO P. 2010. An analysis of the size distribution of Italian firms by age. Physica A 389: 459-466. also applied this model for analyzing the size distribution of Italian firms by age. Chotikapanich et al. 2013CHOTIKAPANICH D, GRIFFITHS W & RAO DSP. 2013. Calculating Poverty Measures from the Generalised Beta Income Distribution. The Economic Record 89: 46-66. considered it for calculating poverty measures in countries from South and Southeast Asia. Kumar et al. 2013KUMAR S, CHANDRA N & KHAN M. 2013. On the Reliability Estimation of Burr Type XII Distribution. Safety and Reliability 33: 29-40. adopted the BXII distribution on reliability context.

Bourguignon et al. 2014BOURGUIGNON M, SILVA RB & CORDEIRO GM. 2014. The Weibull-G Family of Probability Distributions. Journal of Data Science 12: 53-68. pioneered a family of univariate distributions generated by extending the Weibull (W) model applied to the odds ratio \(G(x)/[1-G(x)]\). For any baseline cdf \(G(x;\pmb{\xi})\), which depends on a parameter vector \(\pmb{\xi}\), they defined the Weibull-G family (for \(x \in \mathcal{D}\subseteq \mathbb{R}\)) by the pdf and cdf

\[f(x; \alpha,\beta,\pmb{\xi}) = \alpha\,\beta\, g(x;\pmb{\xi}) \frac{G(x;\pmb{\xi})^{\beta-1}}{1-G(x;\pmb{\xi})^{\beta+1}}\exp\left\{-\alpha\left[\frac{G(x;\pmb{\xi})}{1-G(x;\pmb{\xi})}\right]^{\beta}\right\}\](3)
and
\[F(x; \alpha,\beta,\pmb{\xi}) = \int_{0}^{\frac{G(x;\pmb{\xi})}{1-G(x;\pmb{\xi})}}\alpha\,\beta\,t^{\beta-1}\textrm{e}^{-\alpha \, t^{\beta}}\mathrm{d}t = 1 - \exp\left\{-\alpha\left[\frac{G(x;\pmb{\xi})}{1-G(x;\pmb{\xi})}\right]^{\beta}\right\},\](4)
respectively. The Weibull-G family has the same parameters of the G distribution plus two shape parameters \(\alpha>0\) and \(\beta > 0\). According to Bourguignon et al. 2014BOURGUIGNON M, SILVA RB & CORDEIRO GM. 2014. The Weibull-G Family of Probability Distributions. Journal of Data Science 12: 53-68., these additional parameters are sought as a manner to furnish more flexible distributions. If \(\beta=1\), it gives the exponential generator (Gupta et al. 1998GUPTA R, GUPTA P & GUPTA R. 1998. Modeling failure time data by Lehman alternatives. Communications in Statistics - Theory and Methods 27: 887-904. ). Cordeiro et al. 2015CORDEIRO GM, ORTEGA EMM, & RAMIRES TG. 2015. A new generalized Weibull family of distributions: mathematical properties and applications. J Stat Distrib Applic 2: 1-25. and Tahir et al. 2016TAHIR MH, ZUBAIR M, MANSOOR M, CORDEIRO GM, ALIZADEH M & HAMEDANI GG. 2016c. A new Weibull-G family of distributions. Hacettepe Journal of Mathematics and Statistics 45: 629-647. introduced another two types of Weibull-G families.

Following the formulation by Bourguignon et al. 2014BOURGUIGNON M, SILVA RB & CORDEIRO GM. 2014. The Weibull-G Family of Probability Distributions. Journal of Data Science 12: 53-68., we define the Weibull-Burr XII (WBXII) distribution and provide some of its mathematical quantities which were not addressed by Bourguignon et al. 2014BOURGUIGNON M, SILVA RB & CORDEIRO GM. 2014. The Weibull-G Family of Probability Distributions. Journal of Data Science 12: 53-68.. The new expressions can be helpful for those interested in applying this distribution to real life data.

In a similar approach, we can refer the reader to six other contributed works addressed to specific baselines of the Weibull-G family. These contributions are listed in Table I.

Table I
Contributed works on the Weibull-G family.

The WBXII distribution is obtained by inserting (1) and (2) in equations (3) and (4). Then, its pdf reduces to (for \(x>0\))

\[f(x) = \frac{\alpha\,\beta\,c\,d\,s^{-c}x^{c-1}}{\left[1+(x/s)^c\right]^{1-d}}\,\exp\left\{-\alpha\,[(1+(x/s)^c)^d-1]^{\beta}\right\}[(1+(x/s)^c)^d-1]^{\beta-1},\](5)
where \(\alpha > 0\), \(\beta>0\), \(d > 0\) and \(c > 0\) are shape parameters and \(s > 0\) is a scale parameter. The corresponding cdf is
\[F(x) = 1 - \exp\left\{-\alpha\,[(1+(x/s)^c)^d-1]^{\beta}\right\}.\](6)
Based on equation (6) we note that \(z(x)= [1+ (x/s)^c] ^d\) is the inverse of the BXII survival function, which is identifiable. Then, \(\alpha\, [z(x)- 1]^\beta\) is identifiable and so the WBXII cdf.

Two interpretations of (6) are now presented. First, let \(T\) be a BXII random variable with cdf (1) describing a real life phenomenon. If the random variable \(X\) represents the odds, the risk that this stochastic mechanism following the lifetime \(T\) will not occur at time \(x\) is given by \(G(x;c,d,s)/[1- G(x;c,d,s)]\). If we model the randomness \(X\) of these odds by the Weibull density with scale parameter \(\alpha>0\) and shape parameter \(\beta>0\), the cdf of \(X\) is given by (6). For the second interpretation, we take a WBXII random variable \(X\) and a random variable \(T\) with the Weibull density (for \(t >0\)) defined above. We can write \(P(X \leq x) = F(x) = 1 - \exp\left\{-\alpha\,[(1+(x/s)^c)^d-1]^{\beta}\right\}=P\left(T \leq G(x;c,d,s)/[1- G(x;c,d,s)]\right)\). Since the function \(\kappa(x) = G(x;c,d,s)/[1- G(x;c,d,s)]\) is always monotonic and non–decreasing, we obtain \(T=\kappa(X)\), where the equality of random variables refers to equivalence of distributions. So, if \(X\) has the WBXII cdf (6), then \(T = \kappa(X)\) has a Weibull cdf with the above parameters.

If \(X\) is a random variable with density function (5), we write \(X\sim\text{WBXII}(c,d,s,\alpha,\beta)\). The hazard rate function (hrf) of \(X\) reduces to

\[h(x) = \alpha\,\beta\,c\,d\,s^{-c}x^{c-1}\left[1+(x/s)^c\right]^{d-1}[(1+(x/s)^c)^d-1]^{\beta-1}.\]

The main contributions of this paper are described below:

  1. In the estimation section, we demonstrate that all members of the Weibull-G family present a semi-closed form for the maximum likelihood estimator (MLE) of \(\alpha\). Thus, the MLEs for any member of the Weibull-G family can also be determined from the profile log-likelihood function, which is much simpler.

  2. Bourguignon et al. 2014BOURGUIGNON M, SILVA RB & CORDEIRO GM. 2014. The Weibull-G Family of Probability Distributions. Journal of Data Science 12: 53-68. obtained general mathematical expressions for the Weibull-G family based on an infinite linear combination of exponentiated-G (exp-G) densities. We derive a new linear representation for the WBXII pdf in a simpler form based directly on the BXII model itself. Besides, we provide some important mathematical and statistical properties of the proposed distribution. These results are especially helpful for applications to real lifetime data.

  3. Equation (5) has different forms, including left-skewed, right-skewed, reversed-J, decreasing-increasing-decreasing and bimodal. Plots of the WBXII density function for selected parameter values are displayed in Figure 1. We develop a shiny application that allows the reader to access dynamic plots of the WBXII pdf and hrf1 https://newdists.shinyapps.io/WBXIIdist/. . From those plots, we note that the WBXII density presents bimodality, or the unusual decreasing-increasing-decreasing shape when \(\beta\) is very small, and the baseline shape parameters \(c\) and \(d\) are large.

  4. The proposed distribution overcomes a limitation of its baseline, whose hazard function presents only monotonic and unimodal shapes. The WBXII hrf admits the four main characteristics: decreasing, increasing, upside-down bathtub, and bathtub shaped. These are desirable properties for a lifetime distribution. Figure 2 provides plots of the hrf of \(X\) for selected parameter values.

  5. Equation (5) extends at least twenty lifetime distributions, including new ones. In fact, if we combine the Weibull and its two sub-models (exponential and Rayleigh) with seven special cases (Lomax, Fisk, log-logistic, Weibull, exponential and Rayleigh) of the Burr XII distribution including this distribution itself, the WBXII model can generate twenty descendants. Note that the first three models listed in Table 1 published in 2015 are just special cases of the new distribution. For \(\alpha=\beta=1\), the power generalized Weibull (PGW) (Dimitrakopoulou et al. 2007DIMITRAKOPOULOU T, ADAMIDIS K & LOUKAS S. 2007. A lifetime distribution with an upside-down bathtub-shaped hazard function. IEEE Transactions Reliability 121: 308-311., Nikulin & Haghighi 2006NIKULIN M & HAGHIGHI F. 2006. A chi-squared test for the generalized power Weibull family for the head-and-neck cancer censored data. J Math Sci 133: 1333-1341.) also arises as a special model.

  6. A major advantage of fitting a wider model to real data is that we can easily verify, based on the likelihood ratio (LR) statistics, whether its sub-models (with fewer parameters) can be more properly to the data.

  7. Equation (5) can be reduced to a four-parameter distribution by setting the scale parameter to one (Afify et al. 2018AFIFY AZ, CORDEIRO GM, ORTEGA EMM, YOUSOF HM & BUTT NS. 2018. The Four-Parameter Burr XII Distribution: Properties, Regression Model and Applications. Communications in Statistics - Theory and Methods 47: 2605-2624.) and then it becomes a very competitive model to all well-known four-parameter lifetime distributions such as the beta Weibull, Kumaraswamy Weibull, Kumaraswamy gamma, beta Dagum, among several others.

  8. Although the proposed model has five parameters, it can provide much better fits, based on Anderson-Darling and Kolmogorov-Smirnov statistics, than other models with the same and less number of parameters. This fact is proved empirically in applications to survival analysis and income distribution (see the application section). The survival analysis application illustrates the WBXII superiority to accommodate left-skewed tails, which are very common when the variable of interest is the time to failure of a product or component. The second data set is related to player salaries within a professional sports league. The salary distribution is typically heavy skewed to the right for several professions and also in the macro-economy. For the sports market, the salary distribution is no different from other occupations in terms of the shape. However, according to Rockerbie 2003ROCKERBIE DW. 2003. The economics of professional sports. Lethbridge, Alberta. , it is peculiar because their mean is much higher, and those markets are typically a natural monopoly. The WBXII model showed suitable to accommodate these feature as well.


Plots of the WBXII density with s = 1.

Plots of the WBXII hrf with s = 1.

An implementation in R language (R Core Team 2018R CORE TEAM. 2018. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. URL https://www.R-project.org/.) used to obtain plots, application and simulation for the five-parameter Weibull Burr XII distribution is available in the footnote2 https://drive.google.com/open?id=1an1YbEeX1XHEbSvwIo8jbEAwkLxFJSxP. .

The remainder of the paper is organized as follows. Two useful expansions for the WBXII cdf and pdf are derived. We investigate some of its mathematical properties such as the quantile function (qf), ordinary and incomplete moments, mean deviations, and generating function. We determine the order statistics. The maximum likelihood method is used to estimate the model parameters . Two applications to real data sets are addressed in the application section. Finally, we offer some concluding remarks.

TWO USEFUL EXPANSIONS

Two useful expansions for equations (5) and (6) can be derived by using power series. It follows from Bourguignon et al. 2014BOURGUIGNON M, SILVA RB & CORDEIRO GM. 2014. The Weibull-G Family of Probability Distributions. Journal of Data Science 12: 53-68. that the Weibull-G density function can be expressed as

\[f(x; \alpha,\beta,\pmb{\xi}) = \sum\limits_{j,k=0}^{\infty}\rho_{j,k}\, g(x; \pmb{\xi})G(x; \pmb{\xi})^{(k+1)\beta + j - 1},\]
where
\[\rho_{j,k}= \frac{(-1)^k\, \beta\, \alpha^{k+1}\, \Gamma((k+1)\beta+j+1)}{k!\,j!\,\Gamma((k+1)\beta+1)},\]
and \(\Gamma(\cdot)\) is the gamma function. By replacing \(G(x; \pmb{\xi})\) for (1) and \(g(x; \pmb{\xi})\) for (2), we obtain
\[f(x; \alpha,\beta,\pmb{\xi})=c\,d\,s^{-c}\, \sum\limits_{j,k=0}^{\infty}\rho_{j,k}\,x^{c-1}u^{-d-1}\left(1-u^{-d}\right)^{(k+1)\beta + j - 1},\](7)
where \(u=1+\left(\frac{x}{s}\right)^{c}\). If \(|z|< 1\) and \(b>0\) is a real non-integer, the power series holds
\begin{aligned} \displaystyle (1-z)^{b-1}=\sum_{r=0}^{\infty}\,\frac{(-1)^{r} \Gamma(b)}{r!\,\Gamma(b-r)}\,z^{r}.\end{aligned}(8)
Using the above expansion for \(\big(1-u^{-d}\big)^{[(k+1)\beta + j - 1]-1}\) in equation (7) and, after some algebraic manipulation, we obtain
\[f(x)= \sum_{r=0}^\infty w_{r}\,g(x;c,(r+1)d,s),\](9)
where (for \(r=0,1,\ldots\))
\[w_{r}= \sum^\infty_{k,j=0}\frac{(-1)^{r}\,\rho_{j,k}\,\Gamma((k+1)\beta + j)}{\Gamma((k+1)\beta + j -r)(r+1)!}\](10)
and \(g(x;c,(r+1)d,s)\) is the BXII density function with scale parameter \(s\) and shape parameters \((r+1)d\) and \(c\). Equation (9) reveals that the WBXII density is an infinite linear combination of BXII densities. So, several structural properties of the WBXII distribution can be determined from those BXII properties. By integrating equation (9) gives
\[F(x)=\sum_{r=0}^\infty w_{r}\, G(x;c,(r+1)d,s).\](11)
Equations (9) and (11) are the main results of this section.

MATHEMATICAL PROPERTIES

In this section, we obtain some mathematical quantities of the WBXII distribution including quantile and random number generation, ordinary and incomplete moments, moment generating function (mgf), mean deviations and Bonferroni and Lorenz curves. By determining analytical expressions for those quantities can be more efficient than computing them directly by numerical integration of the density function ((5)).

Density and hazard shapes

The WBXII density and hazard functions are quite flexible as can be noted in Figures 1 and 2. They can take various forms depending on the shape parameters \(\alpha,\,\beta,\,c,\) and \(d\). In this section, we illustrate the exact behavior of these functions for some parameter sets by analyzing their limiting behavior and derivatives of their logarithms with respect to \(x\). In addition, we provide interactive plots that allow observing the behavior of these functions for several parameter combinations.

For the pdf (5), we note that

\[\lim_{x \to 0}f(x)= \begin{cases} \infty &amp; \text{if $\beta,c < 1$},\\ \alpha d s^{-1} &amp; \text{if $\beta=c= 1$},\\ 0 &amp; {\text{if $\beta,c > 1$},}\\ \end{cases}\]
and \(\lim\limits_{x \to \infty}f(x)=0.\)

Some calculations show that

\begin{aligned} \frac{\mathrm{d}\! \log f(x)}{\mathrm{d} x}=&amp;\, \frac{1}{x} \left\{c-1+\frac{c\,(x/s)^c}{1+(x/s)^c}\left[d-1 +\frac{d\,(\beta-1)\,[1+(x/s)^c]^{d}}{[1+(x/s)^c]^{d}-1}\right.\right.\nonumber\\ &amp;\left.\left.-\frac{\alpha\,\beta\,d\,[1+(x/s)^c]^{d}}{\{[1+(x/s)^c]^{d}-1\}^{1-\beta}}\right]\right\}.\end{aligned}(12)
The critical points of the WBXII pdf are the roots of the above equation, and numerical software is required to solve it. Nevertheless, from (12), we can verify that the WBXII density is decreasing if \(\beta\leq1,\,c\leq1,\) and \(d \leq 1\).

We can analyze the limiting behavior of the WBXII hrf for some parameter sets. We verify that

\[\lim_{x \to 0}h(x)= \begin{cases} \infty &amp; \text{if $\beta,c < 1$},\\ \alpha d s^{-1} &amp; \text{if $\beta=c= 1$},\\ 0 &amp; {\text{if $\beta,c > 1$},}\\ \end{cases}\quad % \end{equation*} \text{and}\quad % \begin{equation*} \lim_{x \to \infty}h(x)= \begin{cases} 0 &amp; \text{if $\beta,c,d < 1$},\\ \alpha s^{-1} &amp; \text{if $\beta=c=d = 1$},\\ \infty &amp; {\text{if $\beta,c,d > 1$}.}\\ \end{cases}\]
The critical point of \(h(x)\) are obtained from
\begin{aligned} \frac{\mathrm{d}\! \log h(x)}{\mathrm{d} x}= \frac{1}{x} \left\{c-1+\frac{c\,(x/s)^c}{1+(x/s)^c}\left[d-1 +\frac{d\,(\beta-1)\,[1+(x/s)^c]^{d}}{[1+(x/s)^c]^{d}-1}\right]\right\}=0. \end{aligned}
From the last equation, we can note that: i) if \(\beta<1,\,c<1\) and \(d<1\), then \(\mathrm{d}\!\log h(x)/\mathrm{d}x < 0\) and the hrf is decreasing; ii) if \(\beta=c=d = 1\), then \(\mathrm{d}\!\log h(x)/\mathrm{d}x = 0\) and the hrf is constant in \(\alpha/s\); and iii) if \(\beta>1,\,c>1\) and \(d>1\), then \(\mathrm{d}\!\log h(x)/\mathrm{d}x > 0\) and the hrf is increasing; iv) the parameter \(\alpha\) does not affect the hrf shapes; and v) numerical softwares are required to obtain the root of this equation.

Quantile function and random number generation

The qf of \(X\) follows by inverting (6) as

\[Q(u)=s\left\{\left[\left(\frac{-\log(1-u)}{\alpha}\right)^{1/\beta}+1\right]^{1/d}-1\right\}^{1/c}.\](13)
By setting \(u = 0.5\) in (13) gives the median \(M\) of \(X\). Different quantiles of interest can also be obtained from (13) by replacing appropriate values for \(u\).

If \(U\) is a uniform variate on the unit interval \((0,1)\), then the random variable \(X=Q(U)\) has pdf given by (5). Thus, the qf can be useful to generate observations from the WBXII distribution using the inverse transformation (see Section SIMULATION STUDY for an example). Another motivation to introduce this quantity is its applicability to obtain alternative expressions for the skewness and kurtosis. The Bowley’s skewness (Kenney & Keeping 1962KENNEY J & KEEPING E. 1962. Mathematics of Statistics. 3rd ed. New Jersey: Chapman and Hall Ltda. ) based on quartiles is

\[B=\frac{Q(3/4)-2Q(1/2)+Q(1/4)}{Q(3/4)-Q(1/4)}.\]
The Moors’ kurtosis (Moors 1988MOORS J. 1988. A quantile alternative for kurtosis. The Statistician 37: 25-32. ) based on octiles is
\[K_M=\frac{Q(7/8)-Q(5/8)-Q(3/8)+Q(1/8)}{Q(6/8)-Q(2/8)}.\]
These measures are less sensitive to outliers and may exist even for distributions without moments. These quantile-based measures can be obtained for the WBXII model from (13). See the next section for illustrative examples.

Moments

The \(n\)th ordinary moment of \(X\) can be determined from (9) as

\begin{aligned} \mu^{\prime}_n=E(X^n)=\sum_{r=0}^\infty w_{r} \int_0^\infty x^n\,g(x;d,(r+1)d,s) dx.\end{aligned}

Using a result in Zimmer et al. 1998ZIMMER WJ, KEATS JB & WANG FK. 1998. The Burr XII distribution in reliability analysis. J Qual Technol 30: 386-394., we have (for \(n<c\,d\))

\begin{aligned} \mu^{\prime}_n=s^{n}\,d\sum_{r=0}^\infty (r+1)w_{r} \,B((r+1)d-n\,c^{-1},1+n\,c^{-1}),\end{aligned}(14)
where \(B(a,b)\) is the beta function. The central moments (\(\mu_s\)), cumulants (\(\kappa_s\)) and the skewness and kurtosis of \(X\) can be evaluated from (14) using well-known relationships.

Table II provides a small numerical study by computing the first three moments and the \(B\) and \(K_M\) coefficients for six scenarios, each one with a different parametrization for the WBXII distribution. Figure 3 displays plots of \(B\) and \(K_M\) for some parameter values. In fact, they indicate that the proposed distribution is quite flexible in terms of variation of the moments, skewness and kurtosis. It can accommodate positive and negative values for both skewness and kurtosis coefficients.

Table II
First three moments and \(B\) and \(K_M\) for some scenarios of the WBXII distribution.

Skewness and kurtosis of X for some parameter values.

Incomplete moments

The \(h\)th incomplete moment of \(X\), say \(T_h(y)=\int_{0}^{y}\,x^{h}\,f(x)dx\), can be expressed as

\[T_h(y)= c\,d\,\sum_{r=0}^\infty\, (r+1)\,w_{r}\,\int_{0}^{y}x^{h-1}\,\left(\frac{x}{s}\right)^{c}\,\left[1+\left(\frac{x}{s}\right)^c\right]^{-(r+1)d-1}dx.\]
By setting \(t=\left[1+\left(\frac{x}{s}\right)^{c}\right]^{-1}\) in the last equation, we obtain
\begin{aligned} T_h(y)=d\,s^{h}\sum_{r=0}^\infty \,(r+1)\,w_{r}\,\int_{s^{c}/\left(s^{c}+y^{c}\right)}^{1}\,t^{(r+1)d-\frac{h}{c}-1}\,(1-t)^{\frac{h}{c}}dt.\end{aligned}
Hence, the \(h\)th incomplete moment of \(X\) reduces to (for \(h < c\,d\))
\begin{aligned}T_h(y)=d\,s^{h}\sum_{r=0}^\infty\,(r+1)\,w_{r}\,\,B_{s^{c}/s^{c}+y^{c}}((r+1)d-h\,c^{-1},1+h\,c^{-1}),\end{aligned}(15)
where \(B_{z}(a,b)=\int_z^1 t^{a-1}\,(1-t)^{b-1} dt\) is the upper incomplete beta function.

An important application of the first incomplete moment refers to the mean deviations about the mean and the median, namely

\[\delta_1=2\mu^{\prime}_1F(\mu^{\prime}_1)-2\,T_1(\mu^{\prime}_1) \qquad \text{and} \qquad \delta_2 = \mu^{\prime}_1-2\,T_1(M),\]
respectively, where \(F(\mu^{\prime}_1)\) is easily obtained from (6), \(\mu^{\prime}_1=E(X)\), the median \(M\) of \(X\) follows from (13) as \(M = Q(1/2)\), and (for \(c\,d>1\)) \(T_1(y)\) is the first incomplete moment given by (15) with \(h=1\). An alternative expression for \(T_1(\cdot)\) comes from (9) as
\begin{aligned} T_1(y)=c\, d\, s\,\sum_{r=0}^{\infty}(r+1)w_r \int_{0}^{y}x^c\left[1+\left(\frac{x}{s}\right)^c\right]^{-(r+1)d-1}\mathrm{d}x.\end{aligned}
Setting \(z=(x/s)^c\), we obtain
\begin{aligned} T_1(y)=d\,s\,\sum_{r=0}^{\infty}(r+1)w_r \int_{0}^{\left(\frac{y}{s}\right)^c}z^{1/c}(1+z)^{-(r+1)d-1}\mathrm{d}z.\end{aligned}
Thus,
\begin{aligned} T_1(y)=\frac{c\,d\,s\,y^{c+1}}{1+c}\sum_{r=0}^{\infty}(r+1)\,w_r\,_{2}F_1\left[1+\frac{1}{c},\,\,(r+1)d+1;\,\,2+\frac{1}{c};\,\,-\left(\frac{y}{s}\right)^c\right], \end{aligned}
where
\[_{2}F_1(a,b;c;x)=\sum_{k=0}^{\infty}\frac{(a)_k (b)_k}{(c)_k}\frac{x^k}{k!}\]
is the hypergeometric function and \((a)_k=a (a+1) (a+k-1)\) is the (rising) Pochhammer symbol if \(k>1\) and \((a)_0=1\).

The above results are related to the Bonferroni and Lorenz curves. These curves are important in economics for studying income and poverty but can be useful in demography, reliability, insurance, medicine, and several other fields. For a given probability \(\pi\), they are defined by \(B(\pi)=T_1(q)/ (\pi \mu^{\prime}_1)\) and \(L(\pi) = T_1(q)/\mu^{\prime}_1,\) respectively, where \(q = Q(\pi)\) is given by (13). If \(\pi\) is the proportion of units whose income is lower than or equal to \(q\), the values of \(L(\pi)\) yield fractions of the total income and \(B(\pi)\) refers to the relative income levels.

Generating function

The mgf of \(X\) is defined by \(M(t)=E(\mathrm{e}^{tX})\). Let \(M_{d}(t)\) be the mgf of the BXII\((c,d,s)\) distribution. We can write from (9)

\begin{aligned} M(t)=\sum_{r=0}^\infty w_{r}\,M_{(r+1)d}(t),\end{aligned}(16)
where \(M_{(r+1)d}(t)\) is the BXII\((s,(r+1)d,c)\) generating function and \(M_{d}(t)\) is given by
\begin{aligned} M_{d}(t)= cds^{-c}\int_{0}^\infty x^{c-1}\mathrm{e}^{tx}\left[1+\left(\frac{x}{s}\right)^c\right]^{-d-1}\mathrm{d}x.\end{aligned}(17)
Guerra et al. 2020GUERRA RR, PEÑA RAMÍREZ FA, PEÑA RAMÍREZ MR & CORDEIRO GM. 2020. A note on the density expansion and generating function of the beta Burr XII. Math Method Appl Sci 43: 1817-1824. considered the following expansion in the above equation
\begin{aligned} \left[1+\left(\frac{x}{s}\right)^c\right]^{-d-1}=\sum_{j=0}^\infty {-d-1\choose j}\left[\left(\frac{x}{s}\right)^{cj}\boldsymbol{1}_{(0,s]}\left(x\right)+\left(\frac{x}{s}\right)^{-c(j+d+1)}\boldsymbol{1}_{(s,\infty)}\left(x\right)\right],\end{aligned}(18)
where \(\boldsymbol{1}_{A}(x)\) denotes the indicator function over a given set of real numbers \(A\), i.e., \(\boldsymbol{1}_{A}(x)=1\) if \(x \in A\) and \(\boldsymbol{1}_{A}(x)=0\) elsewhere. Combining (17) and (18), and after some algebra, Guerra et al. 2020GUERRA RR, PEÑA RAMÍREZ FA, PEÑA RAMÍREZ MR & CORDEIRO GM. 2020. A note on the density expansion and generating function of the beta Burr XII. Math Method Appl Sci 43: 1817-1824. expressed the BXII mgf as an infinite sum of incomplete gamma functions given by (for \(t<0\))
\begin{aligned} M_d(t)=&amp;\,c\,d\,\sum_{j=0}^\infty{-d-1\choose j}\left[ (-s t)^{-(j+1) c}\,\gamma\left((j+1)c,-s t\right)\right.\nonumber\\ &amp;+\left. (-st)^{(d+j) c}\,\Gamma\left(-(d+j)c,-st\right)%E_{1+c(d+j)}\left(-st\right) \right],\end{aligned}(19)
where \(\gamma(a,z)=\int^z_0 t^{a-1}\,\mathrm{e}^{-t} \mathrm{d}t\) and \(\Gamma(a,z)=\int^\infty_z t^{a-1}\,\mathrm{e}^{-t} \mathrm{d}t\) are the lower and upper incomplete gamma functions, respectively. Hence, for \(t<0\), the mgf of \(X\) can be follows from (16) and (19) as a double summation
\begin{aligned} M(t)&amp;=&amp;c\,d\sum_{i=0}^\infty (r+1)w_{r}\, \sum_{j=0}^\infty{-(r+1)d-1\choose j}\left[ (-st)^{-(j+1)c}\,\gamma\left((j+1)c,-st\right)\right.\nonumber\\ &amp;+&amp;\left.(-st)^{c[(b+r)d+j]}\,\Gamma\left(-c\left[(r+1)d+j\right],-st\right)%E_{1+c[(b+i)d+j]}\left(-st\right) \right]. \end{aligned}(20)
Equation (20) is the main result of this section.

Stress-strength reliability

Let \(X_1\) be the life of a component with a random strength that is subjected to a random stress \(X_2\). We can define stress-strength reliability as \(R= P (X_2 < X_1 )=\int_0^\infty f_1(x)\,F_2(x)\text{d}x\), i.e., the component fails when the stress applied to it exceeds the strength (\(X_2 > X_1\)); otherwise, the component will function well. This measure is very useful in reliability.

Let \(X_1\) and \(X_2\) have independent WBXII\((c,d_1,s,\alpha_1,\beta_1)\) and WBXII\((c,d_2,s,\alpha_2,\beta_2)\) distributions, respectively, with the same shape parameter \(c\) and scale parameter \(s\). We can derive \(R\) using the results in (9) and (11). Note that the pdf of \(X_1\) and cdf of \(X_2\) can be expressed as

\[f_1(x)= \sum_{m=0}^\infty w_{m}\,g(x;c,(m+1)d_1,s)\,\,\,\,\text{and}\,\,\,\, F_2(x)= \sum_{n=0}^\infty w_{n}\,G(x;c,(n+1)d_2,s),\]
respectively, where \(w_m\) and \(w_n\) are given by (10). Thus, setting \(u=1+\left(\frac{x}{s}\right)^c\), we obtain
\begin{aligned} R=\sum_{m,n=0}^\infty\frac{(n+1)d_2\, w_m\,w_n\,}{(m+1)d_1+(n+1)d_2}.\end{aligned}

ORDER STATISTICS

Order statistics are important tools in many areas of statistical theory and practice. Let \(X_1,\cdots,X_n\) be a random sample of the Weibull-G family and \(X_{i:n}\) the \(i\)th order statistic. The density \(f_{i:n}(x)\) of \(X_{i:n}\) has the form

\begin{aligned} f_{i:n}(x)= \frac{1}{B(i,n-i+1)}\sum\limits_{j=0}^{n-i}(-1)^{j}\,{n-i \choose j}f(x)\,F(x)^{i+j-1}.\end{aligned}(21)
Setting \(u=1+(x/s)^c\) and using the power series in (8), we can rewrite \(F(x)^{i+j-1}\) as
\begin{aligned} \left\{1-\exp\left[-\alpha\,\frac{\left(1-u^{-d}\right)^\beta}{u^{-d\beta}}\right]\right\}^{i+j-1}=\sum_{k=0}^\infty \frac{(-1)^k\,(i+j-1)!}{k!\,(i+j-k-1)!}\exp\left\{-\alpha\,k\,\frac{\left(1-u^{-d}\right)^\beta}{u^{-d\beta}}\right\}.\end{aligned}
Inserting the above expansion in (21) and after some algebra, we obtain
\begin{aligned} f_{i:n}(x)=&amp;\, \frac{\alpha\,\beta\,c\,d\,s^{-c}x^{c-1}\,\,u^{-d-1}}{B(i,n-i+1)}\frac{\left(1-u^{-d}\right)^{\beta-1}}{u^{-d(\beta+1)}}\sum_{k=0}^\infty \sum\limits_{j=0}^{n-i}\,{n-i \choose j}\frac{(-1)^{j+k}\,(i+j-1)!}{k!\,(i+j-k-1)!}\\ &amp;\times\,\exp\left\{-\alpha\,(1+k)\,\frac{\left(1-u^{-d}\right)^\beta}{u^{-d\beta}}\right\}.\nonumber\end{aligned}(22)
By expanding the exponential function in the last equation, rewriting \((u^{-d})^\beta\) as \(\left[1-(1-u^{-d})\right]^\beta\), considering the power series given by (for \(|z| < 1\) and \(b> 0\) real non-integer)
\begin{aligned} (1-z)^{-b}=\sum_{j=0}^\infty\frac{\Gamma(b+j)}{j!\,\Gamma(b)}\end{aligned}
and inserting both expansions in equation (22), we obtain
\begin{aligned} f_{i:n}(x)=&amp;\, \frac{\beta\,c\,d\,s^{-c}x^{c-1}\,\,u^{-d-1}}{B(i,n-i+1)}\sum_{k,\,l,\,m=0}^\infty \sum\limits_{j=0}^{n-i}\,{n-i \choose j}\frac{(-1)^{j+k+l}\,\alpha^{l+1}\,(1+k)^l\,(i+j-1)!\,}{k!\,l!\,(i+j-k-1)!}\\ &amp;\times\,\frac{\Gamma((l+1)\beta+1+m)}{m!\,\Gamma((l+1)\beta+1)}\left(1-u^{-d}\right)^{(l+1)\beta+m-1}.\nonumber\end{aligned}
Finally, expanding \((1-u^{-d})^{(l+1)\beta+m-1}\) in the previous expression as in (8) and after some algebra, we can write
\[f_{i:n}(x)= \sum_{q=0}^\infty \upsilon_{q}\,g(x;c,(q+1)d,s),\](23)
where (for \(q=0,1,\ldots\))
\begin{aligned} \upsilon_{q}=&amp; \sum_{k,\,l,\,m=0}^\infty \sum\limits_{j=0}^{n-i}\,{n-i \choose j}\frac{(-1)^{j+k+l+q}\,\beta\,\alpha^{l+1}\,(1+k)^l\,(i+j-1)!}{k!\,l!\,m!\,(q+1)!\,B(i,n-i+1)\,(i+j-k-1)!}\\ &amp;\times\,\frac{\Gamma((l+1)\beta+1+m)\,\Gamma((l+1)\beta+m)}{\Gamma((l+1)\beta+1)\,\Gamma((l+1)\beta+m-q)}\end{aligned}
and \(g(x;c,(q+1)d,s)\) is the BXII density function with scale parameter \(s\) and shape parameters \((q+1)d\) and \(c\). Equation (23) is the main result of this section. Based on this linear representation, we can obtain some structural properties of \(X_{i:n}\) using a similar procedure as that one applied for the WBXII mathematical properties.

MAXIMUM LIKELIHOOD ESTIMATION

The maximum likelihood method is an important technique employed to estimate model parameters in distributions. Bourguignon et al. 2014BOURGUIGNON M, SILVA RB & CORDEIRO GM. 2014. The Weibull-G Family of Probability Distributions. Journal of Data Science 12: 53-68. determined the MLEs for the Weibull-G parameters from the total log-likelihood function. In this section, we demonstrate alternatively that the MLEs of theWeibull-G family can be determined based on the profile log-likelihood function. We also provide the MLEs for the WBXII distribution.

The Weibull-G profile log-likelihood

Let \(x_1,\cdots,x_n\) be observed values from the Weibull-G family with parameter vector \(\bm{\Theta}= (\alpha, \beta, \pmb{\xi}^\top)^\top\). As shown by Bourguignon et al. 2014BOURGUIGNON M, SILVA RB & CORDEIRO GM. 2014. The Weibull-G Family of Probability Distributions. Journal of Data Science 12: 53-68., the total log-likelihood function for \(\bm{\Theta}\) has the form

\begin{aligned} \ell(\bm{\Theta}) =&amp; n\,\log(\alpha\,\beta)-\alpha\sum\limits_{i=1}^{n}H(x_i; \pmb{\xi})^\beta + \beta\sum\limits_{i=1}^{n}\log[H(x_i; \pmb{\xi})]\\ &amp;- \sum\limits_{i=1}^{n}\log[G(x_i; \pmb{\xi})] - \sum\limits_{i=1}^{n}\log[1-G(x_i; \pmb{\xi})],\nonumber\end{aligned}(24)
where \(H(x;\pmb{\xi}) = G(x; \pmb{\xi})/\left[1-G(x; \pmb{\xi})\right]\). Thus, the first component of the score vector \(U(\bm{\Theta}) = \left(U_{\alpha}, U_{\beta}, U_{\pmb{\xi}}^\top\right)^\top\) is
\begin{aligned} U_{\alpha} &amp;= \frac{n}{\alpha} - \sum\limits_{i=1}^{n}H(x_i; \pmb{\xi})^\beta.\end{aligned}

For fixed \(\beta\) and \(\pmb{\xi}\), a semi-closed MLE for \(\alpha\) follows from \(U_{\alpha}=0\) as

\begin{aligned} \hat{\alpha}(\beta,\pmb{\xi}) &amp;= \frac{n}{\sum\limits_{i=1}^{n}H(x_i; \pmb{\xi})^\beta}.\end{aligned}
By replacing \(\alpha\) by \(\hat{\alpha}\) in (24), we obtain the Weibull-G profile log-likelihood for \(\pmb{\Theta_p}= (\beta, \pmb{\xi})^\top\) as
\begin{aligned} \ell(\pmb{\Theta_p}) = &amp;\,n\,\log(n\,\beta) + \beta\sum\limits_{i=1}^{n}\log[H(x_i; \pmb{\xi})] - \sum\limits_{i=1}^{n}\log[G(x_i; \pmb{\xi})] - \sum\limits_{i=1}^{n}\log[1-G(x_i; \pmb{\xi})]\\ &amp;-n\,\log\left[\sum\limits_{i=1}^{n}H(x_i; \pmb{\xi})^\beta\right] -n.\nonumber \end{aligned}(25)
Hence, the MLE \(\pmb{\widehat{\Theta}_p}\) of the parameter vector \(\pmb{\Theta_p}\) can be numerically found by maximizing (25) and the MLE of \(\alpha\) is just \(\hat{\alpha}(\pmb{\widehat{\Theta}_p})\). Note that the maximization of the profile log-likelihood might be simpler since it involves one less parameter.

The WBXII MLEs

Let \(x_1,\cdots, x_n\) be a random sample of size \(n\) from the WBXII\((c,d,s,\alpha,\beta)\) distribution. Let \(\pmb{\Theta} = (c,d,s,\alpha,\beta)^{T}\) be the parameter vector. The \(\log\)-likelihood function for \(\pmb{\Theta}\) follows as

\begin{aligned} \ell(\pmb{\Theta}) =&amp;\, n\log (\alpha \, \beta \, c \, d \, s^{-1} ) + (c-1)c^{-1}\sum_{i=1}^{n} \log (u_i-1) +(d-1)\sum_{i=1}^{n} \log u_i\\ &amp;-\alpha \, \sum_{i=1}^{n} \left(u_i^d-1\right)^\beta +(\beta - 1)\sum_{i=1}^{n} \log \left(u_i^d-1\right),\nonumber\end{aligned}(26)
where \(u_i=1+\left(\frac{x_i}{s}\right)^c\). The estimates of the model parameters can be obtained by maximizing (26).

Alternatively, we can be differentiating (26) and solving the resulting nonlinear likelihood equations. The components of the score vector \(\pmb{U(\Theta)}\) are

\begin{aligned} \displaystyle U_c(\pmb{\Theta}) =&amp;\, n\,c^{-1}+ c^{-1}\sum_{i=1}^{n} \log (u_i-1) +(d-1)\,c^{-1}\sum_{i=1}^{n} (u_i-1) \log \left(u_i-1\right) u_i^{-1} \\ &amp;- \alpha \, \beta \, d\,(c\,s)^{-1} \sum_{i=1}^{n} (u_i-1) \log \left(u_i-1\right) u_i^{d-1} \left(u_i^d-1\right)^{\beta -1}\\ &amp;+d \,(\beta -1)\,(c\,s)^{-1}\sum_{i=1}^{n} (u_i-1) \log \left(u_i-1\right) u_i^{d-1} \left(u_i^d-1\right)^{-1},\\ \displaystyle U_d(\pmb{\Theta}) =&amp;\, n\,d^{-1}+ \sum_{i=1}^{n}\log u_i +(\beta -1)\sum_{i=1}^{n} u_i^d \log u_i \left(u_i^d-1\right)^{-1}\\ &amp;- \alpha \beta \sum_{i=1}^{n} u_i^d \log u_i \left(u_i^d-1\right)^{\beta -1},\\ \displaystyle U_s(\pmb{\Theta}) =&amp;\, -c \, n \, s^{-1} + c(d-1)s^{-1} \sum_{i=1}^{n} (u_i -1) u_i^{-1} \\ &amp;- c \, d \, (\beta -1)\, s^{-1} \sum_{i=1}^{n}(u_i-1) u_i^{d-1} \left(u_i^d-1\right)^{-1}\\ &amp;+\alpha \, \beta \, c \, d \, s^{-1} \sum_{i=1}^{n}(u_i-1) u_i^{d-1} \left(u_i^d-1\right)^{\beta -1},\\ \displaystyle U_\alpha(\pmb{\Theta}) =&amp;\, n \, \alpha^{-1}-\sum_{i=1}^{n} \left( u_i^d-1\right)^\beta \end{aligned}
and
\begin{aligned} \displaystyle U_\beta(\pmb{\Theta}) &amp;= n \, \beta^{-1}+\sum_{i=1}^{n} \log \left( u_i^d-1\right)-\alpha\sum_{i=1}^{n} \left(u_i^d-1\right)^{\beta } \log \left(u_i^d-1\right).\end{aligned}
Setting these expressions to zero, \(\pmb{U(\Theta)}={\bf 0}\), and solving them simultaneously yields the MLEs of the five parameters. These equations cannot be solved analytically, but some statistical softwares can be used to solve them numerically using iterative methods such as the Newton-Raphson type algorithms.

For fixed \(c, d, s\) and \(\beta\), the MLE of \(\alpha\) is

\[ \hat{\alpha}(c,d,s,\beta)=\frac{n}{\sum_{i=1}^n (u_i^{d}-1)^\beta}.\](27)
By fixing \(x_1,\cdots,x_n\), it is easy to verify from (27) that

  • \(\hat \alpha \to 1\) when \(\beta \to 0^+\);

  • \(\hat \alpha \to \infty\) when \(s \to \infty\);

  • \(\hat \alpha \to 0^+\) when \(s \to 0^+\);

  • \(\hat \alpha \to 0^+\) when \(d \to \infty\);

  • \(\hat \alpha \to \infty\) when \(d \to 0^+\).

By replacing \(\alpha\) by (27) in equation (26) and letting \(\pmb{\Theta_p} = (c,d,s,\beta)\), the profile log-likelihood function for \(\pmb{\Theta_p}\) has the form

\begin{aligned} \ell(\pmb{\Theta_p}) &amp;= n\log (n \, \beta \, c \, d \, s^{-1} ) + (c-1)c^{-1}\sum_{i=1}^{n} \log (u_i-1) +(d-1)\sum_{i=1}^{n} \log u_i\\ &amp;- n \log \, \sum_{i=1}^{n} \left(u_i^d-1\right)^\beta +(\beta - 1)\sum_{i=1}^{n} \log \left(u_i^d-1\right) - n.\nonumber\end{aligned}28

The components of the score vector \(\pmb{U(\Theta_p)}\) of (28) are

\begin{aligned} \displaystyle U_c(\pmb{\Theta_p}) &amp;= n\,c^{-1}+ c^{-1}\sum_{i=1}^{n} \log (u_i-1) +(d-1)c^{-1}\sum_{i=1}^{n} \left(u_i-1\right)\log(u_i-1) u_i^{-1} \\ &amp;- n \, \beta \, d\,c^{-1}\left[ \sum_{i=1}^n \left(u_i^d-1\right)^\beta \right] ^{-1}\,\sum_{i=1}^{n} (u_i-1)u_i^{d-1} \left(u_i^d-1\right)^{\beta -1} \log \left(u_i-1\right)\\ &amp;+d \,(\beta -1)\,c^{-1}\sum_{i=1}^{n} (u_i-1) u_i^{d-1} \left(u_i^d-1\right)^{-1}\log \left(u_i-1\right),\\\end{aligned}
\begin{aligned} \displaystyle U_d(\pmb{\Theta_p}) &amp;= n\,d^{-1}+ \sum_{i=1}^{n}\log u_i +(\beta -1)\sum_{i=1}^{n} u_i^d \left(u_i^d-1\right)^{-1} \log u_i\\ &amp;- n \beta \left[{\sum_{i=1}^n \left(u_i^d-1\right)^\beta}\right]^{-1}\sum_{i=1}^{n} u_i^d \left(u_i^d-1\right)^{\beta -1}\log u_i,\\ \displaystyle U_s(\pmb{\Theta_p}) &amp;= - n \,c \, s^{-1} + c(d-1)s^{-1} \sum_{i=1}^{n} (u_i-1)u_i^{-1} \\ &amp;- c \, d \, (\beta -1)\, s^{-1} \sum_{i=1}^{n}(u_i-1)\, u_i^{d-1} \left(u_i^d-1\right)^{-1}\\ &amp;+n \, \beta \, c \, d \, s^{-1} \left[ {\sum_{i=1}^n \left(u_i^d-1\right)^\beta}\right]^{-1}\sum_{i=1}^{n}(u_i-1)\,u_i^{d-1} \left(u_i^d-1\right)^{\beta -1} \end{aligned}
and
\begin{aligned} \displaystyle U_\beta(\pmb{\Theta_p}) &amp;= n \, \beta^{-1}+\sum_{i=1}^{n} \log \left(u_i^d-1\right)-n\left[{\sum_{i=1}^n \left(u_i^d-1\right)^\beta}\right]^{-1}\sum_{i=1}^{n} \left(u_i^d-1\right)^{\beta } \log \left(u_i^d-1\right).\end{aligned}
Solving the equations \(\pmb{U(\Theta_p)}={\bf 0}\) simultaneously yields the MLEs of \(c\), \(d\), \(s\) and \(\beta\). The MLE of \(\alpha\) is just \(\hat{\alpha}(\hat{c},\hat{d},\hat{s},\hat{\beta})\). The maximization of the profile log-likelihood might be simpler since it involves only four parameters.

For interval estimation of the model parameters, we require the observed information matrix \(\pmb{J(\Theta)}\), whose elements can be obtained from the authors upon request. Under standard regularity conditions, the approximate confidence intervals for the model parameters can be constructed based on the multivariate normal \(N_{5}(0, \pmb{J(\widehat{\Theta})}^{-1})\) distribution.

A major advantage of fitting the proposed distribution to a real data set is that we can easily verify, based on the likelihood ratio (LR) statistics, whether any of its sub-models (with fewer parameters) can be preferred to these data.

The maximized (unrestricted and restricted) log-likelihoods are useful to compute LR statistics to verify if WBXII sub-models (with fewer parameters) can be preferred for fitting a given data set. This is a major advantage since the WBXII model extends at least twenty lifetime distributions, including new ones. Let \(\pmb{\Theta_{0}}\) be the restricted parameter vector for a given WBXII sub-model. Thus, hypothesis tests of the type \(H_{0}:\pmb{\Theta}=\pmb{\Theta}_{0}\) versus \(H:\pmb{\Theta} \neq \pmb{\Theta}_{0}\) can be performed using LR statistics. For example, the LR statistic for testing \(H_{0}:\alpha=\beta=1\) (versus \(H:H_{0}\,\,\text{is not true}\)), thus comparing the WBXII and PGW distributions, is

\[w=2\{l(\widehat{c},\widehat{d},\widehat{s},\widehat{\alpha},\widehat{\beta}) -l(\widetilde{c},\widetilde{d},\widetilde{s},1,1)\}\overset{d}{\to} \chi^2_2,\]
where \(\widehat{s}\), \(\widehat{d}\), \(\widehat{c}\), \(\widehat{\alpha}\), and \(\widehat{\beta}\) are the MLEs under \(H\), \(\widetilde{c}\), \(\widetilde{d},\) and \(\widetilde{s}\) are the estimates under \(H_{0}\) and \(\pmb{\Theta_{0}}=(c,d,s,1,1)^\top\).

SIMULATION STUDY

In this section, we evaluate the performance of the MLEs of the parameters of the WBXII distribution. We conduct Monte Carlo simulations based on 10,000 replications under five different parameter combinations and sample size \(n=\)100, 250 and 500. The simulation study is performed using the optim subroutine and SANN algorithm in R software for maximizing the log-likelihood in (26). Table III reports the empirical mean estimates and corresponding root mean squared errors (RMSEs). For all parameter combinations, we note that the empirical biases and RMSEs decrease when the sample size increases in agreement with the first-order asymptotic theory.

Table III
Mean estimates and RMSEs of the WBXII distribution.

APPLICATIONS

In this section, we illustrate the usefulness of the WBXII distribution for modeling income and lifetime data. The first data set represents the times to failure (\(10^3\) h) of 40 suits of turbochargers in one type of diesel engine (Xu et al. 2003XU K, XIE M, TANG LC & HO SL. 2003. Application of neural networks in forecasting engine systems reliability. Appl Soft Comput 2: 255-268.). These data were previously considered by Benkhelifa 2016BENKHELIFA L. 2016. The Weibull Birnbaum-Saunders Distribution: Properties and Applications. arXiv:1502.05180.. The second data set consists in annual salaries of 862 professional baseball players of the Major League Baseball for the season 2016. The data are measured in American dollars and are available for download at https://www.usatoday.com/sports/mlb/salaries/2016/player/all/. Both data sets are available in the appendix.

We use these two data sets to compare the fits of the WBXII distribution with other six related models, i.e., the beta Burr XII (BBXII), Kumaraswamy Burr XII (KwBXII), BXII, LL, PGW, and W distributions. The seven competitive models are defined as follows. The BBXII pdf is

\[f(x)=\frac{cd (x)^{c-1}}{s^{c}B(\alpha,\beta)}\Big\{1-\big[1+(x/s)^{c}\big]^{-d}\Big\} ^{\alpha-1}\big[1+(x/s)^{c}\big]^{-(d\beta+1)},\qquad x>0,\]
where \(\alpha>0\) and \(\beta>0\) are shape parameters; the KwBXII density is
\begin{aligned} f(x)&amp;=&amp;\alpha\,\beta\,c\,d\,s^{-c}x^{c-1}\left[1+\left(\frac{x}{s}\right)^{c}\right]^{-d-1} \left\{1-\left[1+\left(\frac{x}{s}\right)^{c}\right]^{-d}\right\}^{\alpha-1}\,\times\nonumber\\ &amp;&amp;\left[1-\left\{1-\left[1+\left(\frac{x}{s}\right)^{c}\right]^{-d}\right\}^{\alpha}\right]^{\beta-1},\,\,\,x>0,\end{aligned}
where \(\alpha>0\) and \(\beta>0\) are shape parameters; the BXII density is given by equation (2); the LL is defined by taking \(s=m^{-1}\) and \(d=1\) in (2); and the PGW and W distributions are sub-models of (5) by taking \(\alpha=\beta=1\) and \(\alpha=\beta=d=1\), respectively.

In each case, the parameters are estimated by maximum likelihood using the AdequacyModel script in the R software (Marinho et al. 2016MARINHO PRD, BOURGUIGNON M & DIAS CRB. 2016. AdequacyModel: Adequacy of Probabilistic Models and General Purpose Optimization. URL https://CRAN.R-project.org/package=AdequacyModel. R package version 2.0.0.
https://CRAN.R-project.org/package=Adequ...
). We report the MLEs and their corresponding standard errors. We present the following goodness-of-fit statistics: the Akaike information criteria (AIC), consistent Akaike information criteria (CAIC), Hannan-Quinn information criteria (HQIC), corrected Anderson-Darling statistic (\(A^*\)) (Chen & Balakrishnan 1995CHEN G & BALAKRISHNAN N. 1995. A general purpose approximate goodness-of-fit test. J Qual Technol 27: 154-161. ) and Kolmogorov-Smirnov (KS) statistic. The lower values of these statistics are associated with better fits. We also compute the LR statistics for testing WBXII sub-models.

Turbochargers failure time

Table IV provides some descriptive statistics of the turbochargers failure time data. Note that these data present negative skewness (S) and kurtosis (K) and have an amplitude of 7.4. We also have close values for the mean and median. This descriptive summary indicates that the turbochargers data follow a power-law distribution with a left-skewed tail.

Table IV
Descriptive statistics for turbochargers data.

Table V lists the MLEs for the fitted models to these data and their corresponding standard errors. For all fitted models, the parameter estimates are significant. Table VI gives the goodness-of-fit statistics. The WBXII distribution has the lowest values for all statistics. Note that the WBXII is quite competitive with the W distribution. However, the W model may not be an effective alternative for modeling left-skewed data. Table VII provides the LR statistics for the PGW and W fitted models. By considering a significance level of 10%, we may reject both sub-models in favor of the WBXII distribution. It is another clear evidence of the WBXII superiority for modeling these data. Figure 4 displays the histogram and the estimated densities with lower values for goodness-of-fit statistics. We note that the WBXII yields a good adjustment to the current data. In fact, the wider model is more accurate than the W distribution for modeling the right tail and is quite competitive with the BBXII distribution. Thus, we can conclude from Figure 4 and Tables VI and VII that the WBXII model provides the best fit to the turbocharges failure time data.

Table V
MLEs of the model parameters and their standard errors in parentheses.
Table VI
Goodness-of-fit statistics for the fits to the turbochargers failure time data.
Table VII
LR statistics for the fits to the turbochargers failure time data.

Histogram and estimated densities of the WBXII, BBXII and W models for the turbochargers failure time data.

Histogram and estimated densities of the WBXII, BBXII and KwBXII models for the baseball players data.

Baseball players salaries

Some descriptive statistics for the baseball players data are provided in Table VIII. These data present positive values for the S and K coefficients, thus indicating right-skew data. We have a high amplitude, variance, and SD. We also note that the mean and median are not so close. This behavior is quite common in income data sets.

Table VIII
Descriptive statistics for baseball players data.

Table IX provides the MLEs and their standard errors for the seven models fitted to the baseball players data. We have significant estimates for all parameters of these models. Table X lists some goodness-of-fit measures for the fitted models. The WBXII distribution presents the lowest values for all statistics. These results indicate that the WBXII distribution yields a better fit than the other fitted models to the baseball players data. The results for the LR tests are given in Table XI. Clearly, we reject the PGW and W distributions in favor of the wider model. So, there is a strong evidence of the potential need for the extra shape parameters of the WBXII in the second application. Figure 5 displays the fitted WBXII, BBXII and KwBXII densities and the histogram for the baseball players data. They confirm that the WBXII model yields the best fit. Finally, we can conclude that the WBXII is an effective alternative to modeling lifetime (see the first data set) and income (see the second data set) data, especially when they present power-law tails. It is quite competitive to the classical Weibull distribution and other BXII generalizations.

Table IX
MLEs of the model parameters and corresponding standard errors in parentheses.
Table X
Goodness-of-fit statistics for the fitted models for baseball players data.
Table XI
Likelihood ratio statistics for the fits to the baseball players data.

CONCLUDING REMARKS

The five-parameter Weibull Burr XII distribution is introduced and studied in detail. The proposed model extends at least twenty lifetime distributions, including new ones. Its hazard rate function can be increasing, decreasing, upside-down bathtub, and bathtub-shaped. It is also very flexible in terms of the density function, which has several forms including left-skewed, right-skewed, reversed-J, and bimodal. We emphasize that a shiny application is developed to provide interactive plots and illustrate the behavior of those functions for several parameter combinations. Some mathematical properties of the proposed model are presented, including the ordinary and incomplete moments, quantile and generating functions, mean deviations, stress-strength reliability, and order statistics. We estimate the model parameters using maximum likelihood, present the components of the score vector and the profile log-likelihood. We also derive a general result for the Weibull-G family, which presents a semi-closed form for the maximum likelihood estimator of the parameter \(\alpha\). We provide applications to real lifetime and income data sets. They illustrate the usefulness of the proposed distribution for modeling these kinds of data and also prove empirically that the WBXII distribution is quite competitive to other known Burr XII and Weibull generalizations.

REFERENCES

  • AFIFY AZ, YOUSOF HM, CORDEIRO GM, ORTEGA EMM & NOFAL ZM. 2016. The Weibull Fréchet distribution and its applications. Journal of Applied Statistics 43: 2608-2626.
  • AFIFY AZ, CORDEIRO GM, ORTEGA EMM, YOUSOF HM & BUTT NS. 2018. The Four-Parameter Burr XII Distribution: Properties, Regression Model and Applications. Communications in Statistics - Theory and Methods 47: 2605-2624.
  • BENKHELIFA L. 2016. The Weibull Birnbaum-Saunders Distribution: Properties and Applications. arXiv:1502.05180.
  • BOURGUIGNON M, SILVA RB & CORDEIRO GM. 2014. The Weibull-G Family of Probability Distributions. Journal of Data Science 12: 53-68.
  • BRZEZIŃSKI M. 2013. Parametric Modelling of Income Distribution in Central and Eastern Europe. Central European Journal of Economic Modelling and Econometrics 5: 207-230.
  • BURR IW. 1942. Cumulative frequency functions. Annals of Mathematical Statistics 13: 215-232.
  • CHEN G & BALAKRISHNAN N. 1995. A general purpose approximate goodness-of-fit test. J Qual Technol 27: 154-161.
  • CHOTIKAPANICH D, GRIFFITHS W & RAO DSP. 2013. Calculating Poverty Measures from the Generalised Beta Income Distribution. The Economic Record 89: 46-66.
  • CIRILLO P. 2010. An analysis of the size distribution of Italian firms by age. Physica A 389: 459-466.
  • CORDEIRO GM, ORTEGA EMM, & RAMIRES TG. 2015. A new generalized Weibull family of distributions: mathematical properties and applications. J Stat Distrib Applic 2: 1-25.
  • DIMITRAKOPOULOU T, ADAMIDIS K & LOUKAS S. 2007. A lifetime distribution with an upside-down bathtub-shaped hazard function. IEEE Transactions Reliability 121: 308-311.
  • GUERRA RR, PEÑA RAMÍREZ FA, PEÑA RAMÍREZ MR & CORDEIRO GM. 2020. A note on the density expansion and generating function of the beta Burr XII. Math Method Appl Sci 43: 1817-1824.
  • GUPTA R, GUPTA P & GUPTA R. 1998. Modeling failure time data by Lehman alternatives. Communications in Statistics - Theory and Methods 27: 887-904.
  • JÄNTTI M & JENKINS SP. 2010. The impact of macroeconomic conditions on income inequality. J Econ Inequal 8: 221-240.
  • KENNEY J & KEEPING E. 1962. Mathematics of Statistics. 3rd ed. New Jersey: Chapman and Hall Ltda.
  • KUMAR S, CHANDRA N & KHAN M. 2013. On the Reliability Estimation of Burr Type XII Distribution. Safety and Reliability 33: 29-40.
  • MARINHO PRD, BOURGUIGNON M & DIAS CRB. 2016. AdequacyModel: Adequacy of Probabilistic Models and General Purpose Optimization. URL https://CRAN.R-project.org/package=AdequacyModel R package version 2.0.0.
    » https://CRAN.R-project.org/package=AdequacyModel
  • MEROVCI F & ELBATAL I. 2015. Weibull Rayleigh Distribution: Theory and Applications. Appl Math Info Sci 4: 2127-2137.
  • MOORS J. 1988. A quantile alternative for kurtosis. The Statistician 37: 25-32.
  • NIKULIN M & HAGHIGHI F. 2006. A chi-squared test for the generalized power Weibull family for the head-and-neck cancer censored data. J Math Sci 133: 1333-1341.
  • OGUNTUNDE P, BALOGUN O, OKAGBUE H & BISHOP S. 2015. The Weibull-Exponential Distribution: Its Properties and Applications. J Appl Sci 15: 1305-1311.
  • R CORE TEAM. 2018. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. URL https://www.R-project.org/.
  • ROCKERBIE DW. 2003. The economics of professional sports. Lethbridge, Alberta.
  • SINGH SK & MADDALA GS. 1975. A stochastic process for income distribution and tests for income distribution functions. In: ASA Proceedings of the Business and Economic Statistics Section, p. 551-553.
  • SINGH SK & MADDALA GS. 1976. A function for the size distribution of incomes. Econometrica 44: 963-970.
  • TAHIR MH, CORDEIRO GM, MANSOOR M & ZUBAIR M. 2015. The Weibull-Lomax distribution: properties and applications. Hacettepe Journal of Mathematics and Statistics 44: 455-474.
  • TAHIR MH, CORDEIRO GM, ALZAATREH A, MANSOOR M & ZUBAIR M. 2016a. A New Weibull-Pareto Distribution: Properties and Applications. Communications in Statistics - Simulation and Computation 45: 3548-3567.
  • TAHIR MH, CORDEIRO GM, MANSOOR M, ZUBAIR M & ALIZADEH M. 2016b. The Weibull-Dagum Distribution: Properties and Applications. Communications in Statistics - Theory and Methods 45: 3548-3567.
  • TAHIR MH, ZUBAIR M, MANSOOR M, CORDEIRO GM, ALIZADEH M & HAMEDANI GG. 2016c. A new Weibull-G family of distributions. Hacettepe Journal of Mathematics and Statistics 45: 629-647.
  • TANAK AK, BORZADARAN GM & AHMADI J. 2015. Entropy maximization under the constraints on the generalized Gini index and its application in modeling income distributions. Physica A 438: 657-666.
  • XU K, XIE M, TANG LC & HO SL. 2003. Application of neural networks in forecasting engine systems reliability. Appl Soft Comput 2: 255-268.
  • ZIMMER WJ, KEATS JB & WANG FK. 1998. The Burr XII distribution in reliability analysis. J Qual Technol 30: 386-394.

Appendix A

A.1. First data set Table A.I Turbochargers failure time data.
1.60 3.50 4.80 5.40 6.00 6.50 7.30 7.70 8.10 8.50
2.00 3.90 5.00 5.60 6.10 6.70 7.30 7.80 8.30 8.70
2.60 4.50 5.10 5.80 6.30 7.00 7.30 7.90 8.40 8.80
3.00 4.60 5.30 6.00 6.50 7.10 7.70 8.00 8.40 9.00
A.2. Second data set Table A.II Baseball players salaries data.
30714286 8400000 6500000 4000000 2275000 1050000 530500 518000 511200
34416666 16000000 6750000 4000000 5125000 1050000 530000 518000 511000
31000000 18000000 6725000 4000000 3000000 7081428 530000 517800 511000
29200000 12250000 7333333 4000000 2750000 1000000 530000 517700 511000
25714285 11666667 6575000 4000000 2175000 1000000 530000 517500 510500
25000000 25000000 8666666 4000000 2150000 1000000 530000 517500 510500
25000000 11333333 8750000 3750000 1800000 1000000 529600 517500 510500
24000000 10000000 6500000 3900000 2100000 1000000 529000 517500 510500
25833333 14325000 6250000 3900000 2100000 1000000 529000 517500 510200
25000000 13000000 6500000 3900000 2075000 1000000 528700 517300 510200
24400000 11500000 6425000 3900000 9083333 1000000 528600 517246 510120
24000000 10357142 5543750 3800000 3625000 1000000 528200 517000 510000
23777778 10000000 5500000 3750000 2500000 1000000 528000 517000 510000
25000000 11500000 6250000 3750000 2000000 1000000 527600 516700 510000
22500000 11000000 6250000 3583333 2000000 1000000 527500 516700 510000
23000000 10500000 5487500 3700000 2000000 1000000 527500 516650 510000
22000000 10000000 6225000 3125000 2000000 1000000 527000 516500 510000
24000000 11000000 6200000 3300000 2000000 1000000 527000 516500 510000
30000000 10976096 6170000 3333333 2000000 987500 527000 516500 510000
22125000 10936574 8750000 4000000 2000000 975000 526400 516100 510000
22142857 18000000 6125000 5125000 2000000 975000 526014 516100 510000
17666666 10700000 6125000 3500000 2000000 950000 525500 516100 509700
18000000 10650000 12500000 3500000 2000000 925000 525500 516000 509700
20000000 10550000 8285714 3500000 2000000 2666666 525500 516000 509675
23000000 14000000 6416666 5400000 2000000 900000 525300 515900 509600
21857142 10400000 6000000 2900000 2000000 900000 525270 515900 509500
27500000 9333333 6000000 3450000 2000000 900000 525000 515800 509500
22000000 12000000 6000000 3400000 2000000 897500 525000 515750 509500
18750000 5700000 6000000 3400000 2000000 895000 525000 515500 509500
21250000 10000000 7750000 3375000 1925000 850000 525000 515400 509500
18555555 7000000 6400000 5166666 1825000 850000 524900 515000 509500
20285714 9650000 5750000 3300000 1800000 810000 524525 515000 509500
15500000 9625000 5731704 3300000 1750000 807500 524500 515000 509500
17000000 10000000 5700000 3275000 1725000 800000 524500 515000 509500
20625000 10000000 7150000 10000000 1700000 800000 524500 515000 509500
22500000 11325000 5600000 3857142 1697500 800000 524100 515000 509500
15775000 11600000 5600000 3200000 1650000 765000 524100 515000 509500
18571429 6500000 5500000 3125000 1650000 725000 524000 515000 509300
23000000 9150000 5250000 3125000 1600000 660000 523900 514875 509200
19500000 25000000 5000000 3150000 1600000 652000 523700 514500 509000
17250000 9000000 4250000 3125000 1600000 650000 523500 514500 508900
11538461 5103900 5500000 3100000 2100000 650000 523500 514500 508800
18000000 8000000 4200000 3025000 1550000 625000 523400 514500 508800
22000000 10000000 6000000 5000000 1525000 607000 523000 514400 508750
17500000 9000000 5350000 3900000 1875000 606000 522900 514400 508600
17000000 8500000 5312000 3500000 1750000 600000 522700 514250 508600
17000000 7500000 5300000 3000000 1500000 600000 522500 514200 508500
18000000 11000000 5857142 3000000 1500000 587500 522500 514000 508500
19000000 8750000 6825000 3000000 1500000 575000 522500 514000 508500
21666666 8000000 5250000 3000000 1500000 575000 522400 513900 508500
11428571 8666666 1312500 3000000 1500000 574000 522300 513900 508500
17142857 4200380 5250000 3000000 1500000 570000 522000 513800 508500
17000000 8375000 2800000 2975000 1500000 570000 521800 513600 508500
16000000 9250000 5100000 2950000 1500000 566000 521700 513308 508500
14250000 6950000 7000000 2925000 1500000 563750 521600 513300 508500
24083333 6000000 10333333 2925000 1500000 556000 521300 513000 508500
15000000 8250000 5000000 4250000 1500000 550000 521300 513000 508500
16000000 7833333 4333333 2900000 1490314 550000 521200 513000 508450
16000000 14285714 5000000 2875000 1475000 550000 521100 513000 508200
15800000 12500000 3750000 3500000 1475000 548000 521000 512900 508200
15800000 10000000 5000000 2800000 1475000 547500 521000 512500 508000
15800000 9166666 5000000 2800000 1450000 546500 521000 512500 508000
13000000 7000000 5000000 2800000 1400000 546250 520700 512500 508000
14500000 9000000 5000000 2800000 1400000 545000 520500 512500 508000
15000000 8000000 5000000 2800000 1400000 545000 520500 512500 507500
16400000 8000000 4800000 4700000 1387500 545000 520500 512500 507500
15000000 8000000 4800000 3000000 1375000 543400 520300 512500 507500
12000000 8000000 4750000 2750000 1500000 542604 520200 512500 507500
16000000 8000000 7700000 2725000 1350000 542500 520200 512500 507500
14250000 8000000 5500000 2700000 1350000 541000 520000 512500 507500
15000000 8000000 5000000 2650000 3466666 540300 520000 512500 507500
16666666 8571428 4500000 2625000 1300000 540000 520000 512500 507500
15000000 6000000 4500000 2600000 1300000 539500 520000 512500 507500
12000000 7562500 4400000 2600000 1275000 539000 520000 512500 507500
14000000 14000000 4350000 2600000 1275000 539000 520000 512500 507500
14000000 11416666 4325000 2600000 1255000 539000 520000 512500 507500
13000000 7000000 4300000 3833333 5100000 537500 520000 512500 507500
13750000 7666666 5750000 2075000 1250000 537500 520000 512500 507500
13000000 7333333 4250000 2525000 1250000 537500 520000 512500 507500
8583333 7500000 4250000 4710739 1250000 536500 520000 512500 507500
12083333 6500000 4250000 2750000 1250000 536200 519700 512100 507500
11000000 6250000 4225000 2500000 1250000 535375 519500 512000 507500
13000000 7250000 4200000 2500000 1250000 535000 519500 512000 507500
13000000 7250000 4200000 2500000 1250000 535000 519400 512000 507500
12500000 7250000 4150000 2500000 1225000 535000 519300 511900 507500
13750000 6000000 4150000 2500000 4600000 535000 519200 511750 507500
13600000 7150000 4125000 2500000 1200000 535000 519100 511500 507500
8500000 8333333 2200000 2500000 1185000 534900 519000 511500 507500
16000000 6500000 4100000 2500000 1162500 533400 518500 511500 507500
13250000 7000000 4100000 2500000 1150000 532900 518500 511500 507500
12000000 7000000 10416666 2500000 975000 532500 518425 511500 507500
12500000 7000000 5918483 2425000 1100000 532500 518200 511400 507500
16875000 7000000 5000000 2400000 1065000 532500 518100 511360 507500
12500000 9250000 5000000 2375000 1050000 532000 518000 511250 507500
13333333 6166666 4000000 2350000 1050000 532000 518000 511200 507500
507500 507500 507500 507500 507500 507500 507500

Publication Dates

  • Publication in this collection
    04 June 2021
  • Date of issue
    2021

History

  • Received
    19 Aug 2019
  • Accepted
    8 Feb 2020
Academia Brasileira de Ciências Rua Anfilófio de Carvalho, 29, 3º andar, 20030-060 Rio de Janeiro RJ Brasil, Tel: +55 21 3907-8100 - Rio de Janeiro - RJ - Brazil
E-mail: aabc@abc.org.br