The Weibull Burr XII distribution in lifetime and income analysis

We study a five-parameter model called theWeibull Burr XII (WBXII) distribution, which extends several models, including new ones. This model is quite flexible in terms of the hazard function, which exhibits increasing, decreasing, upside-down bathtub, and bathtub shapes. Its density function allows different forms such as left-skewed, right-skewed, reversed-J, and bimodal. We aim to provide some general mathematical quantities for the proposed distribution, which can be useful to real data analysis. We develop a shiny application to provide interactive illustrations of the WBXII density and hazard functions. We estimate the model parameters using maximum likelihood and derive a profile log-likelihood for all members of the Weibull-G family. The survival analysis application reveals that the WBXII model is suitable to accommodate left-skewed tails, which are very common when the variable of interest is the time to failure of a product. The income application is related to player salaries within a professional sports league and it is peculiar because the mean of the player’s salaries is much higher than for most professions. Both applications illustrate that the new distribution provides much better fits than other models with the same and less number of parameters.


INTRODUCTION
For the Burr XII (BXII) distribution, also known as the Singh-Maddala distribution (Singh & Maddala 1975, 1976 with shape parameters d > 0 and c > 0 and scale parameter s > 0, the cumulative distribution function (cdf) is For d = 1 and s = m -1 , we have the log-logistic (LL) distribution and, for c = 1, it reduces to the Lomax distribution. The probability density function (pdf) corresponding to equation (1) is This distribution is part of the Burr system of distributions (Burr 1942) and has extensive use in the context of income data. For recent examples, see Jäntti & Jenkins (2010), Brzeziński (2013), Tanak et al. (2015). Cirillo (2010) also applied this model for analyzing the size distribution of Italian firms by age.

A NEW WBXII DISTRIBUTION
where α > 0, β > 0, d > 0 and c > 0 are shape parameters and s > 0 is a scale parameter. The corresponding cdf is Based on equation (6) we note that z(x) = [1 + (x/s) c ] d is the inverse of the BXII survival function, which is identifiable. Then, α [z(x) -1] β is identifiable and so the WBXII cdf. Two interpretations of (6) are now presented. First, let T be a BXII random variable with cdf (1) describing a real life phenomenon. If the random variable X represents the odds, the risk that this stochastic mechanism following the lifetime T will not occur at time x is given by G(x; c, d, s)/[1 -G(x; c, d, s)]. If we model the randomness X of these odds by the Weibull density with scale parameter α > 0 and shape parameter β > 0, the cdf of X is given by (6). For the second interpretation, we take a WBXII random variable X and a random variable T with the Weibull density (for t > 0) defined above. We can write P(X ≤ x) = F(x) = 1 -exp -α [(1 + (x/s) c ) d -1] β = P (T ≤ G(x; c, d, s)/[1 -G(x; c, d, s)]). Since the function κ(x) = G(x; c, d, s)/ [1 -G(x; c, d, s)] is always monotonic and non-decreasing, we obtain T = κ(X), where the equality of random variables refers to equivalence of distributions. So, if X has the WBXII cdf (6), then T = κ(X) has a Weibull cdf with the above parameters.
If X is a random variable with density function (5), we write X ∼ WBXII (c, d, s, α, β). The hazard rate function (hrf) of X reduces to The main contributions of this paper are described below: 1. In the estimation section, we demonstrate that all members of the Weibull-G family present a semi-closed form for the maximum likelihood estimator (MLE) of α. Thus, the MLEs for any member of the Weibull-G family can also be determined from the profile log-likelihood function, which is much simpler.
2. Bourguignon et al. (2014) obtained general mathematical expressions for the Weibull-G family based on an infinite linear combination of exponentiated-G (exp-G) densities. We derive a new linear representation for the WBXII pdf in a simpler form based directly on the BXII model itself. Besides, we provide some important mathematical and statistical properties of the proposed distribution. These results are especially helpful for applications to real lifetime data.
3. Equation (5) has different forms, including left-skewed, right-skewed, reversed-J, decreasing-increasing-decreasing and bimodal. Plots of the WBXII density function for selected parameter values are displayed in Figure 1. We develop a shiny application that allows the reader to access dynamic plots of the WBXII pdf and hrf 1 . From those plots, we note that the WBXII density presents bimodality, or the unusual decreasing-increasing-decreasing shape when β is very small, and the baseline shape parameters c and d are large.
4. The proposed distribution overcomes a limitation of its baseline, whose hazard function presents only monotonic and unimodal shapes. The WBXII hrf admits the four main characteristics: decreasing, increasing, upside-down bathtub, and bathtub shaped. These are desirable RENATA ROJAS GUERRA ET AL.

A NEW WBXII DISTRIBUTION
properties for a lifetime distribution. Figure 2 provides plots of the hrf of X for selected parameter values. (5)  8. Although the proposed model has five parameters, it can provide much better fits, based on Anderson-Darling and Kolmogorov-Smirnov statistics, than other models with the same and less number of parameters. This fact is proved empirically in applications to survival analysis and income distribution (see the application section). The survival analysis application illustrates the WBXII superiority to accommodate left-skewed tails, which are very common when the variable of interest is the time to failure of a product or component. The second data set is related to player salaries within a professional sports league. The salary distribution is typically heavy skewed to the right for several professions and also in the macro-economy. For the sports market, the salary distribution is no different from other occupations in terms of the shape. However, according to Rockerbie (2003), it is peculiar because their mean is much higher, and those markets are typically a natural monopoly. The WBXII model showed suitable to accommodate these feature as well.

Equation
An implementation in R language (R Core Team 2018) used to obtain plots, application and simulation for the five-parameter Weibull Burr XII distribution is available in the footnote 2 .
The remainder of the paper is organized as follows. Two useful expansions for the WBXII cdf and pdf are derived. We investigate some of its mathematical properties such as the quantile function (qf), ordinary and incomplete moments, mean deviations, and generating function. We determine the order statistics. The maximum likelihood method is used to estimate the model parameters . Two applications to real data sets are addressed in the application section. Finally, we offer some concluding remarks.

MATHEMATICAL PROPERTIES
In this section, we obtain some mathematical quantities of the WBXII distribution including quantile and random number generation, ordinary and incomplete moments, moment generating function (mgf), mean deviations and Bonferroni and Lorenz curves. By determining analytical expressions for those quantities can be more efficient than computing them directly by numerical integration of the density function (5).

Density and hazard shapes
The WBXII density and hazard functions are quite flexible as can be noted in Figures 1 and 2. They can take various forms depending on the shape parameters α, β, c, and d. In this section, we illustrate the exact behavior of these functions for some parameter sets by analyzing their limiting behavior and derivatives of their logarithms with respect to x. In addition, we provide interactive plots that allow observing the behavior of these functions for several parameter combinations. For the pdf (5), we note that The critical points of the WBXII pdf are the roots of the above equation, and numerical software is required to solve it. Nevertheless, from (12), we can verify that the WBXII density is decreasing if β ≤ 1, c ≤ 1, and d ≤ 1.
RENATA ROJAS GUERRA ET AL.

A NEW WBXII DISTRIBUTION
We can analyze the limiting behavior of the WBXII hrf for some parameter sets. We verify that From the last equation, we can note that: i) if β < 1, c < 1 and d < 1, then dlog h(x)/dx < 0 and the hrf is decreasing; ii) if β = c = d = 1, then dlog h(x)/dx = 0 and the hrf is constant in α/s; and iii) if β > 1, c > 1 and d > 1, then dlog h(x)/dx > 0 and the hrf is increasing; iv) the parameter α does not affect the hrf shapes; and v) numerical softwares are required to obtain the root of this equation.

Quantile function and random number generation
The qf of X follows by inverting (6) as By setting u = 0.5 in (13) gives the median M of X. Different quantiles of interest can also be obtained from (13) by replacing appropriate values for u. If U is a uniform variate on the unit interval (0, 1), then the random variable X = Q(U) has pdf given by (5). Thus, the qf can be useful to generate observations from the WBXII distribution using the inverse transformation (see Section SIMULATION STUDY for an example). Another motivation to introduce this quantity is its applicability to obtain alternative expressions for the skewness and kurtosis. The Bowley's skewness (Kenney & Keeping 1962) based on quartiles is

The Moors' kurtosis (Moors 1988) based on octiles is
These measures are less sensitive to outliers and may exist even for distributions without moments. These quantile-based measures can be obtained for the WBXII model from (13). See the next section for illustrative examples.

Moments
The nth ordinary moment of X can be determined from (9) as where B(a, b) is the beta function. The central moments (μ s ), cumulants (κ s ) and the skewness and kurtosis of X can be evaluated from (14) using well-known relationships. Table II provides a small numerical study by computing the first three moments and the B and K M coefficients for six scenarios, each one with a different parametrization for the WBXII distribution. Figure 3 displays plots of B and K M for some parameter values. In fact, they indicate that the proposed distribution is quite flexible in terms of variation of the moments, skewness and kurtosis. It can accommodate positive and negative values for both skewness and kurtosis coefficients.

Incomplete moments
The hth incomplete moment of X, say T h (y) = y 0 x h f (x)dx, can be expressed as dx.
Hence, the hth incomplete moment of X reduces to (for h < c d) where B z (a, b) = 1 z t a-1 (1 -t) b-1 dt is the upper incomplete beta function. An important application of the first incomplete moment refers to the mean deviations about the mean and the median, namely and A NEW WBXII DISTRIBUTION respectively, where F(μ 1 ) is easily obtained from (6), μ 1 = E(X), the median M of X follows from (13) as M = Q(1/2), and (for c d > 1) T 1 (y) is the first incomplete moment given by (15) with h = 1. An alternative expression for T 1 (·) comes from (9) as dx. Thus, The above results are related to the Bonferroni and Lorenz curves. These curves are important in economics for studying income and poverty but can be useful in demography, reliability, insurance, medicine, and several other fields. For a given probability π, they are defined by B(π) = T 1 (q)/(πμ 1 ) and L(π) = T 1 (q)/μ 1 , respectively, where q = Q(π) is given by (13). If π is the proportion of units whose income is lower than or equal to q, the values of L(π) yield fractions of the total income and B(π) refers to the relative income levels.

Generating function
The mgf of X is defined by M(t) = E(e tX ). Let M d (t) be the mgf of the BXII(c, d, s) distribution. We can write from (9) where M (r+1)d (t) is the BXII(s, (r + 1)d, c) generating function and M d (t) is given by Guerra et al. (2020) considered the following expansion in the above equation where 1 A (x) denotes the indicator function over a given set of real numbers A, i.e., 1 A (x) = 1 if x ∈ A and 1 A (x) = 0 elsewhere. Combining (17) and (18), and after some algebra, Guerra et al. (2020) expressed the BXII mgf as an infinite sum of incomplete gamma functions given by (for t < 0) where γ(a, z) = z 0 t a-1 e -t dt and Γ(a, z) = ∞ z t a-1 e -t dt are the lower and upper incomplete gamma functions, respectively. Hence, for t < 0, the mgf of X can be follows from (16) and (19) as a double summation Equation (20) is the main result of this section.

Stress-strength reliability
Let X 1 be the life of a component with a random strength that is subjected to a random stress X 2 . We can define stress-strength reliability as R = P(X 2 < X 1 ) = ∞ 0 f 1 (x) F 2 (x)dx, i.e., the component fails when the stress applied to it exceeds the strength (X 2 > X 1 ); otherwise, the component will function well. This measure is very useful in reliability.
Let X 1 and X 2 have independent WBXII(c, d 1 , s, α 1 , β 1 ) and WBXII(c, d 2 , s, α 2 , β 2 ) distributions, respectively, with the same shape parameter c and scale parameter s. We can derive R using the results in (9) and (11). Note that the pdf of X 1 and cdf of X 2 can be expressed as (n + 1)d 2 w m w n (m + 1)d 1 + (n + 1)d 2 .

ORDER STATISTICS
Order statistics are important tools in many areas of statistical theory and practice. Let X 1 , · · · , X n be a random sample of the Weibull-G family and X i:n the ith order statistic. The density f i:n (x) of X i:n has the form RENATA ROJAS GUERRA ET AL.

A NEW WBXII DISTRIBUTION
Setting u = 1 + (x/s) c and using the power series in (8), we can rewrite F(x) i+j-1 as Inserting the above expansion in (21) and after some algebra, we obtain

By expanding the exponential function in the last equation, rewriting
considering the power series given by (for |z| < 1 and b > 0 real non-integer) and inserting both expansions in equation (22), we obtain Finally, expanding (1 -u -d ) (l+1)β+m-1 in the previous expression as in (8) and after some algebra, we can write where (for q = 0, 1, . . .) Γ((l + 1)β + 1 + m) Γ((l + 1)β + m) Γ((l + 1)β + 1) Γ((l + 1)β + m -q) and g(x; c, (q+1)d, s) is the BXII density function with scale parameter s and shape parameters (q+1)d and c. Equation (23) is the main result of this section. Based on this linear representation, we can obtain some structural properties of X i:n using a similar procedure as that one applied for the WBXII mathematical properties.

RENATA ROJAS GUERRA ET AL.
A NEW WBXII DISTRIBUTION

MAXIMUM LIKELIHOOD ESTIMATION
The maximum likelihood method is an important technique employed to estimate model parameters in distributions. Bourguignon et al. (2014) determined the MLEs for the Weibull-G parameters from the total log-likelihood function. In this section, we demonstrate alternatively that the MLEs of theWeibull-G family can be determined based on the profile log-likelihood function. We also provide the MLEs for the WBXII distribution.

The Weibull-G profile log-likelihood
Let x 1 , · · · , x n be observed values from the Weibull-G family with parameter vector Θ = (α, β, ξ ξ ξ ) . As shown by Bourguignon et al. (2014), the total log-likelihood function for Θ has the form For fixed β and ξ ξ ξ, a semi-closed MLE for α follows from U α = 0 aŝ By replacing α byα in (24), we obtain the Weibull-G profile log-likelihood for Θ p Θ p Θ p = (β, ξ ξ ξ) as Hence, the MLE Θ p Θ p Θ p of the parameter vector Θ p Θ p Θ p can be numerically found by maximizing (25) and the MLE of α is justα( Θ p Θ p Θ p ). Note that the maximization of the profile log-likelihood might be simpler since it involves one less parameter.

RENATA ROJAS GUERRA ET AL.
A NEW WBXII DISTRIBUTION

The WBXII MLEs
Let x 1 , · · · , x n be a random sample of size n from the WBXII(c, d, s, α, β) distribution. Let Θ Θ Θ = (c, d, s, α, β) T be the parameter vector. The log-likelihood function for Θ Θ Θ follows as The estimates of the model parameters can be obtained by maximizing (26).
Alternatively, we can be differentiating (26) and solving the resulting nonlinear likelihood equations. The components of the score vector U(Θ) U(Θ) U(Θ) are Setting these expressions to zero, U(Θ) U(Θ) U(Θ) = 0, and solving them simultaneously yields the MLEs of the five parameters. These equations cannot be solved analytically, but some statistical softwares can be used to solve them numerically using iterative methods such as the Newton-Raphson type algorithms.
RENATA ROJAS GUERRA ET AL.

A NEW WBXII DISTRIBUTION
For fixed c, d, s and β, the MLE of α iŝ By fixing x 1 , · · · , x n , it is easy to verify from (27) that By replacing α by (27) in equation (26) and letting Θ p Θ p Θ p = (c, d, s, β), the profile log-likelihood function for Θ p Θ p Θ p has the form The components of the score vector U(Θ p ) Solving the equations U(Θ p ) U(Θ p ) U(Θ p ) = 0 simultaneously yields the MLEs of c, d, s and β. The MLE of α is just α(ĉ,d,ŝ,β). The maximization of the profile log-likelihood might be simpler since it involves only four parameters.
For interval estimation of the model parameters, we require the observed information matrix J(Θ) J(Θ) J(Θ), whose elements can be obtained from the authors upon request. Under standard regularity conditions, the approximate confidence intervals for the model parameters can be constructed based on the multivariate normal N 5 (0, J( Θ) J( Θ) J( Θ) -1 ) distribution.
A major advantage of fitting the proposed distribution to a real data set is that we can easily verify, based on the likelihood ratio (LR) statistics, whether any of its sub-models (with fewer parameters) can be preferred to these data.
The maximized (unrestricted and restricted) log-likelihoods are useful to compute LR statistics to verify if WBXII sub-models (with fewer parameters) can be preferred for fitting a given data set. This is a major advantage since the WBXII model extends at least twenty lifetime distributions, including new ones. Let Θ 0 Θ 0 Θ 0 be the restricted parameter vector for a given WBXII sub-model. Thus, hypothesis tests of the type H 0 : Θ Θ Θ = Θ Θ Θ 0 versus H : Θ Θ Θ = Θ Θ Θ 0 can be performed using LR statistics. For example, the LR statistic for testing H 0 : α = β = 1 (versus H : H 0 is not true), thus comparing the WBXII and PGW distributions, is where s, d, c, α, and β are the MLEs under H, c, d, and s are the estimates under H 0 and Θ 0 Θ 0 Θ 0 = (c, d, s, 1, 1) .

RENATA ROJAS GUERRA ET AL.
A NEW WBXII DISTRIBUTION

SIMULATION STUDY
In this section, we evaluate the performance of the MLEs of the parameters of the WBXII distribution. We conduct Monte Carlo simulations based on 10,000 replications under five different parameter combinations and sample size n =100, 250 and 500. The simulation study is performed using the optim subroutine and SANN algorithm in R software for maximizing the log-likelihood in (26). Table III reports the empirical mean estimates and corresponding root mean squared errors (RMSEs). For all parameter combinations, we note that the empirical biases and RMSEs decrease when the sample size increases in agreement with the first-order asymptotic theory.

APPLICATIONS
In this section, we illustrate the usefulness of the WBXII distribution for modeling income and lifetime data. The first data set represents the times to failure (10 3 h) of 40 suits of turbochargers in one type of diesel engine (Xu et al. 2003). These data were previously considered by Benkhelifa (2016). The second data set consists in annual salaries of 862 professional baseball players of the Major League Baseball for the season 2016. The data are measured in American dollars and are available for download at https://www.usatoday.com/sports/mlb/salaries/2016/player/all/. Both data sets are available in the appendix.
RENATA ROJAS GUERRA ET AL.

A NEW WBXII DISTRIBUTION
We use these two data sets to compare the fits of the WBXII distribution with other six related models, i.e., the beta Burr XII (BBXII), Kumaraswamy Burr XII (KwBXII), BXII, LL, PGW, and W distributions. The seven competitive models are defined as follows. The BBXII pdf is where α > 0 and β > 0 are shape parameters; the KwBXII density is where α > 0 and β > 0 are shape parameters; the BXII density is given by equation (2); the LL is defined by taking s = m -1 and d = 1 in (2); and the PGW and W distributions are sub-models of (5) by taking α = β = 1 and α = β = d = 1, respectively. In each case, the parameters are estimated by maximum likelihood using the AdequacyModel script in the R software (Marinho et al. 2016). We report the MLEs and their corresponding standard errors. We present the following goodness-of-fit statistics: the Akaike information criteria (AIC), consistent Akaike information criteria (CAIC), Hannan-Quinn information criteria (HQIC), corrected Anderson-Darling statistic (A * ) (Chen & Balakrishnan 1995) and Kolmogorov-Smirnov (KS) statistic.
The lower values of these statistics are associated with better fits. We also compute the LR statistics for testing WBXII sub-models. Table IV provides some descriptive statistics of the turbochargers failure time data. Note that these data present negative skewness (S) and kurtosis (K) and have an amplitude of 7.4. We also have close values for the mean and median. This descriptive summary indicates that the turbochargers data follow a power-law distribution with a left-skewed tail.  Table V lists the MLEs for the fitted models to these data and their corresponding standard errors. For all fitted models, the parameter estimates are significant. Table VI gives the goodness-of-fit statistics. The WBXII distribution has the lowest values for all statistics. Note that the WBXII is quite competitive with the W distribution. However, the W model may not be an effective alternative for modeling left-skewed data. Table VII provides the LR statistics for the PGW and W fitted models. By considering a significance level of 10%, we may reject both sub-models in favor of the WBXII distribution. It is another clear evidence of the WBXII superiority for modeling these data. Figure 4 RENATA ROJAS GUERRA ET AL.

Turbochargers failure time
A NEW WBXII DISTRIBUTION displays the histogram and the estimated densities with lower values for goodness-of-fit statistics. We note that the WBXII yields a good adjustment to the current data. In fact, the wider model is more accurate than the W distribution for modeling the right tail and is quite competitive with the BBXII distribution. Thus, we can conclude from Figure 4 and Tables VI and VII that the WBXII model provides the best fit to the turbocharges failure time data.

Baseball players salaries
Some descriptive statistics for the baseball players data are provided in Table VIII. These data present positive values for the S and K coefficients, thus indicating right-skew data. We have a high amplitude, variance, and SD. We also note that the mean and median are not so close. This behavior is quite common in income data sets. Table IX provides the MLEs and their standard errors for the seven models fitted to the baseball players data. We have significant estimates for all parameters of these models. Table X lists some goodness-of-fit measures for the fitted models. The WBXII distribution presents the lowest values for all statistics. These results indicate that the WBXII distribution yields a better fit than the other fitted models to the baseball players data. The results for the LR tests are given in Table XI. Clearly, we reject the PGW and W distributions in favor of the wider model. So, there is a strong evidence of the potential need for the extra shape parameters of the WBXII in the second application. Figure 5 displays the fitted WBXII, BBXII and KwBXII densities and the histogram for the baseball players data. They confirm that  the WBXII model yields the best fit. Finally, we can conclude that the WBXII is an effective alternative to modeling lifetime (see the first data set) and income (see the second data set) data, especially when they present power-law tails. It is quite competitive to the classical Weibull distribution and other BXII generalizations.

CONCLUDING REMARKS
The five-parameter Weibull Burr XII distribution is introduced and studied in detail. The proposed model extends at least twenty lifetime distributions, including new ones. Its hazard rate function can be increasing, decreasing, upside-down bathtub, and bathtub-shaped. It is also very flexible in terms of the density function, which has several forms including left-skewed, right-skewed, reversed-J, and bimodal. We emphasize that a shiny application is developed to provide interactive plots and illustrate the behavior of those functions for several parameter combinations. Some mathematical properties of the proposed model are presented, including the ordinary and incomplete moments, quantile and generating functions, mean deviations, stress-strength reliability, and order statistics. We estimate the model parameters using maximum likelihood, present the components of the score vector and the profile log-likelihood. We also derive a general result for the Weibull-G family, which presents a semi-closed form for the maximum likelihood estimator of the parameter α. We provide applications to real lifetime and income data sets. They illustrate the usefulness of the proposed distribution for modeling these kinds of data and also prove empirically that the WBXII distribution is quite competitive to other known Burr XII and Weibull generalizations.