Abstract
The Kumaraswamy distribution is useful for modeling variables whose support is the standard unit interval, i.e., (0, 1). It is not uncommon, however, for the data to contain zeros and/or ones. When that happens, the interest shifts to modeling variables that assume values in [0, 1), (0, 1] or [0, 1]. Our goal in this paper is to introduce inflated Kumaraswamy distributions that can be used to that end. We consider inflation at one of the extremes of the standard unit interval and also the more challenging case in which inflation takes place at both interval endpoints. We introduce inflated Kumaraswamy distributions, discuss their main properties, show how to estimate their parameters (point and interval estimation) and explain how testing inferences can be performed. We also present Monte Carlo evidence on the finite sample performances of point estimation, confidence intervals and hypothesis tests. An empirical application is presented and discussed.
Key words
Inflated distribution; Kumaraswamy distribution; likelihood ratio test; maximum likelihood estimation; score test; Wald test
INTRODUCTION
Oftentimes practitioners need to model variables that assume values in the standard unit interval, (0, 1), such as rates, proportions and concentration indices. The beta distribution is the most commonly used model in such applications, since its density can assume a wide range of shapes depending on the parameter values. Nonetheless, it was noted by Kumaraswamy (1976) that the beta law may fail to fit hydrological data well, especially hydrological observations of small frequency. He then proposed a new distribution, which can be considered an alternative to the well known beta model and is now known as the Kumaraswamy distribution. We say that the random variable X is Kumaraswamy-distributed with shape parameters α and β, denoted by X ~ K(α, β), if its probability density function (pdf) is given by

f(x; α, β) = αβ x^(α−1) (1 − x^α)^(β−1), x ∈ (0, 1),  (1)
the corresponding cumulative distribution function (cdf) being F(x; α, β) = 1 − (1 − x^α)^β. We note that if α = 1, then −log(1 − X) is exponentially distributed with parameter β; likewise, if β = 1, then −log(X) is exponentially distributed with parameter α.
The Kumaraswamy model has received considerable attention in the recent literature. Carrasco et al. (2010) proposed a new five-parameter distribution that generalizes the beta and Kumaraswamy distributions. Lemonte (2011) obtained nearly unbiased estimators for the parameters that index the Kumaraswamy law. A method for distinguishing between the Kumaraswamy and beta models was proposed by Silva and Barreto-Souza (2014). Barreto-Souza and Lemonte (2013) introduced a bivariate Kumaraswamy distribution whose marginal distributions are univariate Kumaraswamy laws.
According to Mitnik and Baek (2013), the Kumaraswamy distribution has an advantage relative to the beta model: its distribution and quantile functions can be expressed in closed form; see also Jones (2009). That renders, for instance, random number generation based on the inversion method an easy task: one only needs to generate a sequence of pseudo-random standard uniform numbers and evaluate the Kumaraswamy quantile function at each value. In contrast, beta random number generation requires acceptance-rejection algorithms, which are more computationally intensive. Additionally, the Kumaraswamy density can assume many different shapes depending on the parameter values, which makes the corresponding law quite flexible for representing rates and proportions. Finally, Wang et al. (2017) note that the Kumaraswamy distribution is particularly useful for modeling variables that describe natural and biological phenomena restricted to the standard unit interval.
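To illustrate, here is a minimal Python sketch (the function names are ours, not from the paper) of inversion-based Kumaraswamy sampling; it also evaluates the known first moment, βB(1 + 1/α, β), so the simulated mean can be checked against it:

```python
import numpy as np
from math import gamma

def kumaraswamy_rvs(alpha, beta, size, seed=None):
    """Inversion method: the Kumaraswamy cdf F(x) = 1 - (1 - x**alpha)**beta
    has the closed-form quantile Q(u) = (1 - (1 - u)**(1/beta))**(1/alpha)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=size)
    return (1.0 - (1.0 - u) ** (1.0 / beta)) ** (1.0 / alpha)

def kuma_mean(alpha, beta):
    """Theoretical first moment: beta * B(1 + 1/alpha, beta)."""
    t = 1.0 + 1.0 / alpha
    return beta * gamma(t) * gamma(beta) / gamma(t + beta)

x = kumaraswamy_rvs(2.0, 3.0, size=200_000, seed=42)
```

With α = 2 and β = 3, the theoretical mean is βB(3/2, 3) ≈ 0.4571, and the sample mean of a large inversion sample agrees with it to a couple of decimal places.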
It is not uncommon, however, for the data to contain zeros and/or ones. When that happens, the interest shifts to modeling variables that assume values in [0, 1), (0, 1] or [0, 1]. The Kumaraswamy distribution cannot be used in such cases since, like the beta law, its support is (0, 1). Ospina and Ferrari (2010) introduced the class of inflated beta distributions, which allows for the presence of extreme values in the data. In this paper we develop alternative laws: we introduce the class of inflated Kumaraswamy distributions. We consider inflation at one of the endpoints of the standard unit interval and also the more challenging case where inflation takes place at both zero and one; that is, we first consider variables whose supports are [0, 1) and (0, 1], and then the double inflation case, i.e., variables that assume values in [0, 1]. Such distributions are obtained by combining the Kumaraswamy distribution (continuous component) with one or two degenerate distributions (discrete component).
The paper unfolds as follows. The next section presents the zero or one inflated Kumaraswamy distribution, in which inflation takes place at a single point; point and interval estimation are also discussed. In the following section, we go further and introduce the zero and one inflated Kumaraswamy distribution (double inflation) and show how to perform point and interval estimation for it. Next, we focus on hypothesis testing inference. Finally, we present and discuss: (i) Monte Carlo simulation evidence and (ii) an empirical application.
THE ZERO OR ONE INFLATED KUMARASWAMY DISTRIBUTION
Data on rates and proportions may contain zeros and/or ones. When that happens, the underlying data generating process contains a discrete component that causes a given value, or a couple of specific values, to be observed with positive probability. It is thus necessary to combine continuous and discrete data generating mechanisms into a more general law. In what follows, we shall focus on random variables that assume values in (0, 1) but that can also equal c with positive probability, where c = 0 or c = 1. We say there is data inflation at one of the standard unit interval endpoints.
We introduce the inflated Kumaraswamy distribution at c, denoted IK_c, whose cdf is given by

IK_c(y; λ, α, β) = λ 1{y ≥ c} + (1 − λ) F(y; α, β),  (2)

where 1{y ≥ c} is an indicator function that equals one when y ≥ c and zero when y < c, and λ ∈ (0, 1) is the mixture parameter. Notice that, with probability 1 − λ, Y follows the Kumaraswamy distribution with parameters α and β and, with probability λ, it follows a degenerate distribution at c.
Let Y be a random variable with cdf given by (2), which we denote by Y ~ IK_c(λ, α, β). Its pdf is given by

f(y; λ, α, β) = λ, if y = c, and f(y; λ, α, β) = (1 − λ) f(y; α, β), if y ∈ (0, 1),  (3)

where α and β are the parameters that index the Kumaraswamy distribution and f(y; α, β) is the density given in (1). Note that c = 0 or c = 1.
Figure 1 shows different Kumaraswamy densities inflated at c = 0 and at c = 1, for different values of α and β and a fixed value of the mixture parameter λ. Note that the probability density function of the inflated Kumaraswamy distribution at c given in (3) may assume a wide variety of shapes; e.g., it can be U-shaped, increasing, decreasing, asymmetric to the left, asymmetric to the right, bell-shaped, and even constant.
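The mixture structure just described translates directly into a sampler. The sketch below (Python/NumPy; illustrative function name) returns the point c with probability λ and an inversion-generated Kumaraswamy draw otherwise:

```python
import numpy as np

def ikc_rvs(c, lam, alpha, beta, size, seed=None):
    """Sample from the inflated Kumaraswamy IK_c(lam, alpha, beta), c in {0, 1}:
    a mixture of a point mass at c (weight lam) and a
    Kumaraswamy(alpha, beta) law (weight 1 - lam)."""
    rng = np.random.default_rng(seed)
    mix = rng.uniform(size=size) < lam                      # discrete component?
    u = rng.uniform(size=size)
    kw = (1.0 - (1.0 - u) ** (1.0 / beta)) ** (1.0 / alpha)  # inversion draw
    return np.where(mix, float(c), kw)

y = ikc_rvs(c=1, lam=0.1, alpha=2.0, beta=3.0, size=100_000, seed=7)
```

In a large sample the proportion of observations equal to c estimates λ, which is exactly the closed-form maximum likelihood estimator discussed below.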
The kth moment of Y is

E(Y^k) = λ c + (1 − λ) μ_k,

where μ_k = β B(1 + k/α, β) = β Γ(1 + k/α) Γ(β) / Γ(1 + k/α + β) is the kth moment of the Kumaraswamy distribution, Γ(·) denoting the gamma function (note that c^k = c, since c ∈ {0, 1}). In particular, the mean and variance of Y are

E(Y) = λ c + (1 − λ) μ_1 and Var(Y) = λ c + (1 − λ) μ_2 − [λ c + (1 − λ) μ_1]^2,

where B(·, ·) is the beta function.
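Under the assumption that the moment expressions take the form above, they can be coded directly; the helpers below (hypothetical names) evaluate μ_k via gamma functions and reduce to the plain Kumaraswamy moments when λ = 0:

```python
from math import gamma

def kuma_moment(k, alpha, beta):
    """k-th moment of Kumaraswamy(alpha, beta): beta * B(1 + k/alpha, beta)."""
    t = 1.0 + k / alpha
    return beta * gamma(t) * gamma(beta) / gamma(t + beta)

def ikc_mean_var(c, lam, alpha, beta):
    """Mean and variance of Y ~ IK_c; uses c**k = c for c in {0, 1}."""
    m1 = lam * c + (1.0 - lam) * kuma_moment(1, alpha, beta)
    m2 = lam * c + (1.0 - lam) * kuma_moment(2, alpha, beta)
    return m1, m2 - m1 ** 2
```

Since K(1, 1) is the standard uniform law, ikc_mean_var(0, 0.0, 1.0, 1.0) returns (0.5, 1/12), as expected.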
It is noteworthy that the density function presented in (3) can be written as

f(y; λ, α, β) = [λ^{1{y = c}} (1 − λ)^{1 − 1{y = c}}] [f(y; α, β)]^{1 − 1{y = c}}.  (4)

The density in (4) is expressed as the product of two terms: the first term only depends on λ, whereas the second term only involves α and β.
The likelihood function for ϑ = (λ, α, β)ᵀ based on y = (y_1, …, y_n)ᵀ, an IK_c random sample, is

L(ϑ; y) = L_1(λ; y) L_2(α, β; y),

where

L_1(λ; y) = λ^T (1 − λ)^{n − T} and L_2(α, β; y) = ∏_{i: y_i ∈ (0,1)} f(y_i; α, β),

with T = ∑_{i=1}^{n} 1{y_i = c} denoting the number of observations that equal c. The zero or one inflated Kumaraswamy log-likelihood function is then given by

ℓ(ϑ; y) = ℓ_1(λ; y) + ℓ_2(α, β; y),

where

ℓ_1(λ; y) = T log λ + (n − T) log(1 − λ) and ℓ_2(α, β; y) = ∑_{i: y_i ∈ (0,1)} [log α + log β + (α − 1) log y_i + (β − 1) log(1 − y_i^α)].

The score function, which is obtained by differentiating the log-likelihood function, is denoted by U(ϑ) = (U_λ, U_α, U_β)ᵀ, where

U_λ = T/λ − (n − T)/(1 − λ),
U_α = ∑_{i: y_i ∈ (0,1)} [1/α + log y_i − (β − 1) y_i^α log y_i / (1 − y_i^α)],
U_β = ∑_{i: y_i ∈ (0,1)} [1/β + log(1 − y_i^α)].
The maximum likelihood estimator (mle) of λ is the proportion of sample values that equal c. The maximum likelihood estimators of α and β cannot be expressed in closed form. They can be obtained, however, by numerically maximizing the log-likelihood function using a nonlinear optimization method, such as a Newton or quasi-Newton method. The BFGS quasi-Newton method is commonly used for numerically maximizing log-likelihood functions; for details on such a method, see Nocedal and Wright (2006) and Press et al. (1992).
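A sketch of this estimation scheme in Python with SciPy (the function name and the log-parameterization are our choices, not the authors'): the mle of λ is the sample proportion of values equal to c, while (α, β) are obtained by BFGS maximization of the continuous-part log-likelihood.

```python
import numpy as np
from scipy.optimize import minimize

def fit_ikc(y, c):
    """MLE for IK_c: closed-form lam_hat plus numerical (alpha, beta).
    Optimizing over log(alpha), log(beta) keeps both parameters positive."""
    y = np.asarray(y, dtype=float)
    cont = y[(y > 0.0) & (y < 1.0)]            # continuous observations
    lam_hat = float(np.mean(y == c))           # proportion equal to c

    def negloglik(theta):
        a, b = np.exp(theta)
        return -np.sum(np.log(a) + np.log(b) + (a - 1.0) * np.log(cont)
                       + (b - 1.0) * np.log1p(-cont ** a))

    res = minimize(negloglik, x0=np.zeros(2), method="BFGS")
    a_hat, b_hat = np.exp(res.x)
    return lam_hat, a_hat, b_hat
```

On simulated IK_1 data with known parameter values, the three estimates recover the truth up to sampling error.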
The Fisher information matrix for the zero or one inflated Kumaraswamy law follows from the factorization in (4): since λ is orthogonal to (α, β), it is block diagonal,

K(ϑ) = diag{n/[λ(1 − λ)], K_{(α,β)}},  (5)

where K_{(α,β)} denotes the information block for (α, β), whose entries involve ψ(·), the digamma function, and ψ′(·), the trigamma function.
Let ϑ̂ denote the mle of ϑ. In large samples, ϑ̂ is expected to be approximately normally distributed: ϑ̂ ~ N_3(ϑ, K(ϑ)^{−1}), approximately, where K(ϑ) is the information matrix given in (5). Using such a result, it is possible to construct approximate confidence intervals for the model parameters. Let 0 < γ < 1/2. It follows that asymptotic 100(1 − 2γ)% confidence intervals for λ, α and β are given, respectively, by λ̂ ∓ z_{1−γ} se(λ̂), α̂ ∓ z_{1−γ} se(α̂) and β̂ ∓ z_{1−γ} se(β̂), where se denotes standard error and z_{1−γ} is the 1 − γ standard normal quantile. The standard errors are obtained as square roots of the diagonal elements of the inverse of Fisher's information matrix after the unknown parameters are replaced with the corresponding maximum likelihood estimates.
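For the mixture parameter the interval has a simple closed form, since its Fisher information is n/[λ(1 − λ)]; a hedged sketch (our function name) follows:

```python
import numpy as np
from scipy.stats import norm

def lam_confint(y, c, level=0.95):
    """Asymptotic Wald-type CI for the mixture parameter lam of IK_c.
    Fisher information for lam is n / (lam * (1 - lam)), so
    se(lam_hat) = sqrt(lam_hat * (1 - lam_hat) / n)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    lam_hat = float(np.mean(y == c))
    se = np.sqrt(lam_hat * (1.0 - lam_hat) / n)
    z = norm.ppf(0.5 + level / 2.0)   # e.g. 1.96 for a 95% interval
    return lam_hat - z * se, lam_hat + z * se
```

For a sample of size 100 containing ten ones, the point estimate is 0.10 with standard error 0.03, giving a 95% interval of roughly (0.041, 0.159).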
ZERO AND ONE INFLATED KUMARASWAMY DISTRIBUTION
The distribution introduced in the previous section is not suitable for modeling fractional data that contain both zeros and ones, i.e., when data inflation occurs at both ends of the standard unit interval. In what follows we shall introduce a distribution that can be used to model variables that have support in [0, 1]. We say that the random variable Y follows the zero and one inflated Kumaraswamy distribution, denoted by Y ~ ZOIK(δ, p, α, β), if its cdf is given by

ZOIK(y; δ, p, α, β) = δ G(y; p) + (1 − δ) F(y; α, β),

with 0 < δ < 1, where δ is the mixture parameter and G(y; p) denotes the cumulative distribution function of a Bernoulli random variable with parameter p.
It follows that the pdf of Y is

f(y; δ, p, α, β) = δ p^y (1 − p)^{1−y}, if y ∈ {0, 1}, and f(y; δ, p, α, β) = (1 − δ) f(y; α, β), if y ∈ (0, 1).  (6)

Note that P(Y = 0) = δ(1 − p) and P(Y = 1) = δp. For p = 0 and p = 1, the discrete component degenerates at zero and at one, respectively, and the single inflation cases are recovered.
Figure 2 presents several ZOIK densities for different values of δ, p, α and β. Notice the many different shapes that the density can assume. The ZOIK distribution is thus a very flexible law for variables that assume values in the standard unit interval with inflation at both interval limits.
Let Y be a zero and one inflated Kumaraswamy random variable. Its kth moment is E(Y^k) = δp + (1 − δ) μ_k, k = 1, 2, …. Hence,

E(Y) = δp + (1 − δ) μ_1 and Var(Y) = δp + (1 − δ) μ_2 − [δp + (1 − δ) μ_1]^2,

where μ_1 and μ_2 are the first and second Kumaraswamy moments, respectively.
Consider the zero and one inflated Kumaraswamy density given in (6). It is possible to write it as

f(y; δ, p, α, β) = [δ^{1{y ∈ {0,1}}} (1 − δ)^{1 − 1{y ∈ {0,1}}}] [p^y (1 − p)^{1−y}]^{1{y ∈ {0,1}}} [f(y; α, β)]^{1 − 1{y ∈ {0,1}}},  (7)

where now 1{y ∈ {0, 1}} is the indicator function that equals one if y ∈ {0, 1} and equals zero if y ∈ (0, 1). The pdf in (7) factors into three terms: the first term only depends on δ, the second term only depends on p and the third term involves α and β.
The likelihood function for ϑ = (δ, p, α, β)ᵀ based on the random sample y = (y_1, …, y_n)ᵀ is

L(ϑ; y) = L_1(δ; y) L_2(p; y) L_3(α, β; y),

where, with T = ∑_{i=1}^{n} 1{y_i ∈ {0, 1}} denoting the number of discrete observations and S = ∑_{i=1}^{n} 1{y_i = 1} the number of ones,

L_1(δ; y) = δ^T (1 − δ)^{n − T}, L_2(p; y) = p^S (1 − p)^{T − S} and L_3(α, β; y) = ∏_{i: y_i ∈ (0,1)} f(y_i; α, β).

The corresponding log-likelihood function can be expressed as

ℓ(ϑ; y) = ℓ_1(δ; y) + ℓ_2(p; y) + ℓ_3(α, β; y),

where

ℓ_1(δ; y) = T log δ + (n − T) log(1 − δ), ℓ_2(p; y) = S log p + (T − S) log(1 − p) and ℓ_3(α, β; y) = ∑_{i: y_i ∈ (0,1)} log f(y_i; α, β).

The score function is given by U(ϑ) = (U_δ, U_p, U_α, U_β)ᵀ, where

U_δ = T/δ − (n − T)/(1 − δ) and U_p = S/p − (T − S)/(1 − p),

U_α and U_β being as in the single inflation case.
The maximum likelihood estimators of δ and p are, respectively, the proportion of discrete values (zeros and ones) in the sample and the proportion of those discrete values that equal one.
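These two closed-form estimators amount to simple sample proportions; a short sketch (illustrative function name):

```python
import numpy as np

def zoik_discrete_mles(y):
    """Closed-form ZOIK MLEs: delta_hat is the proportion of observations
    in {0, 1}; p_hat is the proportion of those discrete observations
    that equal one."""
    y = np.asarray(y, dtype=float)
    disc = (y == 0.0) | (y == 1.0)
    delta_hat = float(disc.mean())
    p_hat = float((y[disc] == 1.0).mean()) if disc.any() else float("nan")
    return delta_hat, p_hat
```

For example, a sample with one zero, two ones and three interior values yields delta_hat = 1/2 and p_hat = 2/3.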
The Fisher information matrix for the zero and one inflated Kumaraswamy distribution is block diagonal, since the likelihood factors as in (7):

K(ϑ) = diag{n/[δ(1 − δ)], nδ/[p(1 − p)], K_{(α,β)}},  (8)

where K_{(α,β)} denotes the information block for (α, β), whose entries again involve the digamma and trigamma functions.
As before, approximate confidence intervals can be constructed based on the asymptotic normality of ϑ̂, the mle of ϑ. In large samples, it is expected that ϑ̂ ~ N_4(ϑ, K(ϑ)^{−1}), approximately, where K(ϑ) is the information matrix given in (8). Using such a limiting distribution, it is possible to construct asymptotic confidence intervals for δ, p, α and β. For 0 < γ < 1/2, the asymptotic 100(1 − 2γ)% confidence intervals for such parameters are given, respectively, by δ̂ ∓ z_{1−γ} se(δ̂), p̂ ∓ z_{1−γ} se(p̂), α̂ ∓ z_{1−γ} se(α̂) and β̂ ∓ z_{1−γ} se(β̂).
HYPOTHESIS TESTING INFERENCE
The asymptotic normality of the maximum likelihood estimator can also be used to construct hypothesis tests. Suppose the interest lies in making testing inferences on a subset of parameters. Let ϑ = (ψᵀ, ωᵀ)ᵀ, where ψ is a q × 1 vector of parameters of interest and ω is a vector of nuisance parameters. We wish to test the null hypothesis H_0: ψ = ψ^{(0)} against the alternative hypothesis H_1: ψ ≠ ψ^{(0)}. The inference can be based on the following criteria: likelihood ratio, Wald and score. For details on these tests, see Buse (1992), Cox and Hinkley (1979, Chapter 9) and Welsh (1996, Section 4.5).
Let ϑ̂ be the unrestricted maximum likelihood estimator of ϑ and let ϑ̃ be the restricted maximum likelihood estimator of ϑ, which is obtained by imposing H_0. The likelihood ratio test statistic is given by

W_LR = 2[ℓ(ϑ̂) − ℓ(ϑ̃)],

the Wald test statistic can be written as

W_W = (ψ̂ − ψ^{(0)})ᵀ [K^{ψψ}(ϑ̂)]^{−1} (ψ̂ − ψ^{(0)}),

and the score test statistic is

W_S = U_ψ(ϑ̃)ᵀ K^{ψψ}(ϑ̃) U_ψ(ϑ̃),

where K^{ψψ}(ϑ̂) is the block of Fisher's information matrix inverse that corresponds to ψ evaluated at ϑ̂, U_ψ denotes the vector that contains the elements of the score function corresponding to the parameters of interest and K^{ψψ}(ϑ̃) is the block of Fisher's information matrix inverse that corresponds to ψ evaluated at ϑ̃.
Notice that the likelihood ratio statistic requires both unrestricted and restricted parameter estimation. In contrast, the Wald statistic only requires unrestricted estimation, and the score statistic only requires restricted estimation.
Under H_0 and under some regularity conditions outlined by Serfling (1980), the three test statistics converge in distribution to χ²_q, where q is the number of restrictions being tested; they thus share the same asymptotic null distribution. The tests are typically carried out using asymptotic (i.e., approximate) critical values. The null hypothesis is rejected at significance level η if the selected criterion exceeds χ²_{q, 1−η}, the 1 − η quantile of the χ²_q distribution.
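To make the recipe concrete, the sketch below (an illustrative example, not the authors' code) performs a likelihood ratio test of H0: alpha = 1 in a plain Kumaraswamy sample, maximizing the log-likelihood with and without the restriction and referring twice the difference to the chi-squared law with one degree of freedom:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def kuma_loglik(alpha, beta, y):
    """Kumaraswamy(alpha, beta) log-likelihood for interior observations."""
    return np.sum(np.log(alpha) + np.log(beta) + (alpha - 1) * np.log(y)
                  + (beta - 1) * np.log1p(-y ** alpha))

rng = np.random.default_rng(123)
u = rng.uniform(size=2000)
y = 1.0 - (1.0 - u) ** (1.0 / 3.0)   # Kumaraswamy(1, 3) sample: H0 is true

# Unrestricted fit over (alpha, beta), on the log scale for positivity
res_u = minimize(lambda t: -kuma_loglik(np.exp(t[0]), np.exp(t[1]), y),
                 x0=np.zeros(2), method="BFGS")
# Restricted fit: alpha fixed at its null value 1, optimize beta only
res_r = minimize(lambda t: -kuma_loglik(1.0, np.exp(t[0]), y),
                 x0=np.zeros(1), method="BFGS")

lr_stat = 2.0 * (res_r.fun - res_u.fun)   # 2 * [l(unrestricted) - l(restricted)]
p_value = chi2.sf(lr_stat, df=1)          # one restriction => chi-squared with 1 df
```

Since the data are generated under the null hypothesis, the statistic is typically small and the unrestricted estimate of alpha is close to one.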
NUMERICAL EVALUATION
In what follows we shall report results of Monte Carlo simulations that were carried out to evaluate the finite sample performances of point estimators, confidence intervals and hypothesis tests. We consider inflation at one and also inflation at both zero and one. The reported results are based on 10,000 replications and were obtained using the Ox matrix programming language; see Cribari-Neto and Zarkos (2003) and Doornik (2009). Log-likelihood maximization was performed using the quasi-Newton BFGS method with analytical first derivatives, which is typically regarded as the best performing method; see Mittelhammer, Judge, and Miller (2000, Section 8.13). The initial values used in the BFGS iterative scheme were arbitrarily selected, being different from the true parameter values. We varied such initial values and noticed that they had little impact on the results.
At the outset we focus on point estimation. Tables I and II contain the relative biases, variances and mean squared errors (MSEs) of the maximum likelihood estimators of the parameters that index the Kumaraswamy distribution with inflation at one and with inflation at zero and one, respectively. Relative bias is computed as the difference between the mean estimate and the true parameter value divided by the latter. We report results for different sample sizes. The mixture parameter (λ) assumes two values: 0.05 and 0.50. The results show that the relative biases, variances and mean squared errors decay as the sample size increases. The results in Table I show that point estimation of λ is less accurate when the true parameter value is small: when λ = 0.05, the relative bias of its estimator can be as large as 8.60%. It is noteworthy that point estimation of β is less accurate than that of λ and α, especially when the value of λ is large (0.50). This seems to be a characteristic of Kumaraswamy maximum likelihood point estimation that carries over to the new class of inflated distributions. Consider, for instance, the numerical evidence reported by Lemonte (2011): except when the value of α is quite small, the maximum likelihood estimator of β is considerably less accurate than that of α, both in terms of bias and mean squared error.
Table I. Relative biases, variances and MSEs of the maximum likelihood estimators of the parameters that index the inflated-at-one Kumaraswamy distribution.
Table II. Relative biases, variances and MSEs of the maximum likelihood estimators of the parameters that index the ZOIK distribution.
Next, we evaluate the accuracy of interval estimation in finite samples. The confidence intervals' empirical coverages and non-coverages are presented in Tables III (single inflation) and IV (double inflation); entries are percentages. The results show that the empirical coverages approach the nominal ones as the sample size increases, and the non-coverages become better balanced as the number of data points grows. For instance, in one of the settings considered, the empirical coverage rates for λ, α and β under single inflation are, respectively, 94.62%, 94.53% and 96.32%; under double inflation, the corresponding coverage rates for δ, p, α and β are 94.27%, 94.77%, 96.50% and 96.63%. Overall, the confidence intervals display reasonably accurate coverages, except the confidence interval for λ when the true parameter value is very small (λ = 0.05, Table III): in that case the empirical coverage was slightly below 86% in the smallest sample size, the corresponding figures improving to 89.76% and 91.28% as the sample size increases.
Table III. Confidence intervals' empirical coverage and non-coverage (to the left; to the right) rates (%), inflated-at-one Kumaraswamy distribution.
Table IV. Confidence intervals' empirical coverage and non-coverage (to the left; to the right) rates (%), ZOIK distribution.
We also carried out simulations to evaluate the finite sample performances of testing inferences based on the likelihood ratio, score and Wald asymptotic chi-squared criteria. The interest lies in testing a restriction on λ for the singly inflated law; for the ZOIK law, we test restrictions on δ and also on p. Data generation was performed under the null hypothesis. The tests' null rejection rates are presented in Tables V (test on λ, singly inflated law), VI (test on δ, ZOIK law) and VII (test on p, ZOIK law). Notice that the empirical null rejection rates converge to the corresponding nominal significance levels as the sample size increases. Overall, the likelihood ratio test is the best performing test, i.e., it is typically the least size-distorted test. For example, at the 5% significance level in Table V, the likelihood ratio null rejection rates in two of the sample sizes considered are 4.44% and 5.38%; the corresponding score figures are 6.35% and 5.38%, and the Wald figures are 7.05% and 5.38%. The null rejection rates of the three tests coincide when the test is on the mixture parameter, even though the test statistics' values are slightly different in each replication. The tests become less accurate when they are used to make inference on p (Table VII), especially when the value of the mixture parameter is small, and become more accurate as it grows. In one of the most challenging settings, the null rejection rates of the likelihood ratio, score and Wald tests are 9.38%, 8.14% and 12.40%.
We have also carried out power simulations, i.e., simulations in which data generation was performed under the alternative hypothesis. For brevity, we shall only report results for the test on p in the ZOIK law. Since no test is very liberal, the tests are performed using asymptotic (chi-squared) critical values. The tests' non-null rejection rates are presented in Table VIII. It is noteworthy that the tests are less powerful when the value of the mixture parameter is large: in one of the scenarios considered, the estimated powers of the likelihood ratio, score and Wald tests are around 98%, whereas in the corresponding scenario with a larger mixture parameter they are around 82%. We also note that the powers of the three tests coincide in some of the configurations considered.
EMPIRICAL APPLICATION
In what follows we shall present an empirical application of the inflated-at-one Kumaraswamy distribution. The variable of interest assumes values in (0, 1]. It is the proportion of inhabitants of each Brazilian municipality that lived in homes with at least one bathroom and piped water in 2010. The data source is the 2013 edition of the Brazilian Atlas of Human Development; http://www.atlasbrasil.org.br/2013. The data contain 73 observations that equal one. Table IX displays some descriptive statistics on the variable of interest. Notice that the data are concentrated near one and that there is left-skewness.
We fitted the inflated Kumaraswamy and inflated beta (BEOI) distributions, both with inflation at one, and obtained the maximum likelihood estimates of the parameters that index each law together with the corresponding standard errors. Again, log-likelihood maximization was performed using the BFGS quasi-Newton method and the Ox matrix programming language. Figure 3 contains the data histogram and the fitted inflated Kumaraswamy density. The fitted BEOI density is not included in the plot because it is very similar to the fitted inflated Kumaraswamy density, as expected given the large sample size.
We performed the Kolmogorov-Smirnov test; for details on such a test, see Pestman (1998, Section 7.4). The interest lies in determining whether the sample at hand came from the postulated distribution. The test was performed for each of the two inflated laws. For the inflated Kumaraswamy and beta laws, the test statistics are, respectively, 0.1896 and 0.1938. Even though the null hypothesis is not rejected for either distribution, the smaller test statistic for the inflated Kumaraswamy law indicates that there is more evidence in its favor relative to the alternative law.
Using the maximum likelihood estimate of the mixture parameter λ (inflated Kumaraswamy law), we constructed the asymptotic 95% confidence interval for such a parameter. The lower interval limit is 0.0102 and the upper limit equals 0.0160.
Finally, we tested three null hypotheses on λ against the corresponding two-sided alternative hypotheses (inflated Kumaraswamy law), the second of them being λ = 0.015; the respective likelihood ratio test statistics (p-values in parentheses) are 4.9695 (0.0258), 1.3970 (0.2372) and 15.3039 (0.0001). The corresponding score [Wald] figures are 5.4566 (0.0195) [4.1736 (0.0411)], 1.3381 (0.2474) [1.5274 (0.2165)] and 13.4602 (0.0002) [20.3827 (< 0.0001)]. It is thus clear that the second null hypothesis is not rejected at the usual nominal levels, and one can safely take the value of λ to be 0.015.
CONCLUSIONS
Applied statisticians oftentimes need to model variables that assume values in the standard unit interval, (0, 1); e.g., rates, proportions and income inequality indices. The beta and Kumaraswamy distributions are commonly used with such variables. There are instances, however, in which the variable of interest may display inflation, i.e., it may equal zero and/or one with positive probability. Put differently, it assumes values in [0, 1) (inflation at zero), (0, 1] (inflation at one) or [0, 1] (inflation at both interval limits). In this paper, we introduced inflated Kumaraswamy distributions that can be used as underlying laws for variables that assume values in those intervals. We considered two separate cases, namely: (i) inflation at zero or one and (ii) inflation at zero and one. For both cases, we introduced the appropriate law and also discussed point estimation, interval estimation and hypothesis testing inference. We presented Monte Carlo simulation evidence on the finite sample performances of point estimates, confidence intervals and hypothesis tests. Finally, an empirical application was presented and discussed.
ACKNOWLEDGMENTS
This work was supported by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) under Grant 301651/2017-5. We are also thankful to two anonymous referees whose comments and suggestions led to a much improved manuscript.
REFERENCES
- BARRETO-SOUZA W AND LEMONTE AJ. 2013. Bivariate Kumaraswamy distribution: properties and a new method to generate bivariate classes. Statistics 47(6): 1321-1342.
- BUSE A. 1992. The likelihood ratio, Wald and Lagrange Multiplier tests: an expository note. Amer Statist 36(3): 153-157.
- CARRASCO JMF, FERRARI SLP AND CORDEIRO GM. 2010. A new generalized Kumaraswamy distribution. Technical Report arXiv: 1004.0911v1 [stat.ME]. URL https://arxiv.org/abs/1004.0911v1.
- COX DR AND HINKLEY DV. 1979. Theoretical statistics. New York: Chapman & Hall/CRC.
- CRIBARI-NETO F AND ZARKOS SG. 2003. Econometric and statistical computing using Ox. Comput Econ 21(3): 277-295.
- DOORNIK JA. 2009. An object-oriented matrix programming language Ox 6. London: Timberlake Consultants Press.
- JONES MC. 2009. Kumaraswamy’s distribution: a beta-type distribution with some tractability advantages. Stat Methodol 6(1): 70-81.
- KUMARASWAMY P. 1976. Sinepower probability density function. J Hydrol 31(1-2): 181-184.
- LEMONTE AJ. 2011. Improved point estimation for the Kumaraswamy distribution. J Stat Comput Simul 81(12): 1971-1982.
- MITNIK PA AND BAEK S. 2013. The Kumaraswamy distribution: median-dispersion re-parameterizations for regression modeling and simulation-based estimation. Statist Papers 54(1): 177-192.
- MITTELHAMMER RC, JUDGE GG AND MILLER DJ. 2000. Econometric foundations. New York: Cambridge University Press.
- NOCEDAL J AND WRIGHT SJ. 2006. Numerical optimization. New York: Springer. 2nd ed.
- OSPINA R AND FERRARI SLP. 2010. Inflated beta distributions. Statist Papers 51(1): 111-126.
- PESTMAN WR. 1998. Mathematical statistics. New York: Walter de Gruyter.
- PRESS WH, TEUKOLSKY SA, VETTERLING WT AND FLANNERY BP. 1992. Numerical recipes in C: the art of scientific computing. New York: Cambridge University Press. 2nd ed.
- SERFLING RJ. 1980. Approximation theorems of mathematical statistics. New York: John Wiley & Sons.
- SILVA RB AND BARRETO-SOUZA W. 2014. Beta and Kumaraswamy distributions as non-nested hypotheses in the modeling of continuous bounded data. Technical Report arXiv:1406.1941 [stat.ME]. URL https://arxiv.org/abs/1406.1941
- WANG BX, WANG XK AND YU K. 2017. Inference on the Kumaraswamy distribution. Comm Statist Theory Methods 46(5): 2079-2090.
- WELSH AH. 1996. Aspects of statistical inference. New York: John Wiley & Sons.
Publication Dates
Publication in this collection: 23 May 2019
Date of issue: 2019

History
Received: 12 Sept 2018
Accepted: 14 Feb 2019