On the moment-determinacy of power Lindley distribution and some applications to software metrics

Abstract

The Lindley distribution and its numerous generalizations are widely used in statistical and engineering practice. Recently, a power transformation of the Lindley distribution, called the power Lindley distribution, was introduced by M. E. Ghitany et al., who initiated the investigation of its properties and possible applications. In this article, new results on the power Lindley distribution are presented. The focus of this work is on the moment-(in)determinacy of the distribution for various values of the parameters. Afterwards, certain applications are provided to describe data sets of software metrics.

Key words
Power Lindley distribution; moment problem; Stieltjes class; software metrics

1 - Introduction

Nowadays, new families of probability distributions are being proposed by a large number of authors with the aim of providing appropriate tools to study the tendencies in the behavior of data sets emerging in financial mathematics, medical research, computer science, engineering, and other disciplines. See, for example, Ghitany et al. 2013 and Koutras et al. 2014. Using a variety of criteria and approaches, researchers are seeking distributions that best match experimental data.

The Lindley distribution was introduced in 1958 by D. V. Lindley, see Lindley 1958. Yet, it continues to draw attention within mathematics and its applications, giving rise to new extensions and modifications. See, for example, Arslan et al. 2017, Bakouch et al. 2012, Ghitany et al. 2008, and references therein. The Lindley distribution with parameter $\beta>0$ is defined by the probability density function (PDF) of the form:

$$f(x) = \frac{\beta^{2}}{\beta+1}\,(1+x)\,e^{-\beta x}, \quad x>0. \tag{1}$$

Formula (1) shows that the Lindley distribution is a two-component mixture of the exponential and two-stage Erlang distributions with the mixing proportion $p=\beta/(\beta+1)$. Distributions of this form arise in reliability theory, for example, in the study of imperfect fault coverage with the probability $p$ of the replacement failure. A comprehensive study of the Lindley distribution and its role in reliability theory is performed in Ghitany et al. 2008. It can be observed that the Lindley distribution, as well as the gamma distribution, belongs to the family of Kummer distributions. The latter was first introduced in 1993 by Armero and Bayarri for conducting a statistical analysis of $M/M/\infty$ systems. See Armero & Bayarri 1993, 1997. The study of the Kummer distribution was followed up in Ng & Kotz 1995, where new results on the subject were obtained and the assortment of the Kummer-type distributions was expanded. The current paper deals with the properties and applications of the power Lindley distribution, which represents the class of $p$-Kummer distributions introduced in Ostrovska & Turan 2017. The power Lindley distribution was put forth in Ghitany et al. 2013 as follows.

Definition 1.1. The power Lindley distribution with parameters $\alpha,\beta>0$ is defined by its PDF:

$$f(x) = \frac{\alpha\beta^{2}}{\beta+1}\,(1+x^{\alpha})\,x^{\alpha-1}\,e^{-\beta x^{\alpha}}, \quad x>0. \tag{2}$$
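As a minimal sketch (not the authors' code), density (2) can be evaluated and sampled in Python using only the standard library. The names `power_lindley_pdf` and `sample_power_lindley` are illustrative. The sampler assumes the decomposition of the Lindley density (1) as $\frac{\beta}{\beta+1}\,\beta e^{-\beta x}+\frac{1}{\beta+1}\,\beta^{2}xe^{-\beta x}$, draws a Lindley variate $Y$ from this mixture, and returns $Y^{1/\alpha}$.

```python
import math
import random


def power_lindley_pdf(x: float, alpha: float, beta: float) -> float:
    """Density (2) of PL(alpha, beta); returns 0 outside x > 0."""
    if x <= 0:
        return 0.0
    xa = x ** alpha
    return (alpha * beta**2 / (beta + 1)) * (1 + xa) * x ** (alpha - 1) * math.exp(-beta * xa)


def sample_power_lindley(alpha: float, beta: float, rng: random.Random) -> float:
    """Draw one PL(alpha, beta) variate as Y**(1/alpha) with Y ~ Lindley(beta)."""
    # Lindley(beta): Exp(rate beta) with probability beta/(beta+1),
    # otherwise a two-stage Erlang, i.e. Gamma(shape 2, scale 1/beta).
    if rng.random() < beta / (beta + 1):
        y = rng.expovariate(beta)
    else:
        y = rng.gammavariate(2.0, 1.0 / beta)
    return y ** (1.0 / alpha)
```

With $\alpha=1$ the first function reduces to the Lindley density (1), which gives a quick sanity check of the implementation.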

We write $X\sim PL(\alpha,\beta)$ to indicate that a random variable $X$ has a power Lindley distribution with parameters $\alpha$ and $\beta$. Evidently, when $\alpha=1$, one recovers a Lindley distribution with PDF (1). Observe that $X$ has a Lindley distribution with parameter $\beta$ if and only if $X^{1/\alpha}\sim PL(\alpha,\beta)$. That is, the power Lindley distribution occurs naturally as a power transformation of a random variable possessing a Lindley distribution. Along with that, the power Lindley distribution can also be viewed as a particular case of the $p$-Kummer distribution, whose PDF is given in Definition 2 of Ostrovska & Turan 2017 in the form:

$$f_{p}(x) = \frac{x^{a/p-1}\,(1+x^{1/p})^{-c}\,\exp(-b\,x^{1/p})}{p\,\Gamma(a)\,U(a,\,a-c+1,\,b)}, \quad a,b,p>0,\ c\in\mathbb{R},\ x>0.$$

Here, $\Gamma$ is Euler’s gamma-function

$$\Gamma(z) = \int_{0}^{\infty} t^{z-1}e^{-t}\,dt, \quad \mathrm{Re}(z)>0,$$

and $U$ is Kummer’s function of the second kind

$$U(\alpha,\beta,z) = \frac{1}{\Gamma(\alpha)}\int_{0}^{\infty} e^{-zt}\,t^{\alpha-1}(1+t)^{\beta-\alpha-1}\,dt, \quad \mathrm{Re}(z)>0.$$

For further information on these functions, one may refer to Abramowitz & Stegun 1972, formulae 6.1.1, page 255, and 13.2.5, page 505. Obviously, $X\sim PL(\alpha,\beta)$ if and only if it has a $p$-Kummer distribution with $p=1/\alpha$ and the parameters $a=1$, $b=\beta$, and $c=-1$.
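The reduction to the power Lindley case can be checked directly; the short computation below is an added illustration and assumes the parameter values $p=1/\alpha$, $a=1$, $b=\beta$, $c=-1$ in the $p$-Kummer density. From the integral representation of $U$,

$$U(1,3,\beta) = \frac{1}{\Gamma(1)}\int_{0}^{\infty} e^{-\beta t}(1+t)\,dt = \frac{1}{\beta}+\frac{1}{\beta^{2}} = \frac{\beta+1}{\beta^{2}},$$

so that

$$f_{1/\alpha}(x) = \frac{x^{\alpha-1}(1+x^{\alpha})\exp(-\beta x^{\alpha})}{\alpha^{-1}\,\Gamma(1)\,U(1,3,\beta)} = \frac{\alpha\beta^{2}}{\beta+1}\,(1+x^{\alpha})\,x^{\alpha-1}e^{-\beta x^{\alpha}},$$

which is the power Lindley density with parameters $\alpha$ and $\beta$.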

This paper aims to pursue the study of the power Lindley distribution initiated in Ghitany et al. 2013. Specifically, the moment-(in)determinacy for different values of the parameters will be determined. It should be noted that the moment-(in)determinacy of a probability distribution is an important factor not only in probability theory, but also in applied areas, see McGraw et al. 1998, Stoyanov 2016. Moreover, the increasing role of heavy-tailed distributions in financial, engineering and computer science research - as is shown in, for example, Ferreira et al. 2012, Stojkovski 2017, Stoyanov 2016 - puts additional weight on this subject. In this connection, a few Stieltjes classes for power Lindley distributions will be provided in the event of moment-indeterminacy. Finally, some applications will be given to data sets of software metrics.

2 - Main results

It is known (Ghitany et al. 2008) that the characteristic function of the Lindley distribution is expressed by:

$$\phi(t) = \frac{\beta^{2}(\beta+1-it)}{(\beta+1)(\beta-it)^{2}}$$

and hence it is analytic for $t\in(-\beta,\beta)$, implying that the Lindley distribution is moment-determinate. The situation with the power Lindley distribution is less straightforward, since, for $\alpha<1$, the characteristic function of the $PL(\alpha,\beta)$ distribution is not analytic at 0. Theorem 2.5 presents a necessary and sufficient condition for the moment-(in)determinacy of the power Lindley distribution.

To begin with, some analytical properties of the characteristic functions of the power Lindley distribution are stated in the next claim.

Theorem 2.1. The characteristic function $\phi_{\alpha,\beta}(t)$ of a power Lindley distribution is entire of order $\alpha/(\alpha-1)$ when $\alpha>1$, analytic on the interval $(-\beta,\beta)$ when $\alpha=1$, and not analytic at 0 otherwise.

Proof. The conditions for the analyticity of the characteristic function can be expressed in terms of the tail function, which for the power Lindley distribution coincides with its survival function $S(x)$. According to Ghitany et al. 2013, formula (3):

$$S(x) = \left(1+\frac{\beta}{\beta+1}\,x^{\alpha}\right)e^{-\beta x^{\alpha}}, \quad x>0.$$

By formula (2.2.3) on page 25 of Linnik & Ostrovskii 1977, the characteristic function of the distribution is analytic on $(-R,R)$ if and only if its tail function satisfies

$$S(x) = O\left(e^{-rx}\right), \quad x\to\infty, \quad \text{for each } r<R. \tag{3}$$

Clearly, for $\alpha>1$, condition (3) holds for all $R>0$, whence in this case the characteristic function $\phi_{\alpha,\beta}(t)$ is entire, while for $\alpha=1$, estimate (3) is true only when $r<\beta$. As for $\alpha<1$, condition (3) is violated whatever $R>0$ is and, therefore, the characteristic function is not analytic at 0.

Since in the case $\alpha>1$ the characteristic function $\phi_{\alpha,\beta}(t)$ is entire, its order and type can be evaluated. This will be done with the help of the next assertion, contained in Theorem 2.4.4, page 37 of Linnik & Ostrovskii 1977.

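Proposition 2.2. If, for the tail function $S(x)$, the values

$$\kappa = \lim_{x\to\infty}\frac{\ln\ln\left(1/S(x)\right)}{\ln x}$$

and

$$\lambda = \lim_{x\to\infty}\frac{\ln\left(1/S(x)\right)}{x^{\kappa}}$$

are finite, then the order $\rho$ and the type $\sigma$ of the characteristic function satisfy the relations

$$\frac{1}{\rho}+\frac{1}{\kappa}=1 \quad \text{and} \quad (\kappa\lambda)^{\rho-1}\,\sigma\rho=1.$$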

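Calculating

$$\kappa = \lim_{x\to\infty}\frac{\ln\left(\beta x^{\alpha}-\ln\left(1+\frac{\beta}{\beta+1}\,x^{\alpha}\right)\right)}{\ln x} = \alpha$$

and

$$\lambda = \lim_{x\to\infty}\frac{\beta x^{\alpha}-\ln\left(1+\frac{\beta}{\beta+1}\,x^{\alpha}\right)}{x^{\alpha}} = \beta,$$

one derives $\rho=\alpha/(\alpha-1)$ and $\sigma=\frac{\alpha-1}{\alpha}\,(\alpha\beta)^{-1/(\alpha-1)}$, respectively. ◻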
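As a worked instance of Proposition 2.2 (an added illustration, not part of the original argument), take $\alpha=2$. Then $\kappa=2$ and $\lambda=\beta$, so that

$$\frac{1}{\rho}=1-\frac{1}{2}\ \Longrightarrow\ \rho=2, \qquad (\kappa\lambda)^{\rho-1}\,\sigma\rho = 2\beta\cdot 2\sigma = 1\ \Longrightarrow\ \sigma=\frac{1}{4\beta},$$

in agreement with the general expression $\sigma=\frac{\alpha-1}{\alpha}\,(\alpha\beta)^{-1/(\alpha-1)}=\frac{1}{2}\,(2\beta)^{-1}$. These are the order and type of the Gaussian characteristic function $e^{-t^{2}/(4\beta)}$, which is consistent with the tail $S(x)$ decaying like $e^{-\beta x^{2}}$ up to a polynomial factor.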

Corollary 2.3. The outcomes of Theorem 2.1 can be restated in the following way. The moment generating function of the power Lindley distribution with parameters $\alpha$ and $\beta$:

  • exists for all real numbers if $\alpha>1$;

  • exists on the interval $(-\beta,\beta)$ if $\alpha=1$;

  • does not exist if $\alpha<1$.

Corollary 2.4. If $\alpha\geq 1$, then the $PL(\alpha,\beta)$ distribution is moment-determinate.

This follows immediately from the well-known Cramér condition for moment-determinacy. The case $\alpha<1$ needs an additional investigation. Notice that in this case the distribution $PL(\alpha,\beta)$ becomes heavy-tailed. While each light-tailed distribution is uniquely determined by its moments, for heavy-tailed distributions the uniqueness may not hold. Heavy-tailed distributions, many of which are not unique with respect to the moments, are instrumental in stock market modeling and engineering (Stoyanov 2016). For this reason, the non-uniqueness of distributions with respect to moments needs deep investigation. The respective findings on the moment-(in)determinacy of the power Lindley distribution are summarized in the next assertion.

Theorem 2.5. The power Lindley distribution is moment-indeterminate if and only if $\alpha<1/2$.

Proof. In essence, the proof is based on estimates for the rate of growth of the moments. To derive the needed statement, we appeal to the following results, in which $f(x)$, $x>0$, is a PDF of a probability distribution $P$ whose moment sequence is $\{m_{k}\}_{k=1}^{\infty}$.

(A) If $m_{k+1}/m_{k}=O(k^{2})$ as $k\to\infty$, then $P$ is moment-determinate.

(B) If, for some $C>0$ and $\varepsilon>0$,

$$m_{k}\geq C\,k^{(2+\varepsilon)k}, \quad k\in\mathbb{N},$$

and $f$ satisfies Lin’s condition, that is, $L_{f}(x):=-xf'(x)/f(x)$ is monotone increasing for $x$ large enough and $\lim_{x\to\infty}L_{f}(x)=+\infty$, then $P$ is moment-indeterminate.

These results can be found in Lin 2017, see Theorem 2(s1) and Theorem 7, respectively.

In the context of this proof, the letter $C$ - with or without subscripts - denotes a positive constant whose value does not need to be evaluated. If $X\sim PL(\alpha,\beta)$, then the moments of $X$ have been calculated in Ghitany et al. 2013 as follows:

$$m_{k} = \mathbf{E}[X^{k}] = \frac{k\,\Gamma(k/\alpha)\,[\alpha(\beta+1)+k]}{\alpha^{2}\,\beta^{k/\alpha}\,(\beta+1)}, \quad k\in\mathbb{N}. \tag{4}$$

Hence, if $\alpha\geq 1/2$, then

$$\frac{m_{k+1}}{m_{k}} \leq C\,\frac{\Gamma(k/\alpha+2)}{\Gamma(k/\alpha)} = C\,(k/\alpha)(k/\alpha+1) = O(k^{2}), \quad k\to\infty,$$

and according to (A), the distribution is moment-determinate.
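The closed form (4) can be cross-checked numerically. The sketch below is an added illustration using only the Python standard library; the helper names and the integration settings (`upper`, `n`) are assumptions chosen for moderate parameter values, not part of the original paper. It uses $\mathbf{E}[X^{k}]=\mathbf{E}[Y^{k/\alpha}]$ with $Y$ Lindley-distributed and integrates against the Lindley density (1) by the composite Simpson rule.

```python
import math


def pl_moment(k: int, alpha: float, beta: float) -> float:
    """k-th moment of PL(alpha, beta) according to formula (4)."""
    return (k * math.gamma(k / alpha) * (alpha * (beta + 1) + k)
            / (alpha**2 * beta ** (k / alpha) * (beta + 1)))


def pl_moment_numeric(k: int, alpha: float, beta: float,
                      upper: float = 60.0, n: int = 60000) -> float:
    """E[Y^(k/alpha)] for Y ~ Lindley(beta), by composite Simpson's rule on [0, upper]."""
    s = k / alpha

    def g(y: float) -> float:
        # Lindley density (1) times y^s
        return (beta**2 / (beta + 1)) * (1 + y) * y**s * math.exp(-beta * y)

    h = upper / n  # n must be even for Simpson's rule
    total = g(0.0) + g(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3
```

For example, comparing `pl_moment(2, 2.0, 1.5)` with `pl_moment_numeric(2, 2.0, 1.5)` should give agreement to several digits; the truncation point `upper` would need to be enlarged for small $\beta$ or large $k/\alpha$.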

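To examine the case $\alpha<1/2$, we write using (4):

$$m_{k} = C\,\Gamma(k/\alpha+1)\,(k/\alpha+\beta+1)\,\beta^{-k/\alpha}.$$

Applying Stirling’s formula, one has:

$$m_{k} = C_{1}\,k^{3/2}\exp\left\{\frac{k}{\alpha}\ln k - C_{2}k\right\} \quad \text{for some } C_{1},C_{2}>0.$$

Since $\alpha<1/2$, writing $1/\alpha=2+2\varepsilon$ with $\varepsilon>0$, one obtains:

$$m_{k} \geq C_{1}\,k^{3/2}\exp\left\{(2+2\varepsilon)\,k\ln k - C_{2}k\right\} = C_{1}\,k^{3/2}\,k^{(2+\varepsilon)k}\exp\left\{\varepsilon k\ln k - C_{2}k\right\}.$$

As $\varepsilon k\ln k - C_{2}k\to+\infty$ as $k\to\infty$, it follows that

$$m_{k} \geq C_{3}\,k^{(2+\varepsilon)k}, \quad k\in\mathbb{N}. \tag{5}$$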

To show that the distribution is moment-indeterminate, the estimate (5) has to be supplemented by checking whether the density (2) satisfies Lin’s condition. Plain calculations yield:

$$L_{f}(x) = -\frac{\alpha x^{\alpha}}{1+x^{\alpha}} - (\alpha-1) + \alpha\beta x^{\alpha} \sim \alpha\beta x^{\alpha} \to +\infty \quad \text{as } x\to\infty.$$

In addition,

$$L_{f}'(x) = \alpha^{2}\beta\,x^{\alpha-1}\,[1+o(1)] \quad \text{as } x\to\infty,$$

implying that $L_{f}'(x)>0$ for $x$ large enough. Thus, (B) implies that, for $\alpha<1/2$, the distribution $PL(\alpha,\beta)$ is moment-indeterminate. The proof is complete. ◻
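The dichotomy at $\alpha=1/2$ can be illustrated numerically. The following Python sketch is an added illustration, not the authors' computation; it evaluates $\ln m_{k}$ from (4), written as $\Gamma(k/\alpha+1)(k/\alpha+\beta+1)\beta^{-k/\alpha}/(\beta+1)$, through `math.lgamma` to avoid overflow, and returns the ratio $m_{k+1}/m_{k}$ divided by $k^{2}$. The function names are hypothetical. By criterion (A), boundedness of this scaled ratio in $k$ is what yields moment-determinacy.

```python
import math


def log_moment(k: int, alpha: float, beta: float) -> float:
    """ln m_k for PL(alpha, beta), computed from (4) in log scale."""
    s = k / alpha
    return (math.lgamma(s + 1) + math.log(s + beta + 1)
            - s * math.log(beta) - math.log(beta + 1))


def scaled_ratio(k: int, alpha: float, beta: float) -> float:
    """(m_{k+1} / m_k) / k^2, the quantity that criterion (A) requires to stay bounded."""
    return math.exp(log_moment(k + 1, alpha, beta) - log_moment(k, alpha, beta)) / k**2
```

For $\alpha=1/2$ and $\beta=1$ the ratio $m_{k+1}/m_{k}$ equals $(2k+1)(2k+4)$, so the scaled ratio tends to 4, whereas for $\alpha=1/4$ it grows roughly like $256k^{2}$. A numerical experiment of this kind only illustrates the theorem and does not replace the proof.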

When a probability distribution is moment-indeterminate, the problem arises of exhibiting different distributions with the same moments of all orders. In this paper, this will be done by presenting Stieltjes classes for the density (2), which are infinite families of PDFs having the same moments of all orders. Although the Stieltjes classes per se can be traced to the works of P. L. Chebyshev, T. Stieltjes, and C. Heyde, the name itself is quite recent. To pay tribute to the contribution of Stieltjes to the moment problem, J. Stoyanov, on page 282 of his work (Stoyanov 2004), suggested the name ‘Stieltjes classes,’ thus triggering their systematic study, which is still in progress. See, for example, Lin 2017, Ostrovska 2014, Pakes 2007 and references therein.

For the convenience of readers, we supply the necessary definitions below.

Definition 2.1. Let $f(x)$ be a PDF of a random variable $X$ with finite moments of all orders, and let $h(x)$ be an integrable function on $(-\infty,\infty)$ such that $\sup_{x}|h(x)|=1$. If, for all $k\in\mathbb{N}_{0}$,

$$\int_{-\infty}^{\infty} x^{k}h(x)f(x)\,dx = 0,$$

then $h(x)$ is called a perturbation function of the density $f(x)$.

Definition 2.2. Let $f(x)$ be a PDF and $h(x)$ be a perturbation function of $f(x)$. The set

$$S = S(f,h) := \left\{f_{\epsilon}(x) : f_{\epsilon}(x) = f(x)\,[1+\epsilon h(x)],\ x\in\mathbb{R},\ \epsilon\in[-1,1]\right\}$$

is said to be a Stieltjes class for $f(x)$ based on $h(x)$.

Obviously, S is an infinite family of densities all having the same sequence of moments as f(x). Observe that, for a density function f(x), there are different Stieltjes classes based on various perturbation functions h(x). The next statement provides a few perturbation functions for (2).

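Theorem 2.6. The following functions are perturbations for PDF (2) in the case $\alpha<1/2$:

  (i) $H_{1}(x)=M_{1}\,\dfrac{x^{1-\alpha}}{1+x^{\alpha}}\,\exp(-\beta x^{\alpha})\,\sin\left[2\beta x^{\alpha}\tan(\pi\alpha)\right]$;

  (ii) $H_{2}(x)=M_{2}\,\dfrac{x^{1-\alpha}}{1+x^{\alpha}}\,\exp(\beta x^{\alpha}-bx^{\gamma})\,\sin\left[bx^{\gamma}\tan(\pi\gamma)\right]$, where $b>0$, $\gamma\in(\alpha,1/2)$;

  (iii) $H_{3}(x)=M_{3}\,\dfrac{\sin\left[\beta x^{\alpha}\tan(\pi\alpha)-\pi\alpha\right]+x^{\alpha}\sin\left[\beta x^{\alpha}\tan(\pi\alpha)-2\pi\alpha\right]}{1+x^{\alpha}}$,

where $H_{i}(x)=0$ for $x<0$ and the constants $M_{i}$ are chosen in such a way that $\sup_{x}|H_{i}(x)|=1$, $i=1,2,3$.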

Proof. Since all functions $H_{i}$ satisfy $\sup_{x}|H_{i}(x)|=1$, what is left is to show that

$$\int_{0}^{\infty} x^{k}f(x)H_{i}(x)\,dx = 0, \quad k\in\mathbb{N}_{0},\ i=1,2,3. \tag{6}$$

For this purpose, the identities below, given in formulae 3.944, 9 and 10 on page 502 of Gradshteyn & Ryzhik 2015, will be used:

$$\int_{0}^{\infty} x^{p-1}e^{-qx}\sin(qx\tan t)\,dx = \frac{\Gamma(p)}{q^{p}}\cos^{p}t\,\sin(pt), \quad p,q>0,\ |t|<\frac{\pi}{2} \tag{7}$$

and

$$\int_{0}^{\infty} x^{p-1}e^{-qx}\cos(qx\tan t)\,dx = \frac{\Gamma(p)}{q^{p}}\cos^{p}t\,\cos(pt), \quad p,q>0,\ |t|<\frac{\pi}{2}. \tag{8}$$

Denote:

$$J_{i}(k) := \frac{\beta+1}{\alpha\beta^{2}M_{i}}\int_{0}^{\infty} x^{k}f(x)H_{i}(x)\,dx, \quad i=1,2,3.$$

Then, the substitution $x\mapsto x^{\alpha}$ yields

$$J_{1}(k) = \frac{1}{\alpha}\int_{0}^{\infty} x^{(k+1)/\alpha-1}e^{-2\beta x}\sin\left(2\beta x\tan(\pi\alpha)\right)dx.$$

Setting $p=(k+1)/\alpha$, $q=2\beta$, and $t=\pi\alpha$, one derives from (7)

$$J_{1}(k) = \frac{\Gamma(p)}{\alpha q^{p}}\cos^{p}(\pi\alpha)\,\sin\left((k+1)\pi\right) = 0, \quad k\in\mathbb{N}_{0}.$$

Observe that (7) is applicable because $p,q>0$ and $t=\pi\alpha\in(0,\pi/2)$ by the condition on $\alpha$.

Likewise, to justify (ii), using the substitution $x\mapsto x^{\gamma}$, we write:

$$J_{2}(k) = \frac{1}{\gamma}\int_{0}^{\infty} x^{(k+1)/\gamma-1}e^{-bx}\sin\left(bx\tan(\pi\gamma)\right)dx.$$

This is an integral of the form (7), where $p=(k+1)/\gamma$, $q=b$, and $t=\pi\gamma$. Hence $J_{2}(k)=0$, as desired.

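Finally, in the case (iii), the integral $J_{3}(k)$ can be split as

$$\int_{0}^{\infty} x^{k+\alpha-1}e^{-\beta x^{\alpha}}\sin\left[\beta x^{\alpha}\tan(\pi\alpha)-\pi\alpha\right]dx + \int_{0}^{\infty} x^{k+2\alpha-1}e^{-\beta x^{\alpha}}\sin\left[\beta x^{\alpha}\tan(\pi\alpha)-2\pi\alpha\right]dx =: U(k)+V(k).$$

The substitution $x\mapsto x^{\alpha}$ leads to:

$$U(k) = \frac{\cos(\pi\alpha)}{\alpha}\int_{0}^{\infty} x^{k/\alpha}e^{-\beta x}\sin\left(\beta x\tan(\pi\alpha)\right)dx - \frac{\sin(\pi\alpha)}{\alpha}\int_{0}^{\infty} x^{k/\alpha}e^{-\beta x}\cos\left(\beta x\tan(\pi\alpha)\right)dx.$$

Applying formulae (7) and (8) with $p=k/\alpha+1$, $q=\beta$, and $t=\pi\alpha$, one derives that

$$U(k) = \frac{\cos(\pi\alpha)}{\alpha}\,\frac{\Gamma(p)}{q^{p}}\cos^{p}(t)\sin(pt) - \frac{\sin(\pi\alpha)}{\alpha}\,\frac{\Gamma(p)}{q^{p}}\cos^{p}(t)\cos(pt) = \frac{\Gamma(p)}{\alpha q^{p}}\cos^{p}(t)\sin(pt-\pi\alpha) = \frac{\Gamma(p)}{\alpha q^{p}}\cos^{p}(t)\sin(k\pi) = 0, \quad k\in\mathbb{N}_{0}.$$

Similarly, with the help of the same substitution $x\mapsto x^{\alpha}$, one obtains

$$V(k) = \frac{\cos(2\pi\alpha)}{\alpha}\int_{0}^{\infty} x^{k/\alpha+1}e^{-\beta x}\sin\left(\beta x\tan(\pi\alpha)\right)dx - \frac{\sin(2\pi\alpha)}{\alpha}\int_{0}^{\infty} x^{k/\alpha+1}e^{-\beta x}\cos\left(\beta x\tan(\pi\alpha)\right)dx.$$

Taking $p=k/\alpha+2$, $q=\beta$, and $t=\pi\alpha$, we obtain that

$$V(k) = \frac{\Gamma(p)}{\alpha q^{p}}\cos^{p}(t)\sin(pt-2\pi\alpha) = \frac{\Gamma(p)}{\alpha q^{p}}\cos^{p}(t)\sin(k\pi) = 0, \quad k\in\mathbb{N}_{0}.$$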
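The vanishing of $J_{1}(k)$ can also be observed numerically. The Python sketch below is an added illustration and uses only the standard library; the names `simpson`, `j1`, and `reference_scale`, as well as the truncation point and step count, are assumptions suited to $\beta$ of order 1. It integrates the substituted form $\frac{1}{\alpha}\int_{0}^{\infty}y^{p-1}e^{-2\beta y}\sin(2\beta y\tan(\pi\alpha))\,dy$, $p=(k+1)/\alpha$, and compares the result with $\Gamma(p)/(\alpha q^{p})$, $q=2\beta$, which is the size of the same integral without the sine factor.

```python
import math


def simpson(g, a: float, b: float, n: int) -> float:
    """Composite Simpson's rule for g on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def j1(k: int, alpha: float, beta: float, upper: float = 60.0, n: int = 60000) -> float:
    """J_1(k) after the substitution x -> x^alpha, truncated to [0, upper]."""
    p = (k + 1) / alpha
    q = 2 * beta
    tan_t = math.tan(math.pi * alpha)  # requires alpha < 1/2

    def g(y: float) -> float:
        return y ** (p - 1) * math.exp(-q * y) * math.sin(q * y * tan_t)

    return simpson(g, 0.0, upper, n) / alpha


def reference_scale(k: int, alpha: float, beta: float) -> float:
    """Gamma(p) / (alpha * q^p): magnitude of the integral if the sine were dropped."""
    p = (k + 1) / alpha
    return math.gamma(p) / (alpha * (2 * beta) ** p)
```

For $\alpha=0.25$ or $\alpha=0.3$ with $\beta=1$, the quotient `abs(j1(k, alpha, 1.0)) / reference_scale(k, alpha, 1.0)` should be at the level of quadrature error for small $k$, reflecting the factor $\sin((k+1)\pi)=0$ in the closed form.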

Corollary 2.7. Let $f$ be a PDF for the $PL(\alpha,\beta)$ distribution with $\alpha<1/2$. Then, the following sets are Stieltjes classes for $f$:

$$S_{i} = \left\{f_{\epsilon}(x) : f_{\epsilon}(x) = f(x)\,[1+\epsilon H_{i}(x)],\ x\in\mathbb{R},\ \epsilon\in[-1,1]\right\}, \quad i=1,2,3.$$

3 - Application to software metrics

Software metrics are objective measurements of software products used to assess the quality of the products. These days, a variety of software metrics are being proposed related to different parameters such as the size (of software as a whole or of its inherent classes and methods), complexity (of software system, classes, methods), and internal and external quality characteristics of a software system. Correspondingly, an ample amount of data on the values of software metrics has been collected and, as a result, a statistical analysis of such data has come into demand within engineering studies. See, for example, Ferreira et al. 2012, Mishra & Mishra 2011, and Stojkovski 2017, where one can find an extensive list of references. In some problems related to software metrics, such as creating catalogues for threshold values, it is important to find probability distributions that best fit the empirical data. In the literature, the two-parameter Weibull distribution has been indicated as a useful instrument for this purpose, while new distributions are being offered by statisticians aiming to provide better tools for specific practical problems.

In this section, we apply the power Lindley distribution to data arrays provided to the authors as a courtesy by M. Stojkovski (Stojkovski 2017), who collected the data related to 17 unique categories and, in each category, calculated the values of the following 5 metrics:

  • CBO (Coupling Between Objects)

  • DIT (Depth of Inheritance Tree)

  • NOC (Number Of Children)

  • NOM (Number Of Methods)

  • RFC (Response For Class)

In this article, the data related to the DIT and NOC metrics are considered. These metrics were introduced in Chidamber & Kemerer 1994 in order to measure complexity and coupling. The other data sets available in Stojkovski 2017 can be analyzed likewise.

In the next two examples, the MATLAB software was used and the method of least squares was applied to fit the power Lindley density.
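The source states only that MATLAB and the method of least squares were used; the optimizer and the exact form of the fitted data are not given. The Python sketch below is therefore a stand-in under stated assumptions: the observations are taken to be pairs $(x_{j},y_{j})$ of metric values and relative frequencies, the objective is the sum of squared differences between density (2) and $y_{j}$, and the minimizer is a simple coarse-to-fine grid search. All names (`pl_pdf`, `sse`, `fit_pl_least_squares`) and the default search ranges are hypothetical.

```python
import math


def pl_pdf(x: float, alpha: float, beta: float) -> float:
    """Power Lindley density (2) for x > 0."""
    xa = x ** alpha
    return (alpha * beta**2 / (beta + 1)) * (1 + xa) * x ** (alpha - 1) * math.exp(-beta * xa)


def sse(alpha: float, beta: float, xs, ys) -> float:
    """Sum of squared residuals between the density and the observed frequencies."""
    return sum((pl_pdf(x, alpha, beta) - y) ** 2 for x, y in zip(xs, ys))


def fit_pl_least_squares(xs, ys, a_range=(0.2, 3.0), b_range=(0.2, 3.0),
                         grid: int = 15, rounds: int = 8):
    """Coarse-to-fine grid search for (alpha, beta) minimizing the SSE.

    An assumed, illustrative optimizer; it is not the routine used in the paper.
    """
    a_lo, a_hi = a_range
    b_lo, b_hi = b_range
    best = None
    for _ in range(rounds):
        da = (a_hi - a_lo) / (grid - 1)
        db = (b_hi - b_lo) / (grid - 1)
        candidates = [(a_lo + i * da, b_lo + j * db)
                      for i in range(grid) for j in range(grid)]
        best = min(candidates, key=lambda p: sse(p[0], p[1], xs, ys))
        # Shrink the search window to two grid steps around the current best point.
        a_lo, a_hi = max(best[0] - 2 * da, 1e-3), best[0] + 2 * da
        b_lo, b_hi = max(best[1] - 2 * db, 1e-3), best[1] + 2 * db
    return best
```

On synthetic, noise-free values generated from `pl_pdf` with known parameters, the search should return those parameters to within the final grid resolution. On real frequency tables, a derivative-based or simplex optimizer with several starting points would be a more robust choice.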

Example 3.1 (DIT system metric). DIT represents the maximum length of the path, as a number of graph edges, from a node to the root of the inheritance tree. It is known that the greater the DIT value is, the higher the complexity of a design becomes. The data collected in Stojkovski 2017 are summarized in Table I.

Using the method of least squares, these data were approximated by the power Lindley density with $\alpha=1.1913$, $\beta=1.6979$. For comparison, we used the fitted Weibull distribution found in Stojkovski 2017 with the help of the EasyFit software. In addition, the error of approximation in each case was obtained. Table II summarizes the results, and Figure 1 shows the data along with the fitted curves.

Table I. DIT in system category.
Table II. DIT in system category.
Table III. NOC in system category.
Figure 1. Fitted distributions for DIT-system category.

Example 3.2 (NOC system metric). NOC represents the number of immediate subclasses of a class in the hierarchy, measuring the number of subclasses inheriting the methods of the parent class. It is known that when NOC rises, so does re-use. The highlights of the data collected in Stojkovski 2017 appear in Table III.

It can be observed that the behaviour of this data set is essentially different from that of DIT. The data set possesses a strong right-skewed pattern, where the frequency of 0 dominates all of the other frequencies.

As before, the method of least squares was applied, and the outcomes, along with the fitted Weibull distribution found in Stojkovski 2017 employing the EasyFit software, are presented in Table IV and Figures 2 and 3.

Table IV. NOC in system category.
Figure 2. Fitted distributions for NOC-system category.
Figure 3. Data and fitted densities on different intervals.

4 - Conclusion

This work is a continuation of the study of the power Lindley distribution initiated in Ghitany et al. 2013. The goal of the current research is to obtain new results on the distribution and provide some novel applications. Since the power Lindley distribution becomes heavy-tailed when $\alpha<1$ - and, consequently, does not possess a moment-generating function - the examination of its moment-(in)determinacy in this case has to be carried out. This is precisely the main outcome of this paper, stating that the $PL(\alpha,\beta)$ distribution is moment-indeterminate if and only if $\alpha<1/2$. Several Stieltjes classes have been constructed for this case.

Furthermore, this paper has discussed certain applications dealing with real data sets pertinent to the values of software metrics. Software metrics are currently a hot topic in software engineering as they address quality standards followed by software developers. The two-parameter Weibull distribution is commonly used to fit experimental data sets of software metrics. In this research, using the data collected in Stojkovski 2017 for the DIT and NOC metrics, it is shown that, for certain data sets, the power Lindley distribution provides a better description of the data than the Weibull distribution, not only for the light- but also for the heavy-tailed case. It has to be pointed out that both distributions are two-parameter and, therefore, similar in terms of the complexity of the models. As for future work, it is planned to perform a similar data analysis for other software metrics and find new threshold values in collaboration with respective specialists.

ACKNOWLEDGMENTS

The authors express their sincere gratitude to Dr. Deepti Mishra (NTNU) for consulting them on the software metrics and to Mr. Mile Stojkovski for providing the collected data sets along with relevant references. Also, our thanks go to Mr. P. Danesh from the Atilim University Academic Writing and Advisory Center for his help in the presentation of the manuscript. Finally, our deep gratitude is extended to the anonymous referees for their valuable comments.

REFERENCES

  • ABRAMOWITZ M & STEGUN IA. 1972. Handbook of mathematical functions with formulas, graphs, and mathematical tables. New York: Dover Publications. 1046 p.
  • ARMERO C & BAYARRI MJ. 1993. A Bayesian analysis of a queueing system with unlimited service. Department of Statistics, Purdue University. Tech Rep 93-50.
  • ARMERO C & BAYARRI MJ. 1997. A Bayesian analysis of a queueing system with unlimited service. J Stat Plan Inf 58: 241-261.
  • ARSLAN T, ACITAS S & SENOGLU B. 2017. Generalized Lindley and power Lindley distributions for modeling the wind speed data. Energy Convers Manag 152(15): 300-311.
  • BAKOUCH HS, AL-ZAHRANI BM, AL-SHOMRANI AA, MARCHI VAA & LOUZADA F. 2012. An extended Lindley distribution. J Korean Stat Soc 41: 75-85.
  • CHIDAMBER SR & KEMERER CF. 1994. A metrics suite for object oriented design. IEEE Trans Software Eng 20(6): 476-493.
  • FERREIRA KAM, BIGONHA MAS, BIGONHA RS, MENDES LFO & ALMEIDA HC. 2012. Identifying thresholds for object-oriented software metrics. J Syst Softw 85: 244-257.
  • GHITANY ME, AL-MUTAIRI DK, BALAKRISHNAN N & AL-ENEZI LJ. 2013. Power Lindley distribution and associated inference. Comput Stat Data Anal 64: 20-33.
  • GHITANY ME, ATIEH B & NADARAJAH S. 2008. Lindley distribution and its application. Math Comput Simulat 78: 493-506.
  • GRADSHTEYN IS & RYZHIK IM. 2015. Table of integrals, series, and products. 8th ed. Amsterdam: Elsevier/Academic Press, 1133 p.
  • KOUTRAS VM, DRAKOS K & KOUTRAS MV. 2014. A polynomial logistic distribution and its application in finance. Special Issue: Advances in Probability and Statistics. Communications in Statistics: Theory and Methods 43: 2045-2065.
  • LIN GD. 2017. Recent developments on the moment problem. J Stat Distributions Appl 4(5). doi:10.1186/s40488-017-0059-2.
  • LINDLEY DV. 1958. Fiducial distributions and Bayes’ theorem. J Royal Stat Soc Series B 20: 102-107.
  • LINNIK YV & OSTROVSKII IV. 1977. Decomposition of random variables and vectors. Translations of Mathematical Monographs, vol. 48. American Mathematical Society, Providence, R. I., 380 p.
  • MCGRAW R, NEMESURE S & SCHWARTZ SE. 1998. Properties and evolution of aerosols with size distributions having identical moments. J Aerosol Sci 29: 761-772.
  • MISHRA D & MISHRA A. 2011. Object-Oriented Inheritance Metrics in the Context of Cognitive Complexity. Fundam Inform 111(1): 91-117.
  • NG KW & KOTZ S. 1995. Kummer-Gamma and Kummer-Beta univariate and bivariate distributions. Tech Rep 84. The University of Hong Kong, Department of Statistics, p. 1-20.
  • OSTROVSKA S. 2014. Constructing Stieltjes classes for M-indeterminate absolutely continuous probability distributions. ALEA, Lat Am J Probab Math Stat 11(1): 253-258.
  • OSTROVSKA S & TURAN M. 2017. On the powers of the Kummer distribution. Kuwait J Sci 44: 1-8.
  • PAKES AG. 2007. Structure of Stieltjes classes of moment-equivalent probability laws. J Math Anal Appl 326(2): 1268-1290.
  • STOJKOVSKI M. 2017. Thresholds for Software Quality Metrics in Open Source Android Projects. Master’s thesis. Norwegian University of Science and Technology, p. 281-294.
  • STOYANOV J. 2004. Stieltjes classes for moment-indeterminate probability distributions. J Appl Probab 41A: 281-294.
  • STOYANOV J. 2016. Moment Properties of Probability Distributions Used in Stochastic Financial Models. Recent Advances in Financial Engineering 2014: 1-27.

Publication Dates

  • Publication in this collection
    24 Sept 2021
  • Date of issue
    2021

History

  • Received
    11 May 2019
  • Accepted
    08 Sept 2019
Academia Brasileira de Ciências Rua Anfilófio de Carvalho, 29, 3º andar, 20030-060 Rio de Janeiro RJ Brasil, Tel: +55 21 3907-8100 - Rio de Janeiro - RJ - Brazil
E-mail: aabc@abc.org.br