ABSTRACT
The Half-Normal distribution has been intensively extended in recent years. A review of the literature showed that at least 10 extensions of the Half-Normal distribution were introduced between 2008 and 2016. These extensions generalize the behavior of the density and hazard functions, which in the Half-Normal case are restricted to monotonically decreasing and monotonically increasing shapes, respectively. In this paper we propose a new extension, called the transmuted Half-Normal distribution, using the quadratic rank transmutation map introduced by Shaw & Buckley (2009). A comprehensive account of the mathematical properties of the new distribution is presented. We provide explicit expressions for the moments, moment-generating function, Shannon’s entropy, mean deviations, Bonferroni and Lorenz curves, order statistics, and reliability. The parameters are estimated by the maximum likelihood method. The bias and accuracy of the estimators are assessed by Monte Carlo simulations. The proposed distribution allows us to incorporate covariates directly in the mean and consequently to quantify their influence on the average of the response variable. Experiments with two real data sets show its usefulness and its value as a good alternative to several extensions of the Half-Normal distribution in data modeling with and without covariates.
Keywords:
Half-Normal distribution; moments; parametric estimation; precipitation data; transmutation
1 INTRODUCTION
Underlying any parametric inference procedure, a probability distribution is used to describe the behavior of a random variable in the population. Over the last years, a very large number of probability distributions, mostly supported on the positive real numbers, have been proposed in the literature. Many strategies can be used to generate or extend a probability distribution. Most of these strategies add one or more parameters to some baseline distribution (Normal, Gumbel, Exponential, Weibull, Gamma, Log-Normal, among many others). In general, introducing one or more parameters brings greater flexibility in the behavior of the density and hazard functions of the distributions. Recent surveys of the main methods used to extend a baseline distribution are available, for example, in Nadarajah & Rocha (2016), Tahir & Nadarajah (2015), de Brito et al. (2015), Aljarrah et al. (2014), Lee et al. (2013), Lai (2011) and Gupta & Kundu (2009).
It is important to emphasize that the transformation of a random variable Z into another X, of the form $X = g(Z)$, is the simplest way to generate or extend a baseline probability distribution. An example of a distribution obtained by transformation is the one-parameter Half-Normal (HN) distribution, resulting from the transformation $X = \theta|Z|$, where Z has a standard Normal distribution and $\theta > 0$ is a scale parameter.
It should be mentioned here that the Half-Normal distribution is a special case of the Nakagami-m distribution introduced by Nakagami (1960). In recent years several extensions of the Half-Normal distribution were proposed. These extensions include: the general Half-Normal distribution (Pewsey, 2002, 2004), the generalized Half-Normal distribution (Cooray & Ananda, 2008), the Beta (log-Beta) generalized Half-Normal distribution (Pescim et al., 2010; Cordeiro et al., 2013; Pescim et al., 2013), the Kumaraswamy generalized Half-Normal distribution (Cordeiro et al., 2012), the Beta generalized Half-Normal geometric distribution (Ramires et al., 2013), the extension of the generalized Half-Normal distribution (Olmos et al., 2014), the Power Half-Normal distribution (Gómez & Bolfarine, 2015), the extended generalized Half-Normal distribution (Sanchez et al., 2016), the Odd Log-Logistic generalized Half-Normal distribution (Cordeiro et al., 2017) and the transmuted generalized Gamma distribution (Saboor et al., 2019).
This article introduces the transmuted Half-Normal distribution (THN), derived from the HN distribution. The extension of a baseline distribution via transmutation was proposed by Shaw & Buckley (2009). Tahir & Cordeiro (2016) have enumerated 51 distributions extended by transmutation. In Section 2 we present the HN distribution used as the baseline distribution to be extended. Its transmuted version is presented in Section 3. Various statistical and reliability properties of the THN are explored and discussed in Section 4. A few characterizations are considered in Section 5. Estimation by the maximum likelihood method is presented in Section 6. In Section 7, Monte Carlo simulations are conducted to investigate the bias and accuracy of the maximum likelihood estimators. Two applications of the proposed distribution are presented in Section 8. Section 9, with some concluding remarks, closes the paper.
2 THE HALF-NORMAL DISTRIBUTION
If a nonnegative random variable X has the HN distribution with scale parameter $\theta > 0$, then the probability density function (p.d.f.) and the cumulative distribution function (c.d.f.) are written, respectively, as:
$$g(x \mid \theta) = \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right) \quad \text{and} \quad G(x \mid \theta) = 2\,\Phi\!\left(\frac{x}{\theta}\right) - 1, \qquad x > 0, \tag{1}$$
where ϕ(·) and Φ(·) denote, respectively, the p.d.f. and c.d.f. of the standard Normal distribution. The corresponding hazard rate function (h.r.f.) is written as:
$$h_{HN}(x \mid \theta) = \frac{\phi(x/\theta)}{\theta\left[1 - \Phi(x/\theta)\right]},$$
which is monotonically increasing for every θ. It is worth remembering that the p.d.f. is monotonically decreasing for all θ.
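The shapes described above can be checked numerically. The following is a minimal sketch, assuming the usual scale parameterization of the HN distribution, with density $(2/\theta)\phi(x/\theta)$ and distribution function $2\Phi(x/\theta) - 1$; the function names are illustrative and not part of the original paper.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution (mean 0, sd 1)


def hn_pdf(x, theta=1.0):
    """HN density (2/theta) * phi(x/theta) for x >= 0, zero elsewhere."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if x < 0:
        return 0.0
    return 2.0 / theta * _STD.pdf(x / theta)


def hn_cdf(x, theta=1.0):
    """HN distribution function 2 * Phi(x/theta) - 1 for x >= 0."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if x <= 0:
        return 0.0
    return 2.0 * _STD.cdf(x / theta) - 1.0


def hn_hazard(x, theta=1.0):
    """HN hazard rate g(x) / (1 - G(x)).

    The survival term is computed as 2 * Phi(-x/theta), which is the same
    quantity as 1 - G(x) but keeps precision in the right tail.
    """
    if x < 0:
        return 0.0
    return hn_pdf(x, theta) / (2.0 * _STD.cdf(-x / theta))
```

Evaluating `hn_pdf` and `hn_hazard` on an increasing grid of x values shows the density falling and the hazard rising, in line with the monotonicity statements above.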
Formally, if a random variable Z is normally distributed with mean zero and variance one, then the HN distribution is the distribution of $X = \theta|Z|$. For Z normally distributed with mean µ and variance one, the transformation $X = |Z|$ leads to the folded-Normal distribution (Leone et al., 1961; Tsagris et al., 2014). The folded-Normal distribution arises from the Normal distribution whenever the sign of the measured variable is unknown, lost or not relevant. In other words, it is used when one is interested in the size of the measured variable and not in its direction or sign (Chou & Liu, 1998). It is clear that the HN distribution is a particular case of the folded-Normal distribution when $\mu = 0$, which also coincides with the Normal distribution truncated at zero (Nadarajah & Kotz, 2006b). In addition, the expressions in (1) can be obtained as particular cases of the generalized Rayleigh distribution (Vodã, 1976a,b), of the generalized Gamma distribution (Stacy, 1962) and of the distribution of the square root of a chi-square random variable with one degree of freedom (Johnson et al., 1994).
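For $k = 1, 2, \ldots$, the k-th moment of X is $E(X^k) = \theta^k\, 2^{k/2}\, \Gamma\!\left(\frac{k+1}{2}\right)/\sqrt{\pi}$, so that for $k = 1$ we have $E(X) = \theta\sqrt{2/\pi}$ and $E(X^2) = \theta^2$ for $k = 2$. The third and fourth standardized moments (asymmetry and kurtosis, respectively) do not depend on θ and are written as:
$$\gamma_1 = \frac{\sqrt{2}\,(4 - \pi)}{(\pi - 2)^{3/2}} \approx 0.9953 \quad \text{and} \quad \gamma_2 = \frac{3\pi^2 - 4\pi - 12}{(\pi - 2)^2} \approx 3.8692.$$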
The quantile function can be written in the form:
$$Q(p \mid \theta) = \theta\,\Phi^{-1}\!\left(\frac{1 + p}{2}\right), \tag{2}$$
where $0 < p < 1$ and $\Phi^{-1}(\cdot)$ is the quantile function of the standard Normal distribution. Note that from (2) we can generate pseudo-random values of X. An alternative strategy, based on the Box-Muller transformation, can be found in Singh (1994); see also Singh (1995).
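Generating pseudo-random HN values from the quantile function is the inverse transformation method. The sketch below assumes the quantile has the form $\theta\,\Phi^{-1}((1+p)/2)$, which follows from inverting the HN distribution function $2\Phi(x/\theta) - 1$; the function names and the `seed` argument are illustrative choices.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def hn_quantile(p, theta=1.0):
    """HN quantile: theta * Phi^{-1}((1 + p) / 2) for 0 <= p < 1."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return theta * _STD.inv_cdf((1.0 + p) / 2.0)


def hn_sample(n, theta=1.0, seed=None):
    """Draw n HN(theta) values by applying the quantile to Uniform(0, 1) draws."""
    rng = random.Random(seed)
    # rng.random() returns values in [0, 1), which hn_quantile accepts.
    return [hn_quantile(rng.random(), theta) for _ in range(n)]
```

For example, `hn_quantile(0.5, theta)` returns the HN median, and the mean of a large sample from `hn_sample` should be close to $\theta\sqrt{2/\pi}$.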
3 THE TRANSMUTED HALF-NORMAL DISTRIBUTION
Motivated by the need for more versatile density and hazard functions, Shaw & Buckley (2009) proposed a strategy that has proved useful in extending a baseline distribution. The distribution obtained by this process is called the transmuted-baseline distribution, for example, the transmuted Weibull distribution. In this section, we use the transmutation procedure to introduce the transmuted HN distribution (THN).
Following the strategy proposed by Shaw & Buckley (2009), a random variable X is said to have a transmuted distribution if its c.d.f. is written in the form:
$$F(x \mid \theta, \lambda) = (1 + \lambda)\,G(x \mid \theta) - \lambda\left[G(x \mid \theta)\right]^2, \qquad |\lambda| \le 1.$$
Consequently:
$$f(x \mid \theta, \lambda) = g(x \mid \theta)\left[1 + \lambda - 2\lambda\,G(x \mid \theta)\right],$$
where $G(x \mid \theta)$ and $g(x \mid \theta)$ are respectively the c.d.f. and p.d.f. of the baseline distribution, indexed by a parameter vector θ. For $\lambda = 0$, we have the distribution of the baseline random variable as a particular case.
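Therefore, a non-negative random variable X has the THN distribution, with scale parameter $\theta > 0$ and transmutation parameter $|\lambda| \le 1$, if its p.d.f. and c.d.f. are given, respectively, by
$$f(x \mid \theta, \lambda) = \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right)\left[1 + 3\lambda - 4\lambda\,\Phi\!\left(\frac{x}{\theta}\right)\right] \tag{3}$$
and
$$F(x \mid \theta, \lambda) = (1 + \lambda)\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] - \lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]^2. \tag{4}$$
Figure 1 illustrates how the parameter λ influences the behavior of (3). Since θ is the scale parameter, it is set to $\theta = 1$.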
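To explore how λ reshapes the density, it can help to evaluate the THN functions directly. This is a minimal sketch, assuming the THN density and distribution function obtained by applying the quadratic rank transmutation map to the HN baseline $G(x) = 2\Phi(x/\theta) - 1$; the names `thn_pdf` and `thn_cdf` are illustrative.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def _check(theta, lam):
    """Validate the assumed parameter space theta > 0 and |lam| <= 1."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if abs(lam) > 1:
        raise ValueError("lam must lie in [-1, 1]")


def thn_cdf(x, theta=1.0, lam=0.0):
    """THN distribution function (1 + lam) * G - lam * G**2, G the HN c.d.f."""
    _check(theta, lam)
    if x <= 0:
        return 0.0
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return (1.0 + lam) * g - lam * g * g


def thn_pdf(x, theta=1.0, lam=0.0):
    """THN density (2/theta) * phi(z) * [1 + 3*lam - 4*lam*Phi(z)], z = x/theta."""
    _check(theta, lam)
    if x < 0:
        return 0.0
    z = x / theta
    return 2.0 / theta * _STD.pdf(z) * (1.0 + 3.0 * lam - 4.0 * lam * _STD.cdf(z))
```

Setting `lam=0.0` recovers the HN functions, and tabulating `thn_pdf` over a grid for several values of `lam` with `theta=1.0` gives curves of the kind a density plot would show.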
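It can be checked that the c.d.f. of THN(θ, λ) is a convex combination (finite mixture) of the c.d.f.s of the minimum and the maximum of two i.i.d. HN(θ) random variables by writing (4) as
$$F(x \mid \theta, \lambda) = \frac{1 + \lambda}{2}\left\{1 - \left[1 - G(x \mid \theta)\right]^2\right\} + \frac{1 - \lambda}{2}\left[G(x \mid \theta)\right]^2,$$
where $G(x \mid \theta) = 2\Phi(x/\theta) - 1$ is the c.d.f. of HN(θ). Consequently, the following result follows:
$$\left[G(x \mid \theta)\right]^2 \le F(x \mid \theta, \lambda) \le 1 - \left[1 - G(x \mid \theta)\right]^2 \qquad \text{for all } x > 0.$$
Therefore a THN(θ, λ) distribution is stochastically larger (smaller) than the distribution of the minimum (maximum) of two i.i.d. HN(θ) random variables.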
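The mixture view also suggests a simple way to simulate THN values without inverting the distribution function. The sketch below assumes mixing weights $(1+\lambda)/2$ on the minimum and $(1-\lambda)/2$ on the maximum of two i.i.d. HN(θ) draws, which is what expanding $(1+\lambda)G - \lambda G^2$ gives; the sampler name is hypothetical and this is not the simulation method used in the paper.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_cdf(x, theta=1.0, lam=0.0):
    """Assumed THN c.d.f.: (1 + lam) * G - lam * G**2 with G the HN c.d.f."""
    if x <= 0:
        return 0.0
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return (1.0 + lam) * g - lam * g * g


def thn_sample_mixture(n, theta=1.0, lam=0.0, seed=None):
    """Sample THN values as a two-component mixture of min and max of HN pairs."""
    if theta <= 0 or abs(lam) > 1:
        raise ValueError("need theta > 0 and |lam| <= 1")
    rng = random.Random(seed)
    w_min = (1.0 + lam) / 2.0  # probability of returning the minimum
    out = []
    for _ in range(n):
        a = theta * abs(rng.gauss(0.0, 1.0))  # HN(theta) draw as theta * |Z|
        b = theta * abs(rng.gauss(0.0, 1.0))
        out.append(min(a, b) if rng.random() < w_min else max(a, b))
    return out
```

Comparing the empirical proportion of simulated values below some point with `thn_cdf` at that point is a quick sanity check of the representation.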
4 STATISTICAL AND RELIABILITY PROPERTIES
In this section, we provide some important statistical and reliability properties of the THN distribution.
4.1 Survival, hazard rate and residual life functions
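The survival function describes the probability of an item or individual surviving beyond time x. For the THN distribution we have:
$$S(x \mid \theta, \lambda) = 1 - F(x \mid \theta, \lambda) = 2\left[1 - \Phi\!\left(\frac{x}{\theta}\right)\right]\left[1 + \lambda - 2\lambda\,\Phi\!\left(\frac{x}{\theta}\right)\right]. \tag{5}$$
The hazard rate function (h.r.f.) specifies the instantaneous rate of death or failure at time x, given that the individual has survived up to x. Mathematically, $h(x) = f(x)/S(x)$; consequently, for the THN distribution,
$$h(x \mid \theta, \lambda) = \frac{\phi(x/\theta)\left[1 + 3\lambda - 4\lambda\,\Phi(x/\theta)\right]}{\theta\left[1 - \Phi(x/\theta)\right]\left[1 + \lambda - 2\lambda\,\Phi(x/\theta)\right]}. \tag{6}$$
Figure 2 illustrates how the parameter λ influences the behavior of (6). Since θ is the scale parameter, it is set to $\theta = 1$.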
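The h.r.f. of THN(θ, λ) can be written in terms of that of HN(θ) as
$$h(x \mid \theta, \lambda) = h_{HN}(x \mid \theta)\,\frac{1 + \lambda - 2\lambda\,G(x \mid \theta)}{1 - \lambda\,G(x \mid \theta)},$$
where $G(x \mid \theta)$ is the c.d.f. and $h_{HN}(x \mid \theta)$ is the hazard function of the HN(θ) distribution.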
Using the above relation we can see that for all
and for all
That is THN for all and THN for all .
In fact it can be shown that HN for all and HN for all .
Moreover if and then if .
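The mean residual life for individuals at age t is the average remaining lifetime and corresponds to the ratio of the area under the survival curve to the right of t and $S(t)$. From (5) we have:
$$m(t) = E(X - t \mid X > t) = \frac{1}{S(t \mid \theta, \lambda)}\int_t^{\infty} S(x \mid \theta, \lambda)\,dx. \tag{7}$$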
Theorem 4.1. The mean residual life of the THN distribution is given by:
where.
Proof. In fact, by replacing the equation (5) in (7), we obtain
Solving the integral, we obtain:
From algebraic manipulations we arrive at the following expression:
Replacing , we get the result. □
4.2 Asymptotic behavior of the tails
The behavior of THN(θ, λ) in the tails for and for can be stated respectively as follows:
and
4.3 Quantile function
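The p-th quantile of THN(θ, λ), obtained by solving $F(x \mid \theta, \lambda) = p$ in (4), is given by
$$Q(p \mid \theta, \lambda) = \theta\,\Phi^{-1}\!\left(\frac{1 + u_p}{2}\right), \qquad u_p = \frac{(1 + \lambda) - \sqrt{(1 + \lambda)^2 - 4\lambda p}}{2\lambda}, \quad \lambda \ne 0,$$
with $u_p = p$ when $\lambda = 0$, for $0 < p < 1$.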
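The quantile can be computed in two steps: solve the quadratic $(1+\lambda)u - \lambda u^2 = p$ for $u = G(x)$, then invert the HN distribution function. The sketch below assumes that construction and uses the algebraically equivalent form $u = 2p / \{(1+\lambda) + \sqrt{(1+\lambda)^2 - 4\lambda p}\}$, chosen here because it avoids division by λ and so also covers λ = 0; the function names are illustrative.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_cdf(x, theta=1.0, lam=0.0):
    """Assumed THN c.d.f.: (1 + lam) * G - lam * G**2 with G the HN c.d.f."""
    if x <= 0:
        return 0.0
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return (1.0 + lam) * g - lam * g * g


def thn_quantile(p, theta=1.0, lam=0.0):
    """THN quantile for 0 <= p < 1 via the root of the transmutation quadratic."""
    if theta <= 0 or abs(lam) > 1 or not 0.0 <= p < 1.0:
        raise ValueError("need theta > 0, |lam| <= 1 and 0 <= p < 1")
    if p == 0.0:
        return 0.0
    disc = max((1.0 + lam) ** 2 - 4.0 * lam * p, 0.0)
    u = 2.0 * p / ((1.0 + lam) + math.sqrt(disc))  # value of the HN c.d.f.
    # Keep the argument of inv_cdf strictly below 1 despite rounding.
    q = min((1.0 + u) / 2.0, 1.0 - 2.0 ** -53)
    return theta * _STD.inv_cdf(q)


def thn_sample(n, theta=1.0, lam=0.0, seed=None):
    """Inverse transformation sampling: quantile applied to Uniform(0, 1) draws."""
    rng = random.Random(seed)
    return [thn_quantile(rng.random(), theta, lam) for _ in range(n)]
```

A round trip such as `thn_cdf(thn_quantile(0.25, 2.0, 0.5), 2.0, 0.5)` should return a value very close to 0.25.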
4.4 Moments and associated measures
Moments are measures capable of characterizing a probability distribution, with the first four moments indicative of central tendency, dispersion, asymmetry and kurtosis in that order. Unlike many extensions of the HN distribution, the k-th moment of the THN distribution can be obtained analytically.
Theorem 4.2. If X has THN distribution then the k-th moment can be written as follows:
where Hgeo is the Hypergeometric distribution (Feller, 1968).
Proof. From the moment of the order k of a continuous random variable definition:
which can be written as:
From (8) we have:
The expressions of the coefficient of variation, skewness and kurtosis are obtained from the following relations:
and
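As a numerical cross-check on any closed-form moment expression, the raw moments can be approximated by quadrature and turned into the coefficient of variation, skewness and kurtosis. This is a sketch under stated assumptions: the THN density $(2/\theta)\phi(x/\theta)[1 + 3\lambda - 4\lambda\Phi(x/\theta)]$, composite Simpson integration on $[0, 12\theta]$ (the tail beyond that is negligible), and standard central-moment formulas; it does not reproduce the hypergeometric-based expression of Theorem 4.2.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_pdf(x, theta=1.0, lam=0.0):
    """Assumed THN density on x >= 0."""
    z = x / theta
    return 2.0 / theta * _STD.pdf(z) * (1.0 + 3.0 * lam - 4.0 * lam * _STD.cdf(z))


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def thn_moment(k, theta=1.0, lam=0.0):
    """Numerical k-th raw moment E(X^k), integrating over [0, 12 * theta]."""
    return simpson(lambda x: x ** k * thn_pdf(x, theta, lam), 0.0, 12.0 * theta)


def thn_shape_measures(theta=1.0, lam=0.0):
    """Return (cv, skewness, kurtosis) computed from the first four raw moments."""
    m1, m2, m3, m4 = (thn_moment(k, theta, lam) for k in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    cv = math.sqrt(var) / m1
    skew = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4) / var ** 2
    return cv, skew, kurt
```

With `lam=0.0` the output should match the known HN values, namely mean $\theta\sqrt{2/\pi}$, second moment $\theta^2$, and skewness and kurtosis that do not depend on θ.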
4.5 Moment generating function
Theorem 4.3. If X has THN distribution then the moment generating function of X, M X (t), is given by:
Proof. Let X be a random variable with THN distribution, then the moment generating function of X is given by:
From the Taylor series expansion of the function e tx we have that (10) can be written in the form:
Solving integrals, we get:
□
4.6 Differential entropy
Here we investigate the differential entropy of a continuous random variable. It is a measure of uncertainty, and a large entropy value indicates greater uncertainty in the data. One of the most popular measures is the Shannon entropy (Shannon, 1951).
The concept of Shannon’s entropy refers to the uncertainty of a probability distribution. An important fact is that the entropy ℋ Sh is not a function of the random variable X , but rather of the probability distribution of that variable.
Definition 4.1. The differential entropy ℋ Sh of a continuous random variable X with a probability density function f (x) is defined as
$$\mathcal{H}_{Sh}(X) = -\int_{S} f(x)\,\log f(x)\,dx,$$
where S is the support set of the random variable.
Theorem 4.4. The Shannon’s entropy for a continuous random variable X with THN distribution is given by:
Proof. In fact, it follows directly from (3) that
Using the distributive properties of the logarithmic function we have:
Since the functions in question are all integrable, we have
Solving the integrals, we have that the Shannon’s entropy for the THN distribution is given by:
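The entropy can also be approximated directly from its definition, which gives an independent check on a closed-form result. The sketch below assumes the THN density $(2/\theta)\phi(x/\theta)[1 + 3\lambda - 4\lambda\Phi(x/\theta)]$, natural logarithms, and composite Simpson integration on $[0, 12\theta]$; points where the density evaluates to zero contribute zero, following the convention $0 \log 0 = 0$.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_pdf(x, theta=1.0, lam=0.0):
    """Assumed THN density on x >= 0."""
    z = x / theta
    return 2.0 / theta * _STD.pdf(z) * (1.0 + 3.0 * lam - 4.0 * lam * _STD.cdf(z))


def _neg_f_log_f(x, theta, lam):
    """Integrand -f(x) * log f(x), defined as 0 where f(x) is not positive."""
    f = thn_pdf(x, theta, lam)
    return -f * math.log(f) if f > 0.0 else 0.0


def thn_entropy(theta=1.0, lam=0.0, n=2000):
    """Numerical Shannon (differential) entropy in nats via Simpson's rule."""
    if n % 2:
        n += 1
    a, b = 0.0, 12.0 * theta
    h = (b - a) / n
    total = _neg_f_log_f(a, theta, lam) + _neg_f_log_f(b, theta, lam)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * _neg_f_log_f(a + i * h, theta, lam)
    return total * h / 3.0
```

For `lam=0.0` the result should agree with the HN entropy $\tfrac{1}{2}\log(\pi\theta^2/2) + \tfrac{1}{2}$, and changing θ should shift the entropy by $\log\theta$, as expected for a scale parameter.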
4.7 Mean deviations
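The amount of dispersion in a population can be measured by the deviations around the mean and the median, defined by:
$$\delta_1(X) = \int_0^{\infty} |x - \mu|\,f(x)\,dx \quad \text{and} \quad \delta_2(X) = \int_0^{\infty} |x - M|\,f(x)\,dx,$$
where µ and M denote the mean and median, respectively.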
To calculate these measures we can use the following relations presented in Nadarajah & Kotz (2006a):
$$\delta_1(X) = 2\mu F(\mu) - 2\mu + 2\int_{\mu}^{\infty} x f(x)\,dx \quad \text{and} \quad \delta_2(X) = 2M F(M) - M - \mu + 2\int_{M}^{\infty} x f(x)\,dx,$$
where F(µ) and F(M) can be calculated according to equation (4). Taking f (x) as the p.d.f. of the THN distribution, we have:
In an analogous way we obtain .
4.8 Bonferroni and Lorenz curves
The Bonferroni and Lorenz curves, proposed by Bonferroni (1930), are commonly used in areas such as reliability, demography, economics, medicine and insurance. They are applications of the mean deviations and are regarded by economists as measures of social inequality, since they relate the accumulated percentages of income and population.
Definition 4.2. Suppose that X is a nonnegative random variable with probability density function f (x) and cumulative distribution function F(x). The Bonferroni and Lorenz curves, denoted by B(p) and L(p), respectively, are defined as:
$$B(p) = \frac{1}{p\,\mu}\int_0^{q} x f(x)\,dx \quad \text{and} \quad L(p) = \frac{1}{\mu}\int_0^{q} x f(x)\,dx,$$
where $\mu = E(X)$ and $q = F^{-1}(p)$.
In particular, for the THN distribution we have:
Theorem 4.5. Let X be a nonnegative continuous random variable that has THN distribution. The Bonferroni and Lorenz curves are given by:
and
Proof. In fact, by applying the THN distribution we have
□
In an analogous way we obtain the Lorenz curve.
4.9 Order statistics
Order statistics, as well as sample moments, play an important role in statistical inference (David & Nagaraja, 2003). Moments of order statistics are also important in quality control and reliability testing, where they are used to predict the failure of future items based on the times of some initial failures.
Let $X_{(1)}, \ldots, X_{(n)}$ be the order statistics of a random sample $X_1, \ldots, X_n$ obtained from a population with p.d.f. f(x) and c.d.f. F(x); then the p.d.f. of the j-th order statistic is given by:
$$f_{j:n}(x) = \frac{n!}{(j-1)!\,(n-j)!}\,f(x)\left[F(x)\right]^{j-1}\left[1 - F(x)\right]^{n-j},$$
for $j = 1, \ldots, n$.
Applying the equations (3) and (4) we obtain the density of the j-th order statistic of the THN.
Considering the substitution we have:
Reordering the terms we obtain the general formula for density of the j-th order statistic THN:
We can use the THN distribution to model maximum or minimum events, so in what follows we obtain the densities of the maximum and the minimum order statistics. For the n-th order statistic, we set j = n in the general formula (12). Therefore, the p.d.f. of the n-th order statistic, which represents the maximum of a THN sample, is given by:
$$f_{n:n}(x) = n\,f(x \mid \theta, \lambda)\left[F(x \mid \theta, \lambda)\right]^{n-1},$$
with $f$ and $F$ as in (3) and (4).
For the first order statistic, we set j = 1 in the general formula. This yields the p.d.f. of the minimum of a THN sample, given by:
$$f_{1:n}(x) = n\,f(x \mid \theta, \lambda)\left[1 - F(x \mid \theta, \lambda)\right]^{n-1}.$$
4.10 Extreme Values
The HN(θ) distribution belongs to the max domain of attraction of the Gumbel extreme value distribution. Hence, there must exist a strictly positive function, say h(t), such that
$$\lim_{t \to \infty} \frac{1 - G(t + x\,h(t) \mid \theta)}{1 - G(t \mid \theta)} = e^{-x}$$
for every $x \in \mathbb{R}$ (Leadbetter et al., 1987).
Then it can be shown that
$$\lim_{t \to \infty} \frac{1 - F(t + x\,h(t) \mid \theta, \lambda)}{1 - F(t \mid \theta, \lambda)} = e^{-x}$$
for every $x \in \mathbb{R}$. Therefore THN(θ, λ) belongs to the max domain of attraction of the Gumbel extreme value distribution with
$$\lim_{n \to \infty} P\left(a_n\left(X_{n:n} - b_n\right) \le x\right) = \exp\left(-e^{-x}\right)$$
for some suitable norming constants $a_n > 0$ and $b_n$, where $X_{n:n}$ is the maximum order statistic.
4.11 Stress strength reliability
Stress strength reliability estimation is of great interest in engineering, being used in stress-force models or as a measure of performance in electrical and electronic systems. However, it can also be applied in other areas, as it allows a general measure of the differences between two populations (Asgharzadeh et al., 2011).
Theorem 4.6.Letand, are independent then
Proof. Using the p.d.f. and c.d.f. defined in (3) and (4) we have:
Replacing by , we have to
Applying the distributive law, using properties of integration and isolating terms independent of x, we get:
On solving the integrals we arrive at the desired result. □
Note that for the case where , it immediately follows that
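A stress-strength probability for two independent THN variables can be approximated by one-dimensional quadrature, which is useful for checking a closed-form expression. The sketch below assumes the convention $R = P(X < Y) = \int_0^\infty F_X(y)\,f_Y(y)\,dy$ (the direction of the inequality is an assumption here), the THN density and distribution function derived from the transmutation map, and Simpson integration on $[0, 12\theta_2]$.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_cdf(x, theta, lam):
    """Assumed THN c.d.f.: (1 + lam) * G - lam * G**2 with G the HN c.d.f."""
    if x <= 0:
        return 0.0
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return (1.0 + lam) * g - lam * g * g


def thn_pdf(x, theta, lam):
    """Assumed THN density on x >= 0."""
    z = x / theta
    return 2.0 / theta * _STD.pdf(z) * (1.0 + 3.0 * lam - 4.0 * lam * _STD.cdf(z))


def stress_strength(theta1, lam1, theta2, lam2, n=2000):
    """Approximate P(X < Y) for X ~ THN(theta1, lam1), Y ~ THN(theta2, lam2)."""
    if n % 2:
        n += 1
    a, b = 0.0, 12.0 * theta2  # the density of Y is negligible beyond b

    def integrand(y):
        return thn_cdf(y, theta1, lam1) * thn_pdf(y, theta2, lam2)

    h = (b - a) / n
    total = integrand(a) + integrand(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(a + i * h)
    return total * h / 3.0
```

When both variables share the same parameters the function returns a value very close to 1/2, as it must for any pair of i.i.d. continuous variables, and swapping the roles of X and Y gives the complementary probability.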
5 CHARACTERIZATION
Here we discuss characterizations of THN(θ, λ) distribution based on: (i) simple relationship between two truncated moments and (ii) maximum and minimum order statistics.
5.1 Characterizations by two truncated moments
Here we present a characterization of the THN(θ, λ) distribution based on a simple relationship between two truncated moments, using the following theorem due to Glänzel (1987) and used in Hamedani et al. (2017) and Yousof et al. (2017). This theorem also holds if the interval I is not closed and also when the cdf F does not have a closed form. This characterization is stable in the sense of weak convergence.
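Theorem 5.1. Let (Ω, ℱ, P) be a probability space and $I = [a, b]$ be an interval for some $a < b$ ($a = -\infty$, $b = \infty$ might as well be allowed). Let $X: \Omega \to I$ be a continuous random variable with the cdf F and let w 1 and w 2 be two real valued functions defined on I such that
$$E\left[w_2(X) \mid X \ge x\right] = E\left[w_1(X) \mid X \ge x\right]\,\xi(x), \qquad x \in I,$$
is defined for some real function ξ. Assume that $w_1, w_2 \in C^1(I)$, $\xi \in C^2(I)$ and F is a twice continuously differentiable and strictly monotone function on I. Moreover, assume that the equation $\xi w_1 = w_2$ has no real solution in the interior of I. Then F is uniquely determined by the functions w1, w2 and ξ, particularly
$$F(x) = \int_a^x K \left|\frac{\xi'(u)}{\xi(u)\,w_1(u) - w_2(u)}\right| \exp\left(-s(u)\right) du,$$
where the function s is a solution of the differential equation $s' = \dfrac{\xi'\,w_1}{\xi\,w_1 - w_2}$ and K is the normalization constant, such that $\int_I dF = 1$.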
Proposition 5.1.Letbe a continuous random variable and letandfor. Then the random variable X follows THN(θ, λ) with p.d.f. in (3) if and only if the function ξ defined in Theorem 5.1 has the form
Proof. Let X be a random variable with pdf (3), then
and
hence
Conversely, if , then
and hence
Now, in view of Theorem 5.1, X has density (3). □
Corollary 5.1.Letbe a continuous random variable and let w1(x) be as in Proposition 5.1. Then pdf of X follows THN(θ, λ) with p.d.f. in (3) if and only if there exist functions w2and ξ defined in Theorem 5.1 satisfying the differential equation
Remark: The general solution of the differential equation in Corollary 5.1 is
where C is a constant. The set of functions given in Proposition 5.1 satisfies this differential equation when C = 0. It should be noted that there are other triplets (w 1 , w 2 , ξ) satisfying the conditions of Theorem 5.1.
5.2 Characterizations by order statistics
Here we use two results from Hamedani et al. (2017), stated in the theorem below, to characterize the THN(θ, λ) distribution by the maximum (X n:n ) and minimum (X 1:n ) order statistics of a sample of size n from the THN(θ, λ) distribution.
Theorem 5.2.Letbe a continuous random variable with cdf F and let ξ(x) and w(x) be two differentiable functions in (0, ∞) such that
-
(i),
-
then
-
implies
-
(ii),
-
then
-
implies
Proposition 5.2. If we consider and in Theorem 5.2(i). Then the random variable X follows THN(θ, λ) with c.d.f.in (4).
Proposition 5.3.If we considerandin Theorem 5.2(ii). Then the random variable X follows THN(θ, λ) with c.d.f. in (4).
6 MAXIMUM LIKELIHOOD ESTIMATOR
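Let $\mathbf{x} = (x_1, \ldots, x_n)$ be a random sample from the THN distribution with p.d.f. expressed by (3). Then the log-likelihood function, apart from constant terms, can be written as:
$$\ell(\theta, \lambda \mid \mathbf{x}) = -n\log\theta - \frac{1}{2\theta^2}\sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n}\log\left[1 + 3\lambda - 4\lambda\,\Phi(z_i)\right]. \tag{13}$$
The MLEs $\hat{\theta}$ and $\hat{\lambda}$ of θ and λ, respectively, may be obtained by direct maximization of (13), or by solving the following likelihood equations:
$$\frac{\partial \ell}{\partial \theta} = -\frac{n}{\theta} + \frac{1}{\theta^3}\sum_{i=1}^{n} x_i^2 + \frac{4\lambda}{\theta^2}\sum_{i=1}^{n}\frac{x_i\,\phi(z_i)}{1 + 3\lambda - 4\lambda\,\Phi(z_i)} = 0, \qquad \frac{\partial \ell}{\partial \lambda} = \sum_{i=1}^{n}\frac{3 - 4\,\Phi(z_i)}{1 + 3\lambda - 4\lambda\,\Phi(z_i)} = 0,$$
where $z_i = x_i/\theta$.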
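Direct maximization of the log-likelihood can be illustrated with a derivative-free search. This is a minimal sketch, assuming the THN density $(2/\theta)\phi(x/\theta)[1 + 3\lambda - 4\lambda\Phi(x/\theta)]$ and keeping the constant terms in the log-likelihood; the repeatedly refined grid search is a stand-in chosen for simplicity (it is not the Newton-Raphson routine used in the paper), and the function names and defaults are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution
_LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def thn_loglik(theta, lam, data):
    """Full THN log-likelihood; returns -inf outside the parameter space."""
    if theta <= 0.0 or abs(lam) > 1.0:
        return -math.inf
    total = 0.0
    for x in data:
        z = x / theta
        w = 1.0 + 3.0 * lam - 4.0 * lam * _STD.cdf(z)
        if w <= 0.0:
            return -math.inf
        total += math.log(2.0 / theta) - _LOG_SQRT_2PI - 0.5 * z * z + math.log(w)
    return total


def _grid(lo, hi, points):
    """Return `points` equally spaced values from lo to hi inclusive."""
    step = (hi - lo) / (points - 1)
    return [lo + i * step for i in range(points)]


def thn_mle_grid(data, rounds=4, points=21):
    """Crude MLE for positive data: refine a (theta, lam) grid around the best point."""
    s = math.sqrt(sum(x * x for x in data) / len(data))  # HN MLE of theta
    best = (thn_loglik(s, 0.0, data), s, 0.0)  # start from the HN fit
    t_lo, t_hi, l_lo, l_hi = 0.25 * s, 4.0 * s, -1.0, 1.0
    for _ in range(rounds):
        for t in _grid(t_lo, t_hi, points):
            for l in _grid(l_lo, l_hi, points):
                best = max(best, (thn_loglik(t, l, data), t, l))
        dt = (t_hi - t_lo) / (points - 1)
        dl = (l_hi - l_lo) / (points - 1)
        _, t, l = best
        # Shrink the search window to one grid cell on each side of the best point.
        t_lo, t_hi = max(t - dt, 1e-12), t + dt
        l_lo, l_hi = max(l - dl, -1.0), min(l + dl, 1.0)
    loglik, theta_hat, lam_hat = best
    return theta_hat, lam_hat, loglik
```

A call such as `thn_mle_grid([0.1, 0.4, 0.9, 1.3, 2.0])` returns a tuple `(theta_hat, lam_hat, loglik)`. Because the search starts from the HN fit, the returned log-likelihood is never below that of the HN model; a grid search can still stop at a local optimum, so the result is best treated as a starting value for a proper optimizer.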
7 SIMULATION STUDY
In this section we present the results of a Monte Carlo simulation used to evaluate the bias and mean square error of the estimates obtained by the maximum likelihood method. Samples of size n = 20, 50, ..., 170, 200 with λ = −0.9, −0.7, ..., 0.7, 0.9 and θ = 1.0 were generated. For each combination of n, θ and λ, the inverse transformation method was used to generate N = 10,000 pseudo-random samples from the THN distribution. The results are reported in Figures 3 and 4.
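The structure of such a study (simulate, estimate, average) can be sketched in a few lines. To keep the example short and cheap to run, it is illustrated only for the special case λ = 0, where the THN reduces to the HN and the MLE of θ has the closed form $\sqrt{\sum x_i^2 / n}$; the sampler assumes the THN quantile obtained by inverting the transmuted distribution function, and all names, the seed and the replicate counts are illustrative rather than those of the paper.

```python
import math
import random
from statistics import NormalDist, fmean

_STD = NormalDist()  # standard Normal distribution


def thn_quantile(p, theta, lam):
    """Assumed THN quantile for 0 <= p < 1 (inverse of the transmuted c.d.f.)."""
    if p == 0.0:
        return 0.0
    disc = max((1.0 + lam) ** 2 - 4.0 * lam * p, 0.0)
    u = 2.0 * p / ((1.0 + lam) + math.sqrt(disc))  # value of the HN c.d.f.
    q = min((1.0 + u) / 2.0, 1.0 - 2.0 ** -53)  # keep inv_cdf argument < 1
    return theta * _STD.inv_cdf(q)


def thn_sample(n, theta, lam, rng):
    """Inverse transformation method: quantile applied to uniform draws."""
    return [thn_quantile(rng.random(), theta, lam) for _ in range(n)]


def hn_scale_mle(data):
    """Closed-form MLE of theta when lam = 0 (HN model)."""
    return math.sqrt(fmean([x * x for x in data]))


def bias_and_mse(estimator, true_value, n, reps, theta, lam, seed=0):
    """Monte Carlo bias and mean square error of `estimator` over `reps` samples."""
    rng = random.Random(seed)
    estimates = [estimator(thn_sample(n, theta, lam, rng)) for _ in range(reps)]
    bias = fmean(estimates) - true_value
    mse = fmean([(e - true_value) ** 2 for e in estimates])
    return bias, mse
```

For instance, `bias_and_mse(hn_scale_mle, 1.0, n=100, reps=500, theta=1.0, lam=0.0)` returns a small bias and a mean square error near $\theta^2/(2n)$. Replacing `hn_scale_mle` with a two-parameter THN estimator would extend the same loop to the general case.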
Figure 3. Estimated bias and mean square error of $\hat{\theta}$ (θ = 1, 1 : λ = −0.9, 2 : λ = −0.7, 3 : λ = −0.5, 4 : λ = −0.3, 5 : λ = 0, 6 : λ = 0.3, 7 : λ = 0.5, 8 : λ = 0.7 and 9 : λ = 0.9).
Figure 4. Estimated bias and mean square error of $\hat{\lambda}$ (θ = 1, 1 : λ = −0.9, 2 : λ = −0.7, 3 : λ = −0.5, 4 : λ = −0.3, 5 : λ = 0, 6 : λ = 0.3, 7 : λ = 0.5, 8 : λ = 0.7 and 9 : λ = 0.9).
In analyzing the bias of $\hat{\theta}$, we note that for λ = −0.9, −0.7, −0.5, −0.3, 0, 0.3 and 0.5 the estimator performs very well, with bias converging to zero even for small samples. Although for λ = 0.7 and 0.9 the convergence of $\hat{\theta}$ is slower, the estimated bias becomes very low as the sample size increases, around −0.04 and −0.08, respectively.
We also note that for λ = −0.9, −0.7, −0.5, −0.3, 0, 0.3 and 0.5, $\hat{\lambda}$ is estimated accurately. As in the case of $\hat{\theta}$, the convergence is slower for λ = 0.7 and 0.9, with estimated bias close to −0.11 and −0.02, respectively.
Though the bias is very low for both estimated parameters, it is comparatively higher for $\hat{\lambda}$. Moreover, the larger the magnitude of λ, the larger the bias in the estimates of both parameters, showing that the transmutation parameter λ exerts influence on the estimation of the scale parameter θ.
The mean square errors of $\hat{\theta}$ are extremely low, with the positive values of λ presenting the most precise convergence. Although for positive λ the errors do not converge directly to zero, we can see from the graph that they are less than 0.03. Thus we can conclude that the errors in the estimates are negligible in practice.
Finally, when we observe the mean square error of $\hat{\lambda}$, we realize that θ exerts influence on its estimation. As before, the errors tend to zero in all scenarios, the largest being close to 0.1.
In general, we can conclude that the estimators are asymptotically unbiased, since the bias tends to zero as n increases, while the trend in the mean squared error indicates consistency, because the errors tend to zero as n increases.
8 REAL DATA ANALYSIS
In this section, we illustrate the applicability of the THN distribution using two real data sets that have not been analyzed before in the literature. Our objective is to evaluate its fit in relation to other distributions already presented in the literature.
In both applications the data were compiled from the daily precipitation series obtained from the portal of the National Institute of Meteorology (http://www.inmet.gov.br). In fitting the distributions, the cumulative total for the month was taken as the response. Cumulative totals (monthly, quarterly, and so forth) are widely used in the calculation of the Standardized Precipitation Index (SPI).
8.1 Monthly Precipitation
For the first application, we consider the data for the month of February of each year, from 1974 to 2016, for the city of Chapecó, located in the state of Santa Catarina. It is important to note that, due to some fault, no measurement was recorded for February in some years of the considered period. Table 1 shows the summary measures of the data used. Looking at Figure 5, it is possible to note that the hazard rate is increasing, an indication that the THN distribution may be an appropriate model for these data.
For this data set, we fit the following models:
1. Half-Normal (HN) due to Daniel (1959).
2. General Half-Normal (GHN) due to Pewsey (2002, 2004).
3. Power Half-Normal (PHN) due to Gómez & Bolfarine (2015).
4. Generalized Half-Normal (GHN II) due to Cooray & Ananda (2008).
5. Gamma Half-Normal (GMHN) due to Alzaatreh & Knight (2013).
6. Odd Log-Logistic Generalized Half-Normal (OLLGHN) due to Cordeiro et al. (2017).
7. Transmuted Generalized Half-Normal (TGHN).
8. Beta Generalized Half-Normal (BGHN) due to Pescim et al. (2010).
9. Extended Generalized Half-Normal (EGHN) due to Sanchez et al. (2016).
10. Kumaraswamy Generalized Half-Normal (KGHN) due to Cordeiro et al. (2012).
11. Beta Generalized Half-Normal Geometric (BGHNG) due to Ramires et al. (2013).
Table 2 shows the maximum likelihood estimates (standard errors) of the fitted distributions. All estimates were obtained with the SAS/NLMIXED procedure (SAS, 2010), applying the Newton-Raphson optimization technique. Although, for all distributions, the resulting variance-covariance matrices were positive definite and , we observed atypical standard errors for some parameters in the GMHN, BGHN, EGHN and KGHN distributions. Our guess is that the estimation converged to a local minimum or, most likely, that the parameters are linear functions of each other (or almost collinear) for the data in question. In fact, from the correlation matrices we obtain and for the GMHN distribution, and for the BGHN distribution, and for the EGHN distribution, and for the KGHN distribution. It was not possible to estimate the standard errors of all the parameters of the BGHNG distribution, and its variance-covariance matrix was also not completely filled, but we can observe that and .
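The collinearity check described above can be sketched in a few lines: a variance-covariance matrix of the estimates is rescaled to a correlation matrix, and parameter pairs with correlation close to ±1 are flagged. This is only an illustrative sketch; the function names, the threshold and the numbers in the example matrix are hypothetical and are not taken from Table 2 or from the SAS output.

```python
import math

def cov_to_corr(cov):
    """Rescale a variance-covariance matrix (list of lists) to a correlation matrix."""
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

def near_collinear_pairs(corr, threshold=0.95):
    """Return index pairs (i, j), i < j, whose absolute correlation exceeds threshold."""
    n = len(corr)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(corr[i][j]) > threshold]

# Hypothetical covariance matrix of three parameter estimates (made-up values).
cov = [
    [4.00, 1.98, 0.10],
    [1.98, 1.00, 0.05],
    [0.10, 0.05, 0.25],
]
corr = cov_to_corr(cov)
flagged = near_collinear_pairs(corr, threshold=0.95)
print(flagged)
```

In this made-up example only the first two parameters are flagged, which is the kind of pattern that would produce inflated standard errors for both of them.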
To compare the distributions, we consider the likelihood-based statistics −2×Log-Like, AIC, AICc and BIC and the goodness-of-fit measures KS, AD and CvM. The best model is the one that provides the minimum values of these criteria. Table 3 shows these values, and the index attached to each value indicates the rank obtained by each distribution. The last column gives a total rating (sum of ranks) for each distribution. Since there are large uncertainties in the estimates of the GMHN, BGHN, EGHN, KGHN and BGHNG distributions, they are not considered. The table shows that the THN distribution ranked first, followed by the TGHN and PHN distributions. Both THN and PHN have two parameters. The TGHN distribution has three parameters, and the confidence interval of its additional parameter (λ) contains zero. In view of this, we can infer that the models that obtained the best fit were THN (17) and PHN (24).
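As a rough illustration of how such a ranking can be assembled, the sketch below computes AIC, AICc and BIC from −2×Log-Like using their standard textbook definitions and then sums the per-criterion ranks. The model labels, likelihood values and sample size are hypothetical, the goodness-of-fit measures (KS, AD, CvM) are left out for brevity, and ties are not handled.

```python
import math

def information_criteria(neg2loglik, k, n):
    """Standard definitions for k estimated parameters and n observations."""
    aic = neg2loglik + 2 * k
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)  # small-sample correction
    bic = neg2loglik + k * math.log(n)
    return {"-2logL": neg2loglik, "AIC": aic, "AICc": aicc, "BIC": bic}

def rank_sum(table):
    """table maps model -> {criterion: value}; rank 1 is the smallest value.
    Returns model -> sum of ranks over all criteria (ties are not handled)."""
    models = list(table)
    criteria = list(next(iter(table.values())))
    totals = {m: 0 for m in models}
    for c in criteria:
        ordered = sorted(models, key=lambda m: table[m][c])
        for rank, m in enumerate(ordered, start=1):
            totals[m] += rank
    return totals

# Hypothetical fits: (value of -2 x log-likelihood, number of parameters).
n_obs = 40
fits = {"A": (210.0, 1), "B": (204.0, 2), "C": (203.5, 3)}
table = {m: information_criteria(v, k, n_obs) for m, (v, k) in fits.items()}
totals = rank_sum(table)
best = min(totals, key=totals.get)
print(totals, best)
```

A lower total indicates a model that does well across criteria, which mirrors the role of the last column of Table 3.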
8.2 Ten-Day Precipitation
The data consist of the ten-day accumulated precipitation between April and June at station WMO 83498, located in the state of Bahia, Brazil. We considered the historical series from 1961 to 2017. In this application we adopted a regression structure for the scale parameter, that is,
where x_i denotes the ith observation associated with the ith ten-day period. The ten-day periods were considered for the months of April, May, June and July. We considered the HN, GHN II and THN distributions, since only they have closed-form expressions for the mean.
Table 4 reports the parameter estimates and standard errors. Since the estimated regression coefficient has a negative sign for all distributions, a decrease in the accumulated precipitation over time is indicated. The empirical and estimated means (95% confidence intervals) are presented in Table 5 and Figure 6. Table 6 presents several criteria to discriminate between the HN, GHN II and THN distributions. From these results we can conclude that GHN II and THN provide a better fit than the HN distribution. THN also has the lowest values of those criteria.
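How a negative regression coefficient translates into decreasing mean precipitation can be illustrated with a short sketch. It assumes the usual quadratic rank transmutation form F(x) = (1 + λ)G(x) − λG(x)^2, with G the Half-Normal CDF with scale σ; under that assumption the mean works out to σ√(2/π)(1 + λ − √2 λ), which the code checks by numerical integration. The paper's own parametrization may differ. The log-linear link for the scale, the function names and the coefficient values are hypothetical, since the regression equation itself is not reproduced in this text.

```python
import math

def thn_pdf(x, sigma, lam):
    """Assumed THN density: g(x) * (1 + lam - 2*lam*G(x)), x >= 0, |lam| <= 1,
    where g and G are the Half-Normal density and CDF with scale sigma."""
    g = math.sqrt(2.0 / math.pi) / sigma * math.exp(-0.5 * (x / sigma) ** 2)
    G = math.erf(x / (sigma * math.sqrt(2.0)))
    return g * (1.0 + lam - 2.0 * lam * G)

def thn_mean(sigma, lam):
    """Closed-form mean under the assumed parametrization."""
    return sigma * math.sqrt(2.0 / math.pi) * (1.0 + lam - math.sqrt(2.0) * lam)

def numeric_mean(sigma, lam, upper_mult=12.0, steps=20_000):
    """Trapezoidal approximation of the integral of x * f(x) on [0, upper_mult * sigma]."""
    upper = upper_mult * sigma
    h = upper / steps
    total = 0.5 * upper * thn_pdf(upper, sigma, lam)  # the x = 0 endpoint contributes 0
    for i in range(1, steps):
        x = i * h
        total += x * thn_pdf(x, sigma, lam)
    return total * h

def scale_from_covariate(x, beta0, beta1):
    """Hypothetical log link: sigma_i = exp(beta0 + beta1 * x_i)."""
    return math.exp(beta0 + beta1 * x)

# Made-up coefficients; beta1 < 0 gives means that decrease with the period index.
beta0, beta1, lam = 3.0, -0.1, 0.5
means = [thn_mean(scale_from_covariate(x, beta0, beta1), lam) for x in range(1, 10)]
print(means)
```

Because the factor 1 + λ − √2 λ is positive for every |λ| ≤ 1, the sign of the coefficient alone determines whether the fitted mean rises or falls with the covariate in this sketch.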
9 CONCLUSION
In this paper, we presented the THN distribution, formulated from the quadratic rank transmutation map proposed by Shaw & Buckley (2009). Some characteristics and mathematical properties of the proposed distribution were studied. It is important to note that the moment-generating function, the moment of order k, the mean, variance, skewness and kurtosis have explicit analytic expressions that depend only on the parameters of the THN distribution. Owing to the simplicity of the distribution, it was possible to calculate uncertainty measures such as the Shannon entropy and the mean deviations. The Bonferroni and Lorenz curves and the reliability characteristics were also presented, which allows areas such as engineering to benefit from the proposed distribution. The distribution of the order statistics was derived for the THN distribution, together with the densities of the maximum and the minimum. A Monte Carlo simulation study showed that the parameters are efficiently estimated by the maximum likelihood method, with low bias and good accuracy even for small sample sizes, which indicates the potential of the new distribution for modeling. In the first application, using daily precipitation data, 11 models proposed in the literature (derived from the HN distribution) were compared with the distribution proposed in this work. Five models were withdrawn from the analysis because of their inconsistent estimates. Based on the likelihood-based statistics −2×Log-Like, AIC, AICc and BIC, as well as KS, AD and CvM, the THN distribution showed a better fit than the competing models.
Since the THN distribution has an explicit and simple expression for the mean, it was possible to use it in an application with a regression structure. On the −2×Log-Like, AIC, BIC and SSR criteria, the THN distribution presented a good fit, reinforcing its superiority over the other models considered here.
It is important to mention that during the peer review process the THN distribution was considered by Balaswamy (2018). Although both works propose the same distribution, we emphasize that our paper presents a more comprehensive account of the mathematical properties of the new distribution (survival, hazard rate and residual life functions and their properties; asymptotic behavior of the tails; moments, associated measures and moment-generating function; differential entropy; mean deviations; Bonferroni and Lorenz curves; order statistics; extreme values; stress-strength reliability; characterizations by two truncated moments; characterizations by order statistics). In addition, we studied the bias and accuracy of the parameters estimated by the maximum likelihood method and illustrated the applicability of the THN distribution using two real data sets that had not been analyzed before in the literature.
References
1. ALJARRAH MA, LEE C & FAMOYE F. 2014. On generating T-X family of distributions using quantile functions. Journal of Statistical Distributions and Applications, 1(1): 1-17.
2. ALZAATREH A & KNIGHT K. 2013. On The Gamma-Half Normal Distribution and Its Applications, 12(1): 103-119.
3. ASGHARZADEH A, VALIOLLAHI R & RAQAB MZ. 2011. Stress-strength reliability of Weibull distribution based on progressively censored samples. SORT-Statistics and Operations Research Transactions, 35(2): 103-124.
4. BALASWAMY S. 2018. Transmuted Half Normal Distribution. International Journal of Scientific Research in Mathematical and Statistical Sciences, 5(4): 163-170.
5. BONFERRONI CE. 1930. Elementi di statistica generale. Libreria Seber Firenze.
6. CHOU CY & LIU HR. 1998. Properties of the Half-Normal distribution and its application to quality control. Journal of Industrial Technology, 14(3): 4-7.
7. COORAY K & ANANDA MMA. 2008. A generalization of the Half-Normal distribution with applications to lifetime data. Communications in Statistics - Theory and Methods, 37(9): 1323-1337.
8. CORDEIRO GM, ALIZADEH M, PESCIM RR & ORTEGA EMM. 2017. The odd log-logistic generalized Half-Normal lifetime distribution: Properties and applications. Communications in Statistics - Theory and Methods, 46(9): 4195-4214.
9. CORDEIRO GM, PESCIM RR & ORTEGA EMM. 2012. The Kumaraswamy Generalized Half-Normal Distribution for Skewed Positive Data. Journal of Data Science, 10: 195-224.
10. CORDEIRO GM, PESCIM RR, ORTEGA EMM & DEMÉTRIO CGB. 2013. The Beta generalized Half-Normal distribution: New properties. Journal of Probability and Statistics, 2013: 1-18.
11. DANIEL C. 1959. Use of Half-Normal plots in interpreting factorial two-level experiments. Technometrics, 1(4): 311-341.
12. DAVID HA & NAGARAJA HN. 2003. Order statistics. 3rd ed. Wiley Series in Probability and Statistics. John Wiley & Sons.
13. DE BRITO CCR, RÊGO LC & DE OLIVEIRA WR. 2015. Method for Generating Distributions and Classes of Probability Distributions: The Univariate Case. arXiv:1504.01062, pp. 1-50.
14. FELLER W. 1968. An introduction to probability theory and its applications. Vol. 1. 3rd ed. John Wiley & Sons.
15. GLÄNZEL W. 1987. A Characterization Theorem Based on Truncated Moments and its Application to Some Distribution Families. Dordrecht: Springer Netherlands.
16. GÓMEZ YM & BOLFARINE H. 2015. Likelihood-based inference for the power Half-Normal distribution. Journal of Statistical Theory and Applications, 14(4): 383-398.
17. GUPTA RD & KUNDU D. 2009. Introduction of Shape/Skewness Parameter(s) in a Probability Distribution. Journal of Applied Statistical Science, 7(2): 153-171.
18. HAMEDANI GG, CORDEIRO GM, LIMA MCS & NASCIMENTO ADC. 2017. Some Extended Classes of Distributions: Characterizations and Properties. Pakistan Journal of Statistics and Operation Research, 13(4): 893-908.
19. JOHNSON NL, KOTZ S & BALAKRISHNAN N. 1994. Continuous univariate distributions. Vol. 1. 2nd ed. New York: John Wiley & Sons Inc.
20. LAI DC. 2011. Constructions and applications of lifetime distributions. Applied Stochastic Models in Business and Industry, 29(2): 127-140.
21. LEADBETTER MR, LINDGREN G & ROOTZÉN H. 1987. Extremes and related properties of random sequences and processes. 1st ed. Springer Science & Business Media.
22. LEE C, FAMOYE F & ALZAATREH AY. 2013. Methods for generating families of univariate continuous distributions in the recent decades. Wiley Interdisciplinary Reviews: Computational Statistics, 5(3): 219-238.
23. LEONE FC, NELSON LS & NOTTINGHAM RB. 1961. The Folded Normal Distribution. Technometrics, 3(4): 543-550.
24. NADARAJAH S & KOTZ S. 2006a. The beta exponential distribution. Reliability Engineering & System Safety, 91(6): 689-697.
25. NADARAJAH S & KOTZ S. 2006b. R Programs for Truncated Distributions. Journal of Statistical Software, 16(1): 1-8.
26. NADARAJAH S & ROCHA R. 2016. Newdistns: An R Package for New Families of Distributions. Journal of Statistical Software, 69(1): 1-32.
27. NAKAGAMI N. 1960. The m-distribution a general formulation of intensity distribution of rapid fading. Proc. Symp. Statist. Methods Radio Wave Propag.
28. OLMOS NM, VARELA H, BOLFARINE H & GÓMEZ HW. 2014. An extension of the generalized Half-Normal distribution. Statistical Papers, 55(4): 967-981.
29. PESCIM RR, DEMÉTRIO CGB, CORDEIRO GM, ORTEGA EMM & URBANO MR. 2010. The Beta generalized Half-Normal distribution. Computational Statistics & Data Analysis, 54(4): 945-957.
30. PESCIM RR, ORTEGA EMM, CORDEIRO GM & DEMÉTRIO CGB. 2013. The Log-Beta Generalized Half-Normal Regression Model. Journal of Statistical Theory and Applications, 12(4): 330-347.
31. PEWSEY A. 2002. Large-sample inference for the general Half-Normal distribution. Communications in Statistics - Theory and Methods, 31(7): 1045-1054.
32. PEWSEY A. 2004. Improved likelihood based inference for the general Half-Normal distribution. Communications in Statistics - Theory and Methods, 33(2): 197-204.
33. RAMIRES TG, ORTEGA EMM, CORDEIRO GM & HAMEDANI GG. 2013. The Beta generalized Half-Normal geometric distribution. Studia Scientiarum Mathematicarum Hungarica, 50(4): 523-554.
34. SABOOR A, KHAN MN, CORDEIRO GM, PASCOA MA, RAMOS PL & KAMAL M. 2019. Some new results for the transmuted generalized gamma distribution. Journal of Computational and Applied Mathematics, 352: 165-180.
35. SANCHEZ JJD, FREITAS WWL & CORDEIRO GM. 2016. The extended generalized Half-Normal distribution. Brazilian Journal of Probability and Statistics, 30: 366-384.
36. SAS. 2010. The NLMIXED Procedure, SAS/STAT® User’s Guide, Version 9.22. Cary, NC: SAS Institute Inc.
37. SHANNON CE. 1951. Prediction and entropy of printed English. Bell Labs Technical Journal, 30(1): 50-64.
38. SHAW W & BUCKLEY IRC. 2009. The alchemy of probability distributions: Beyond Gram-Charlier expansions, and a skew-kurtotic-normal distribution from a rank transmutation map. arXiv:0901.0434v1 [q-fin.ST], pp. 1-8.
39. SINGH R. 1994. Simulation of Observations for the Half-Normal Distribution. Sankhyā: The Indian Journal of Statistics, Series B, 56(2): 137-139.
40. SINGH R. 1995. Editorial Note on the Paper “Simulation of observations for the Half-Normal distribution”. Sankhyā: The Indian Journal of Statistics, Series B, 57(3): 461.
41. STACY EW. 1962. A generalization of the Gamma distribution. The Annals of Mathematical Statistics, 33(3): 1187-1192.
42. TAHIR MH & CORDEIRO GM. 2016. Compounding of distributions: A survey and new generalized classes. Journal of Statistical Distributions and Applications, 3(1): 3-13.
43. TAHIR MH & NADARAJAH S. 2015. Parameter induction in continuous univariate distributions: Well-established G families. Journal of Probability and Statistics, 87(2): 539-568.
44. TSAGRIS M, BENEKI C & HASSANI H. 2014. On the Folded Normal Distribution. Mathematics, 2(1): 12-28.
45. VODĂ VG. 1976a. Inferential procedures on a generalized Rayleigh variate I. Aplikace Matematiky, 21(6): 395-412.
46. VODĂ VG. 1976b. Inferential procedures on a generalized Rayleigh variate II. Aplikace Matematiky, 21(6): 413-419.
47. YOUSOF HM, ALIZADEH M, JAHANSHAHI SMA, RAMIRES TG, GHOSH I & HAMEDANI G. 2017. The Transmuted Topp-Leone G Family of Distributions: Theory, Characterizations and Applications. Journal of Data Science, 15(4): 723-740.
Publication Dates
Publication in this collection: 10 Aug 2020
Date of issue: 2020
History
Received: 23 Nov 2018
Accepted: 28 Mar 2020