
Practical rules for summing the series of the Tweedie probability density function with high-precision arithmetic

Abstract

For some ranges of its parameters and arguments, the series for the Tweedie probability density function is sometimes exceedingly difficult to sum numerically. Existing numerical implementations that use inversion techniques and properties of stable distributions can cope with these problems, but no single one is successful in all cases. In this work we investigate heuristically the nature of the problem, and show that it is not related to the order of summation of the terms. Using a variable involved in the analytical proof of convergence of the series, the critical parameter for numerical non-convergence (“alpha”) is identified, and a heuristic criterion is developed to avoid numerical non-convergence over a reasonably large sub-interval of that parameter. With these practical rules, simple summation algorithms provide sufficiently robust results for the calculation of the density function and its definite integrals. These implementations need to use high-precision arithmetic, and are programmed in the Python programming language. A thorough comparison with existing R functions allows the identification of cases in which the latter fail, and provides further guidance for their use.

Key words
Python; R Tweedie package; Tweedie probability density; Tweedie series


Introduction

The Tweedie distribution is a member of the family of exponential dispersion models, discussed for example in Jørgensen (1987) and Jørgensen (1992). The Tweedie is a distribution for which Taylor’s (1961) empirical law relating the mean to the variance, viz.,

$$\mathrm{Var}[X]=a\,\mathrm{E}[X]^{p} \qquad (1)$$
holds.

In this work, we are concerned with particular ranges of parameters of the Tweedie distribution, for which serious numerical difficulties arise when computing the probability density function (pdf) with form (cf. Jørgensen (1992), Section 2.7)

$$f_Z(z;\theta,\alpha)=N(z;\alpha)\times\exp\left[z\theta-\kappa(\theta,\alpha)\right],\qquad z\ge 0, \qquad (2)$$
with
$$N(z;\alpha)=\frac{1}{\pi z}\times\sum_{k=1}^{\infty}\frac{\Gamma(1+\alpha k)}{\Gamma(1+k)}\,\bigl(\kappa(1/z,\alpha)\bigr)^{k}\,\sin(-k\pi\alpha), \qquad (3)$$
and
$$\kappa(\theta,\alpha)=\frac{\alpha-1}{\alpha}\left(\frac{\theta}{\alpha-1}\right)^{\alpha}. \qquad (4)$$
A more general form of the Tweedie pdf is $f_X(x)=(1/\sigma)\,f_Z(x/\sigma)$, where σ is a scale parameter, for which
$$\mathrm{E}[X]=\sigma\,\kappa'(\theta),\qquad \mathrm{Var}[X]=\sigma^{2}\,\kappa''(\theta).$$
In this work, we set σ=1 throughout, and use Eq. (2) without loss of generality.

Clearly, Eqs. (2)–(4) define a two-parameter distribution. The ranges of the parameters that we are interested in are 0 < α < 1 and θ ≤ 0. The exponent p in Eq. (1) and α are related by

$$\alpha=\frac{p-2}{p-1}. \qquad (5)$$
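Solving Eq. (5) for p shows, for instance, that the inverse Gaussian case used as a benchmark later on (α = 1/2) corresponds to p = 3:

$$p=\frac{2-\alpha}{1-\alpha},\qquad \alpha=\tfrac{1}{2}\;\Longleftrightarrow\;p=3.$$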

The central problem here is that for certain values of z and α the series given by Eq. (3) is exceedingly difficult, if not impossible, to sum employing “standard” double-precision (64-bit) floating-point arithmetic. A head-on approach to summing the series fails in many cases (Dunn and Smyth 2005). To the best of our knowledge, the most successful approaches currently available to sum Eq. (3) are either to employ Fourier inversion (Dunn and Smyth 2008) or to exploit the link between the stable distribution and the Tweedie distribution (Dunn 2015).

In spite of the aforementioned difficulties, it turns out that, for a fairly wide range of the variables α and z, accurate calculations are possible, at the price of using much higher precision and correspondingly much slower arithmetic. Unless stated otherwise, we use Python’s (Python Software Foundation 2015) mpmath module (http://mpmath.org/) to perform all calculations with 1 000-bit precision for the mantissa (compare with the 53-bit precision of usual double-precision arithmetic). Being interpreted, Python is intrinsically much slower than compiled languages, and the use of mpmath makes our calculations slower still. Yet for many practical purposes (for example, integrating fZ(z) with 1 000 points and Gaussian quadrature), our approach is entirely feasible on a desktop computer with the rules devised here. It also has the advantage that the algorithms (and the underlying mathematics) are simple and straightforward, and therefore easy to implement and use.
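For reference, switching mpmath to this working precision is a one-line configuration; a minimal sketch (variable names are illustrative):

```python
# Minimal sketch: switch mpmath to 1 000-bit significands (illustrative names).
import mpmath

mpmath.mp.prec = 1000          # bits in the significand (double precision has 53)
z = mpmath.mpf("0.0006")       # decimal string converted at 1 000-bit precision
print(mpmath.mp)               # shows the current precision settings
```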

In this work, we compare our computations of the Tweedie densities with the ones obtained using the tweedie package (Dunn 2015) for R (R Core Team 2015). There are four alternatives for computations using the tweedie package, depending on the approach desired:

Summing the series: dtweedie.series (Dunn and Smyth 2005).

Fourier series inversion: dtweedie.inversion (Dunn and Smyth 2008).

Using a stable distribution: dtweedie.stable (Nolan 1997); it uses R’s library stabledist (Wuertz et al. 2016).

Using a saddlepoint approximation to the density: dtweedie.saddle (Dunn and Smyth 2001).

A generic call to dtweedie uses either dtweedie.series or dtweedie.inversion, depending on the parameter values, whereas dtweedie.stable and dtweedie.saddle must be called explicitly. Note that in the present work we never use the dtweedie.saddle function.

In spite of the mathematically more sophisticated – and usually effective, as we will see – approaches of Dunn and Smyth (2008) and Nolan (1997), the nature of the problems involved in summing Eq. (3) remains not fully explored. Considerable insight, as well as useful practical guidance for the calculation of Tweedie probability density functions, can be gained by looking at them in more detail. Incidentally, this will also give us the opportunity to assess the implementations in the tweedie R package for calculating densities in some numerically challenging situations.

We have three objectives. First, we attempt to explain (empirically, by way of example) why it is in some cases so difficult to sum Eq. (3); often, it is in practice impossible to calculate it with double-precision (53-bit mantissa) arithmetic. As we shall see, the explanation involves difficulties in calculating individual terms of the series for large k when α is very close to 0 or to 1, and/or when z is small.

Then, we also show that some simple practical rules can be devised to avoid numerical errors or the need to sum too many terms in Eq. (3). It is fortunate that in many cases the numerical problems will arise for a range of z at which fZ(z) is exceedingly small. In such cases, instead of attempting to sum the series and keeping errors under control, it is much more effective to force the calculating function to return zero. As shown in Section Results using Python, this does not affect the accuracy of the calculation of definite integrals of fZ(z), and therefore of the probabilities. The criterion for returning zero in these cases is explained below in Section Practical rules and tests after Eq. (21).

Lastly, it will be possible to reassess the capabilities of the tweedie package, so that the practical rules devised here can also be useful for its users. The comparison with the high-precision approach using Python adopted in this work will be mutually beneficial for assessing both the tweedie R package and the nptweedie.py module developed for the present work.

In the next section we give a few examples of the numerical difficulties that arise in the summation of the series, without going into the details of how the terms are actually summed. This is left for Sections Absolute convergence, where we show that the series always converges (in an analytical sense), and Numerical implementation, where it is shown how a standard algorithm for the summation (with 1 000-bit precision) is enough in most cases. Then, with a “good enough” algorithm at hand, a practical rule is presented in Section Practical rules and tests to predict cases when, even with 1 000 bits, the sum will fail (in the numerical sense) to converge in less than 10000 terms. In this section, a first comparison with R’s tweedie package is also made.

As already mentioned, because failure of numerical convergence is associated with very small values of fZ(z), it is enough to return zero in such cases. This rule is then incorporated into a simple summation algorithm that is always successful (in this narrower sense) for 0.01 ≤ α ≤ 0.99. It is important to note that reliable computations of the pdf for the range of α considered here are important not only for problems characterized by parameter values in that range, but also when running algorithms for statistical inference, either based on optimization or on sampling methods such as Markov Chain Monte Carlo, which may make excursions outside those regions of the parameter space.

In Section Results using Python, several tests on the proposed algorithm are performed: the results are tested against an analytical closed form for fZ(z) when θ=−1/2, α=1/2, which corresponds to the inverse Gaussian distribution; and the accuracy of the integrals of fZ(z) is checked by numerical integration. We will show that the proposed algorithm always integrates to 1 (given enough integration points) within an error of $10^{-6}$ (at most). In this section we also go back to calculating Eq. (3) with double precision, obtaining additional insight into where and why standard double-precision implementations fail. Many of the tests performed in Section Results using Python are then run again in R, using the tweedie package, in Section Performance in comparison with R’s tweedie package: this provides an objective assessment of current R implementations for situations where straightforward summation is very difficult. Our conclusions are given in the final section.

Termwise behavior and the difficulty of computing N(z;α)

Let

$$B=\left|\kappa(1/z,\alpha)\right|; \qquad (6)$$
it is worth writing
$$b_k=\frac{\Gamma(1+\alpha k)}{\Gamma(1+k)}\,B^{k}, \qquad (7)$$
$$c_k=(-1)^{k}\,b_k, \qquad (8)$$
$$d_k=c_k\,\sin(-k\pi\alpha), \qquad (9)$$
and the corresponding series
$$S_b(k)=\sum_{j=1}^{k}b_j, \qquad (10)$$
$$S_c(k)=\sum_{j=1}^{k}c_j, \qquad (11)$$
$$S_d(k)=\sum_{j=1}^{k}d_j. \qquad (12)$$
While the series that we are ultimately interested in is $N(z;\alpha)=S_d(\infty)/(\pi z)$, considerable insight about the problems that plague the summation will be obtained by looking at the bk terms. Using these auxiliary series will also help to establish analytically the convergence of Sd(k) in the next section.

An upper bound for the largest term, $\max_k b_k$, can be found as follows (Dunn and Smyth 2005), by maximizing ln bk over k with the help of Stirling’s approximation:

$$k_{\max}=\frac{z^{2-p}}{p-2}, \qquad (13)$$
$$b_{k_{\max}}=\exp\left[(1-\alpha)\,k_{\max}+\tfrac{1}{2}\ln(\alpha)\right]. \qquad (14)$$
In the following, we show all plots normalized by the value of $b_{k_{\max}}$.
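For instance, Eqs. (13)–(14) can be evaluated directly once p is recovered from α by inverting Eq. (5); a minimal sketch (illustrative names; the logarithm of $b_{k_{\max}}$ is returned because the value itself can overflow a double):

```python
# Minimal sketch: evaluate Eqs. (13)-(14) for given z and alpha (illustrative names).
from math import log

def kmax_and_log_bkmax(z, alpha):
    p = (2.0 - alpha) / (1.0 - alpha)                      # invert Eq. (5)
    kmax = z**(2.0 - p) / (p - 2.0)                        # Eq. (13)
    log_bkmax = (1.0 - alpha) * kmax + 0.5 * log(alpha)    # Eq. (14), kept in log form
    return kmax, log_bkmax                                 # exp(log_bkmax) may overflow

print(kmax_and_log_bkmax(0.002, 0.5))                      # case (c) of Figure 1
```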

We now discuss the problems involved with the summation of Eq. (3) by means of 3 examples. In this section, we do not correct those problems yet, and the discussion that follows has the sole objective of shedding light on the nature of the numerical problems discovered by Dunn and Smyth (2005). We proceed to sum the series almost straightforwardly, except that we use 1 000-bit precision arithmetic, and avoid loss of precision due to large variations in the order of magnitude of the terms in the sum. The numerical convergence criterion used is

$$\epsilon=\frac{b_k}{|S_d(k)|}\le 10^{-10}. \qquad (15)$$

Figures 1(a) - 1(c) show the behavior of bk, Sd(k), and ϵ. The parameters chosen for this example are α=+1/2, θ=−1/2, because then the Tweedie coincides with an inverse Gaussian distribution, the only case in the range 0 < α < 1 for which an alternative analytical expression for fZ(z) is available. Figures 1(a) - 1(c) correspond to the cases (a) z=0.0006, (b) z=0.0008 and (c) z=0.0020.

Figure 1
Individual terms bk/bkmax (solid line), sum Sd(k)/bkmax of the N(z;α) series (filled circles), and relative error ϵ (dashed line) with α=1/2, θ=−1/2. (a) z=0.0006, (b) z=0.0008 and (c) z=0.0020. Only (c) converged to the correct value with less than 10 000 sums.

The first striking feature in the figures is the range covered by the bk’s (shown as a solid line): in excess of $10^{300}$ in the first two cases, and $2.82\times 10^{216}$ in the last one. Indeed, the range is so wide that the plotting program used to produce the figures, which itself uses double-precision arithmetic, is unable to plot all the values calculated, thereby forcing the graphs to be clipped at the lower $10^{-300}$ end.

To prevent the summation of terms varying across this very wide range of orders of magnitude from suffering the “catastrophic loss of precision” mentioned by Malcolm (1971), we implemented an adaptation of the msum function found in http://code.activestate.com/recipes/393090/ (Shewchuk 1996), which is designed to deal with this kind of problem. The adaptation, in the form of 3 functions in the module nsum (written by us), is presented in the Appendix. Therefore, in principle, the sum in Eq. (12) is being carried out without loss of precision. Yet, in cases (a) and (b), correct numerical convergence was not achieved for k ≤ 10 000 (although this is not readily apparent in the figures themselves): later on, we will find more evidence that loss of precision is not the critical issue with summing Eq. (3). Rather, the lack of convergence to the right value can happen for the following reasons:

  1. Although the sum itself is carried out without loss of precision, apparently inaccurate evaluation of the terms dk for large values of k leads the algorithm to “converge” to a wrong value.

  2. The algorithm reaches the stopping criterion k=10000 without achieving the convergence criterion specified in Eq. (15).

Case (a) is perhaps the worst of all: the algorithm’s convergence criterion given by Eq. (15) is satisfied, but the value obtained for the density is wrong and positive (reason 1 above). In this particular case, because we have the alternative of calculating the density with the inverse Gaussian formula, this is easily verifiable; for α ≠ 1/2, however, there seems to be no way to check within the algorithm itself whether reason 1 is happening. In the case of Figure 1(a), the algorithm (employing 1 000 bits of precision!) calculated a density of $2.735443272\times 10^{+68}$, whereas the correct value given by the inverse Gaussian formula is $9.03190022\times 10^{-358}$.

Case (b)’s failure is another instance of reason 1: here, the algorithm found a negative value of Sd. This case is somewhat easier to handle, because the user can verify the implausibility of a negative density with a simple if clause. Note that the negative values of Sd(k)/bkmax to which the algorithm converges do not appear in the log-log plot.

In short, neither case (a) nor case (b) failed to converge because the stopping criterion k=10000 was reached: they either converged to a wrong positive value (a) or to a wrong negative value (b), in spite of being calculated with 1000 bits of precision and the use of a summation algorithm specifically designed to avoid loss of precision.

Finally, case (c) is successful. The range of the bk terms is somewhat smaller (albeit still daunting). Convergence to the right value of $3.2329931463\times 10^{-105}$, confirmed by the inverse Gaussian formula, is only achieved at k=1826.

In all cases, the relative error (shown as a dashed line, with the corresponding axis scale on the right) takes a long time to reach the convergence criterion $\epsilon\le 10^{-10}$. Close to spurious (cases (a) and (b)) or correct (case (c)) convergence, ϵ plunges down abruptly, there being no qualitative difference between the three cases that could help to devise an identification pattern within the algorithm for spurious convergence.

Finally, Figure 1 shows a last disconcerting feature of Eq. (3): because of the alternating character of Eq. (11), compounded by the sin(kπα) term in Eq. (12), successive values of dk tend to almost cancel out each other (in order of magnitude), the residue being carried over to the next term. For this reason, the order of magnitude of Sd(k) (given by the points in the figure) and of bk (and therefore of ck and dk as well) remains the same for a very long time, while the magnitude of bk varies non-monotonically with k over a very wide range of values (the reader should be aware that the points and the solid line in Figure 1 are not exactly equal, the visual impression being an artifact of the very wide range in the plot).

All that adds to the difficulty of summing the series, casting serious doubts on the feasibility of a direct attack, as found out by Dunn and Smyth (2005) and Dunn and Smyth (2008).

If one accepts the use of very large floating-point precision as a useful, and perhaps unavoidable, tool, however, there is hope in practical situations, although definitely not for all possible combinations of z and α. In the next section, we discuss the analytical convergence of the series. In Section Numerical implementation, we give the details of how the series summations, in this section, and elsewhere in the manuscript, were implemented. In Section Practical rules and tests, we give a practical procedure to sum the series using 1 000-bit precision that avoids the errors observed, for example, in cases (a) and (b) of this section. The procedure is able to predict when the density fZ(z) will be very close to zero, and in this case will return zero instead of attempting the sum. Unavoidably, this results in a relative error of 100% for the null estimate of fZ(z), but has no practical consequences insofar as integration of the density and calculation of probabilities is concerned.

Absolute convergence

The series Sd(k) is absolutely convergent. The proof is based on two theorems:

  1. If a series is absolutely convergent, then it is convergent (Dettman (1984), Theorem 4.3.4). We shall show that Sb(k) converges. Therefore, Sc(k) is absolutely convergent (bk=|ck|), and Sc(k) converges.

  2. If a series is bounded term-by-term by a convergent series of positive terms, the series is absolutely convergent (Dettman (1984), Theorem 4.3.12). Since |dk| ≤ bk and the latter is convergent, Sd(k) is absolutely convergent, and therefore it is convergent.

To prove that Sb(k) is convergent, we use the ratio test:

$$b_k=\frac{B^{k}\,\Gamma(1+\alpha k)}{k!},\qquad
\frac{b_{k+1}}{b_k}=\frac{B^{k+1}}{(k+1)!}\,\frac{k!}{B^{k}}\,\frac{\Gamma(1+\alpha(k+1))}{\Gamma(1+\alpha k)}
=\frac{B}{k+1}\,\frac{\Gamma(1+\alpha(k+1))}{\Gamma(1+\alpha k)}.$$
Using Stirling’s approximation for large x,
$$\Gamma(x+1)\approx\sqrt{2\pi x}\,\left(\frac{x}{e}\right)^{x},$$
we get
$$\lim_{k\to\infty}\frac{b_{k+1}}{b_k}
=\lim_{k\to\infty}\frac{B}{k+1}\,\sqrt{\frac{2\pi\alpha(k+1)}{2\pi\alpha k}}\,
\frac{\left(\alpha(k+1)/e\right)^{\alpha(k+1)}}{\left(\alpha k/e\right)^{\alpha k}}
=\lim_{k\to\infty}\frac{B}{k+1}\,e^{-\alpha}\,\sqrt{\frac{\alpha k+\alpha}{\alpha k}}\times
\frac{\left[\alpha(k+1)\right]^{\alpha(k+1)}}{\left(\alpha k\right)^{\alpha k}}.$$
Note that the square-root term has limit 1 as k → ∞. Also,
$$\frac{\left[\alpha k+\alpha\right]^{\alpha k+\alpha}}{\left[\alpha k\right]^{\alpha k}}
=\left[\frac{\alpha k+\alpha}{\alpha k}\right]^{\alpha k}\left[\alpha k+\alpha\right]^{\alpha}
=\left[1+\frac{1}{k}\right]^{\alpha k}\left[\alpha k+\alpha\right]^{\alpha}
\longrightarrow e^{\alpha}\left[\alpha k+\alpha\right]^{\alpha}.$$
We obtain, finally:
$$\lim_{k\to\infty}\frac{b_{k+1}}{b_k}
=\frac{B}{k+1}\,e^{-\alpha}\,e^{\alpha}\,\alpha^{\alpha}\,(k+1)^{\alpha}
=B\,\alpha^{\alpha}\,(k+1)^{\alpha-1}\longrightarrow 0, \qquad (16)$$
for 0 ≤ α < 1. This means that for any α in this range, and for any finite B, the series Sb(k) is convergent. From the two theorems above, therefore, both Sc(k) and Sd(k) are absolutely convergent.
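The limit in Eq. (16) is also easily checked numerically; a minimal sketch with mpmath (the values of B, α and the indices k are illustrative):

```python
# Minimal sketch: compare b_{k+1}/b_k with B * alpha**alpha * (k+1)**(alpha-1), Eq. (16).
import mpmath

mpmath.mp.prec = 1000

def bk(k, B, alpha):
    # Eq. (7) in log form, as in Eq. (17)
    return mpmath.exp(mpmath.loggamma(1 + alpha * k) + k * mpmath.log(B)
                      - mpmath.loggamma(1 + k))

B, alpha = mpmath.mpf(30), mpmath.mpf("0.5")     # illustrative values
for k in (10, 100, 1000, 10000):
    ratio = bk(k + 1, B, alpha) / bk(k, B, alpha)
    limit = B * alpha**alpha * mpmath.mpf(k + 1)**(alpha - 1)
    print(k, mpmath.nstr(ratio, 8), mpmath.nstr(limit, 8))
```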

Numerical implementation

We use a very simple scheme to calculate Sd(k). The bk terms are calculated with

$$b_k=\exp\left[\ln\!\left(\frac{\Gamma(1+\alpha k)}{\Gamma(1+k)}\,B^{k}\right)\right]
=\exp\left[\ln\Gamma(1+\alpha k)+k\ln(B)-\ln\Gamma(1+k)\right] \qquad (17)$$
to take advantage of the function loggamma in mpmath. Then ck and dk are calculated straightforwardly with Eqs. (8)–(9). As mentioned above, the sum of the dk terms is performed with the help of a modification of the publicly available function msum: see module nsum in the Appendix.

The summation stops when the criterion given in Eq. (15) is reached. If Sd(k) is found to be negative, k is incremented and the algorithm continues. Usually, this forces the algorithm to reach k=10000, at which point it flags non-convergence.
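Schematically, the scheme just described can be written as follows (a minimal sketch, not the listing of the Appendix; it assumes the nsum module described there is importable, and the sign convention follows Eqs. (8)–(9)):

```python
# Minimal sketch of the scheme described above (Eqs. (8), (9), (15), (17)),
# using the nsum module of the Appendix for the accumulation.
import mpmath
from nsum import startsum, sumthis, sumall

mpmath.mp.prec = 1000

def sum_series(B, alpha, kstop=10000, tol=mpmath.mpf("1e-10")):
    B, alpha = mpmath.mpf(B), mpmath.mpf(alpha)
    startsum('sumd')
    for k in range(1, kstop + 1):
        bk = mpmath.exp(mpmath.loggamma(1 + alpha * k) + k * mpmath.log(B)
                        - mpmath.loggamma(1 + k))               # Eq. (17)
        dk = (-1)**k * bk * mpmath.sin(-k * mpmath.pi * alpha)  # Eqs. (8)-(9)
        sumthis(dk, 'sumd')
        sumd = sumall('sumd')
        if sumd > 0 and bk / sumd <= tol:                       # Eq. (15)
            return sumd, k
    return None, kstop                                          # flag non-convergence
```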

Practical rules and tests

As mentioned in the Introduction, the algorithm as is will often fail to converge to the right numerical value with less than 10 000 terms in the sum. In order to make it more robust, and to curtail unnecessary calculations when fZ(z) is actually very small, we observe that B defined in Eq. (6) is a good predictor of the “difficulty” of the summation. This observation is found in the caption of Table I of Dunn and Smyth (2008) about a variable they call ξ, which is given by (in our notation, and setting σ=1)

$$\xi=\frac{1}{z^{2-p}}. \qquad (18)$$
For comparison, our
$$B=\left|\frac{1}{2-p}\left(\frac{p-1}{z}\right)^{\frac{2-p}{1-p}}\right| \qquad (19)$$
is not equal to ξ, but it is related to it.

The role of B in the rate of convergence is seen in Eq. (16). For a specified (desired) rate of convergence η after n terms, one has

$$B\,\alpha^{\alpha}\,(n+1)^{\alpha-1}\le\eta,\qquad\frac{1}{B}>\Phi(\alpha)\equiv\frac{\alpha^{\alpha}\,(n+1)^{\alpha-1}}{\eta}. \qquad (20)$$
If n and η are fixed, Eq. (20) provides a practical criterion to predict that the algorithm will be successful: this will happen whenever 1/B is larger than Φ(α). This of course may be over-optimistic: B is also a function of z, and there is no a priori guarantee that n and, most importantly, η, are independent of z and of α.
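For illustration, with the stopping index n = 10 000 used throughout and an illustrative value of η (an assumption here, not a calibrated value), the criterion of Eq. (20) can be coded as:

```python
# Minimal sketch of the criterion of Eq. (20): predict success whenever 1/B > Phi(alpha).
# n = 10000 is the stopping index used in the text; eta = 1e-10 is an illustrative choice.
def Phi(alpha, n=10000, eta=1e-10):
    return alpha**alpha * (n + 1)**(alpha - 1.0) / eta

def predicted_to_converge(B, alpha):
    return 1.0 / B > Phi(alpha)
```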

On the bright side, Eq. (20) is independent of θ: only α appears in it. The idea therefore is to vary α over its allowed range of 0 < α < 1, and for each α to vary z from below until the algorithm starts to converge to the “correct” value of fZ(z). For a single z it may be difficult to identify if the convergence is to the true value, but a synoptic view of several values of z in a graph is quite enough to identify the correct values.

For a fixed α, therefore, it is possible to determine, by trial and error, the values of z and B at which the algorithm converges numerically with less than 10 000 terms.

To pursue this idea, we (manually) searched for the z-range where the algorithm converges using the values α=0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 0.99. For α < 0.01 or α > 0.99, manually trying to identify a range of z’s where the algorithm converges proved unsuccessful. We consider that the interval α ∈ [0.01, 0.99] is wide enough in practice; therefore, if α falls outside of it, we simply do not attempt to sum the series. The value of θ=−1/2 (the same as for the inverse Gaussian) was used.

The graphs obtained with 6 of the 11 values of α above are shown in Figure 2. Each sub-figure shows the calculated densities fZ(z) (with 1 000-bit precision) as well as 1/B for a range of z values around which the algorithm starts to be successful. Non-convergence in 10 000 iterations (or convergence to a negative value) is flagged as $10^{+150}$. Also shown is the value of the density calculated by the tweedie package with the “generic” call to dtweedie (see the comment in the Introduction about how it chooses which method to employ). In contrast to the 1 000-bit calculation of fZ(z) by series summation, usually dtweedie does not suffer from spurious values at the left tail of the distribution, but for α=0.9 it clearly does not return correct values there, and in fact high-precision series summation fares a little better (Figure 2, lower right panel). We observed the same problem for our maximum “acceptable” α, 0.99 (not shown). We did not try to determine a more precise value of α, above 0.8, at which dtweedie starts to give incorrect values at the left tail.

Wrong convergence to positive values simply shows as discernible spurious values. Similar behavior is observed for the other values of α mentioned above. Note the monotonically increasing 1/B in all cases.

Figure 2
Transition from unsuccessful to successful calculation of Tweedie densities, for α = 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9. On the left axis scale, the solid line shows the density calculated from summing the series with 1 000-bit precision, and the grey circles are the densities calculated by dtweedie in R. On the right axis scale, the dashed line is 1/B. The arrows indicate the first point at which the series summation is considered “successful”. In all cases, θ=−1/2.

In Figure 2, the very small values of fZ(z) for the first successful value of z are noteworthy. Table I lists α, z, fZ(z) and 1/B for the first successful instance of the algorithm (subjectively decided) in each case.

TABLE I
Early values, in the direction of increasing z, of α, z, fZ(z) and 1/B for which the Tweedie densities are calculated correctly.

With the values in Table I, it is possible to plot 1/B versus α, in Figure 3. We can now draw a smooth curve through the data points. This can be done by adjusting a suitable equation using (in our case) the non-linear Levenberg-Marquardt least-squares procedure built into the plotting program (Gnuplot: www.gnuplot.info). We used a weighted version, giving weight 1 to all the ten lowest values of α, and weight 100 to the largest (0.99). This is necessary because, as α gets closer to 1, the pdf becomes more and more peaked close to zero, and small changes in z and in 1/B make all the difference between numerical non-convergence and convergence. Thus, an accurate value of the minimal 1/B is much more critical for α=0.99.

Our first try was to use Eq. (20) and to adjust n and η as parameters of the least-squares fit. Although Eq. (20) is (encouragingly) able to capture the overall α-dependency, the result, shown in Figure 3 as a dashed line, is relatively poor. Therefore, we changed to a purely empirical equation with 3 instead of 2 adjustable parameters, viz.

$$\Psi(\alpha)=\exp\left[a+b\,\alpha^{c}\right]. \qquad (21)$$
This is shown as a solid line in Figure 3. Not surprisingly, the fit is better. The adjusted parameters are directly coded in module Psialpha.py, given in the Appendix. It is very simple, and just defines the values of a, b and c in Eq. (21) above.

Figure 3
Early values of 1/B for which the Tweedie densities are calculated correctly, in the direction of increasing z, for each α, and adjusted curves Φ(α) (dashed line) and Ψ(α) (solid line).

With Ψ(α) thus obtained, there are only two changes that need to be made to the algorithm of calculating Tweedie densities:

  1. Do not allow α < 0.01 or α > 0.99; in these cases we do not attempt to sum the series, and the calculating function aborts. As discussed above, the range [0.01, 0.99] is probably wide enough for most applications.

  2. If 1/B ≤ Ψ(α), return 0; otherwise, sum using 1 000-bit precision.

The full implementation is given in the Appendix. The probability density function is called pdfz_tweedie. It is designed to receive and to return standard floating-point variables: all the multiple-precision work is done internally and kept hidden from the user. In this way, he or she can use the ordinary Python float type transparently while using the module nptweedie.
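A typical call then looks as follows (argument values are illustrative; note the negative θ):

```python
# Illustrative use of the interface described above: plain floats in, plain float out.
from nptweedie import pdfz_tweedie

print(pdfz_tweedie(1.0, -0.5, 0.5))      # z, theta, alpha (inverse Gaussian case)
print(pdfz_tweedie(0.0006, -0.5, 0.5))   # deep left tail; expected to return 0 by rule 2
```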

Results using Python

A first comparison is to check the proposed algorithm against the densities calculated with the inverse Gaussian distribution. This is only possible for α=1/2, θ=−1/2, and is done visually in Figure 4. The agreement is actually very good: for 1 000 points evenly distributed between 0 and 20, the two methods agree exactly up to the eighth decimal place.

Figure 4
Comparison between the proposed algorithm and the inverse Gaussian distribution, using α=1/2, θ=−1/2.

For cases when an analytical benchmark is not available, a good alternative is to check (numerically) how close the integral $I=\int_{0}^{\infty}f_Z(z)\,\mathrm{d}z$ is to 1.

At this point, one also raises the question of whether the 1 000-bit precision in use is overkill: to what extent would double precision suffice? Moreover, how important is it to have sophisticated handling of possible loss of precision in the summation?

In order to answer these questions, we re-implemented the pdfz_tweedie function using double-precision only, while keeping most of the remainder of the algorithm intact. Two versions of the pure double-precision version were implemented: the first version uses the nsum module and does the sum without loss of precision (“no loss”); it is called here “dp-nl”; the other version does not use the nsum package, and performs all the sums with standard arithmetic only (“with loss”) ; it is called here “dp-wl”. For easy reference, we call the 1 000-bit precision version “np-nl”. Note that “dp-nl”, “dp-wl” and “np-nl” are the mnemonic names of the versions for summing the series, and not actual function names.

We now show the results of integrating the density numerically for these three versions, for all values of α considered above, keeping θ=−1/2 fixed. Except for α=0.99, all numerical integrals were calculated for a range $z\in[1.0\times 10^{-6},\,50]$ with 1 000 points using Gauss-Legendre quadrature (we used a translation to Python of the routine gauleg in Numerical Recipes (Press et al. 1992)). For α=0.99, the integration was for $z\in[1.0\times 10^{-6},\,20]$ with 10 000 points. The adjustment of range and number of points used is relatively easy to do, and somewhat unavoidable.
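A minimal sketch of this quadrature step, using numpy's Gauss–Legendre nodes in place of the gauleg translation mentioned above:

```python
# Minimal sketch: 1 000-point Gauss-Legendre quadrature of the density over [1e-6, 50],
# using numpy's leggauss in place of the gauleg translation mentioned in the text.
import numpy as np
from nptweedie import pdfz_tweedie      # module described in the Appendix

def integrate_pdf(theta, alpha, a=1.0e-6, b=50.0, npts=1000):
    x, w = np.polynomial.legendre.leggauss(npts)   # nodes and weights on [-1, 1]
    z = 0.5 * (b - a) * x + 0.5 * (b + a)          # map nodes to [a, b]
    f = np.array([pdfz_tweedie(zi, theta, alpha) for zi in z])
    return 0.5 * (b - a) * np.sum(w * f)

I = integrate_pdf(-0.5, 0.5)
print(I, abs(I - 1.0))
```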

Results are shown in Table II: for each α, the rows show the number of points used in the quadrature, and the value of the numerical integral I and its absolute difference from 1, δI, for each of “np-nl”, “dp-nl” and “dp-wl”.

TABLE II
Numerical integration of Tweedie densities using Gaussian quadrature for versions np-nl, dp-nl and dp-wl of bit precision / summation method: $I=\int f_Z(z)\,\mathrm{d}z$ (the integral calculated with each version) and $\delta I=|I-1|$, with θ = −1/2.

Clearly, with 1 000-bit precision, we are able to calculate very accurate integrals and, consequently, probabilities, for all values of α in the range.

The double precision versions, on the other hand, are at first sight disappointing. Except for α=0.5 and α=0.99, these versions are unable to produce accurate values of I. On the other hand, they fail or succeed simultaneously and produce essentially the same results. This is an indication that “catastrophic loss of precision” is not an issue, because dp-nl avoids loss of precision, while dp-wl does not. In hindsight, this is probably because the order of magnitude of the bk terms varies slowly with k, so that each new term added is close, in order of magnitude, to the previous one and to the partial sum as well (cf. Figure 1).

Moreover, in practice the double precision versions fare better than suggested by Table II. This can be verified in Figure 5, where we show the pointwise numerical convergence (or not) of the double precision version “dp-nl” (gray dots) against the 1 000-bit precision version “np-nl” (solid black lines). The behavior of “dp-wl” is essentially the same as that of “dp-nl”, and is not shown. In the figure, non-convergence of the sum for 10 000 terms (or spurious convergence for values greater than 10) is flagged as fZ(z)=10 for easy identification. Negative spurious values are simply not shown on the log-log plot.

Figure 5
Detailed verification of the convergence behavior of versions “np-nl” (continuous black line), and “dp-nl” (gray dots, with “dp-wl” being the same as “dp-nl”): (a), α=0.01; (b), α=0.2; (c), α=0.4; (d), α=0.64; (e), α=0.8 and (f), α=0.99.

One can see that the same problems that afflicted the 1 000-bit version are present, except that with more gravity, in the double-precision version. Thus, for small values of z, the simple algorithm used here fails, except that these z values are now greater than they were with the 1 000-bit version. Not surprisingly, the filter of Eq. (21), which was designed and calibrated with the 1 000-bit version, “lets through” a few points on the left tail of the distribution that lead to non-convergence.

Of course, one can re-calibrate the parameters in Eq. (21) to censor those points, but, especially for the smaller α’s, this may result in “chopping off” too much of the left tail (in the case α=0.01, the whole pdf is lost with double precision).

Performance in comparison with R’s tweedie package

We now repeat the analysis of the previous section except that, instead of running np-nl against its double-precision versions, we run it against two options available in R: dtweedie and dtweedie.stable. The results are shown in Table III. In this table, columns 3 and 4 are repeated from Table II, for ease of comparison. dtweedie.stable now fails for α=0.01 (we did not try to determine at exactly what value of α ≤ 0.1 it fails for the first time), whereas, at the other extreme, dtweedie fails for α=0.9 and α=0.99 (we did not try to determine at exactly what value of α > 0.8 it fails for the first time).

TABLE III
Numerical integration of Tweedie densities using Gaussian quadrature for np-nl (1 000-bit precision), dtweedie and dtweedie.stable: $I=\int f_Z(z)\,\mathrm{d}z$ (the integral calculated with each version) and $\delta I=|I-1|$, with θ = −1/2.

The same parameters of Gaussian quadrature previously employed to generate Table II were used. Figure 6 shows the corresponding densities for six cases: in the figure, failures are flagged at constant values equal to $1.0\times 10^{-6}$.

Figure 6
Detailed verification of the convergence behavior of 1 000-bit densities “np-nl” (continuous gray line), in comparison with dtweedie (open squares) and dtweedie.stable (filled circles): (a), α=0.01; (b), α=0.2; (c), α=0.4; (d), α=0.64; (e), α=0.8 and (f), α=0.99. In (a), dtweedie.stable values lower than $1.0\times 10^{-6}$ are shown at this value. In (e), dtweedie values lower than $1.0\times 10^{-6}$ are shown at this value.

There is always at least one function in package tweedie that performs as well, in practice, as our high-precision implementation in Python, but the choice must be made by the user. In its interval of validity, 0.01 ≤ α ≤ 0.99, nptweedie always succeeds and calculates the integral of the density with an accuracy better than $1\times 10^{-6}$.

Conclusions

In this work we have investigated empirically a few of the problems that afflict the calculation of Tweedie probability density functions for the range 0 < α < 1 of one of its parameters. The α parameter is identified as the one whose values affect the summing of the corresponding series, the θ parameter playing no role in the problem. We have also shown that the series converges in the analytical sense, in spite of the occurrence of severe numerical problems in its summation for some values of z and α.

The terms in the series span a very large range of orders of magnitude, close to that allowed by double precision. This wide range of values, however, turns out not to lead to catastrophic loss of precision, likely because the order of magnitude varies slowly in the summation (nearby terms have close to the same orders of magnitude).

Still, for very large indices k in the summation, the calculation of the individual terms remains inaccurate, leading in some cases either to the inability to reach a stable value even after 10 000 terms, or to convergence to spurious values. A key parameter that allows this event to be predicted, here called 1/B, was identified and used to construct an ad-hoc procedure to avoid it.

We have visually identified the thresholds in z and the corresponding 1/B values as a function of α. For a practical range of α ∈ [0.01, 0.99], this allows the Tweedie pdfs to be calculated successfully using 1 000-bit precision.

Attempts to use only double precision fail to various degrees, with the worst cases occurring for α close to the lower limit above (and vice-versa). In specific cases, the procedures adopted here may be adapted for double precision, if one is willing to accept a shorter range for α than [0.01, 0.99]. One of the three functions available in R’s tweedie package that we tested is always able to deal with the numerical problems satisfactorily, but the specific function varies, and it may need to be selected explicitly by the user: the results presented here may be useful in this regard.

ACKNOWLEDGMENTS

The authors wish to thank two anonymous reviewers, whose comments and suggestions contributed substantially to improving the manuscript.

REFERENCES

  • DETTMAN JW. 1984. Applied Complex Variables. New York: Dover Publications.
  • DUNN PK. 2015. Package ‘tweedie’ (Tweedie exponential family models), version 2.2.1. URL http://www.r-project.org/package=tweedie.
  • DUNN PK AND SMYTH GK. 2001. Tweedie family densities: methods of evaluation. In: Proceedings of the 16th International Workshop on Statistical Modelling, Odense, Denmark.
  • DUNN PK AND SMYTH GK. 2005. Series evaluation of Tweedie exponential dispersion model densities. Stat Comp 15(4): 267-280.
  • DUNN PK AND SMYTH GK. 2008. Evaluation of Tweedie exponential dispersion model densities by Fourier inversion. Stat Comp 18(1): 73-86.
  • JØRGENSEN B. 1987. Exponential dispersion models. J Roy Stat Soc B Met 49(2): 127-162.
  • JØRGENSEN B. 1992. The theory of exponential dispersion models and analysis of deviance. 2nd edition. Rio de Janeiro: Instituto de Matemática Pura e Aplicada.
  • MALCOLM MA. 1971. On accurate floating-point summation. Commun ACM 14(11): 731-736.
  • NOLAN JP. 1997. Numerical calculation of stable densities and distribution functions. Comm Statist Stochastic Models 13(4): 759-774.
  • PRESS WH, TEUKOLSKY SA, VETTERLING WT AND FLANNERY BP. 1992. Numerical Recipes in C: The Art of Scientific Computing. 2nd edition. New York, NY, USA: Cambridge University Press.
  • PYTHON SOFTWARE FOUNDATION. 2015. Python Language Reference, version 3.5. URL https://docs.python.org/3/reference/index.html.
  • R CORE TEAM. 2015. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. URL https://www.R-project.org/.
  • SHEWCHUK JR. 1996. Adaptive precision floating-point arithmetic and fast robust geometric predicates. Technical Report CMU-CS-96-140, School of Computer Science, Carnegie Mellon University, Pittsburgh PA 15213.
  • TAYLOR LR. 1961. Aggregation, variance and the mean. Nature 189: 732-735.
  • WUERTZ D, MAECHLER M AND RMETRICS CORE TEAM MEMBERS. 2016. stabledist: Stable distribution functions. R package version 0.7-1. URL https://CRAN.R-project.org/package=stabledist.

Appendix Python implementation

For completeness, we give here the full Python implementation of 3 modules (nsum, Psialpha and nptweedie). The following observations may be helpful for non-Python programmers:

  • In Python, there is no end, endif, end do, etc., statement. The scope of if statements or loops is defined by indentation. To close long stretches of if statements or loops, we use the Python keyword pass, which does nothing.

  • Python modules are libraries, and functions in a module can be imported into other programs, in a fashion similar to MODULA-2, ADA and R.

  • Although Python functions are much more flexible, the functions in the present modules are similar to C functions (returning a single value, or none) and they can be used on the right side of an assignment in exactly the same way.

  • Calling programs do not need to manage the inner workings of the modules. In particular, there is no need to understand what a dictionary is in nsum (just use the recipe) or multiple precision in nptweedie (just use ordinary floating point; all conversions necessary are performed on the input and output values).

nsum.py

The adaptation of the function msum found in http://code.activestate.com/recipes/393090/, which is what is actually used to perform the sums, consists of a small Python module nsum with three functions: startsum, sumthis, and sumall. startsum creates a new entry in an internal dictionary called partials. A Python dictionary is like an associative array: instead of an integer, it is indexed by an entry; in our case, the name of the sum in a string. This allows the flexibility of keeping track of several partial sums in parallel. Although this feature was not actually used in our implementation for summing the series (see module nptweedie in this appendix), it is useful, for example, for debugging.

Although not necessary, we adopt the convention of using strings as dictionary keys with the same name as the floating-point variable that ultimately stores the sum.

It is easy to understand how to use the functions in nsum.py, and this is all that is needed to understand how sums are calculated in the implementation nptweedie.py given below: startsum('sumd') starts a sum that will ultimately be stored in variable sumd; sumthis(dk,'sumd') adds a new term dk, without actually calculating the sum, and sumd = sumall('sumd') calculates the partial sum. Note that, because the Python dictionary is internal to nsum, knowledge of how Python dictionaries work, although useful, is not strictly necessary.
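In outline, the three functions can be written along the following lines (a minimal sketch in the spirit of the msum recipe; the module given in the listing may differ in detail):

```python
# nsum.py -- minimal sketch of startsum/sumthis/sumall in the spirit of the msum recipe
# (Shewchuk-style error-free accumulation of partial sums, keyed by a name string).
_partials = {}

def startsum(name):
    """Create (or reset) the list of partials for the named sum."""
    _partials[name] = []

def sumthis(x, name):
    """Fold a new term into the named sum without losing low-order bits."""
    partials = _partials[name]
    i = 0
    for y in partials:
        if abs(x) < abs(y):
            x, y = y, x
        hi = x + y
        lo = y - (hi - x)        # rounding error of the addition, recovered exactly
        if lo:
            partials[i] = lo
            i += 1
        x = hi
    partials[i:] = [x]

def sumall(name):
    """Return the current value of the named sum."""
    return sum(_partials[name], 0.0)
```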

Here is the full implementation of nsum.py:

Psialpha.py

nptweedie.py

This module implements the summation of the Tweedie series using 1 000-bit precision. The most useful function in the module is pdfz_tweedie(z,theta,alpha); its use is self-explanatory.
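In outline, pdfz_tweedie can be organized as follows (a minimal sketch consistent with Eqs. (2)–(4), (15), (17), (21) and the two rules of Section Practical rules and tests, not the published listing; the constants of Ψ(α) below are placeholders that effectively disable the 1/B filter, not the fitted values):

```python
# nptweedie.py -- minimal sketch of pdfz_tweedie (NOT the published listing).
# A_PSI, B_PSI, C_PSI stand in for the fitted a, b, c of Eq. (21); the values
# below effectively disable the 1/B filter of rule 2.
import mpmath

mpmath.mp.prec = 1000                      # 1 000-bit significands

A_PSI, B_PSI, C_PSI = -50.0, 0.0, 1.0      # placeholders, not the fitted values

def Psi(alpha):
    """Eq. (21): threshold on 1/B below which the density is returned as zero."""
    return mpmath.exp(A_PSI + B_PSI * alpha**C_PSI)

def pdfz_tweedie(z, theta, alpha, kstop=10000, tol=mpmath.mpf("1e-10")):
    """Tweedie density f_Z(z; theta, alpha), Eqs. (2)-(4), for theta <= 0."""
    if alpha < 0.01 or alpha > 0.99:       # rule 1: abort outside [0.01, 0.99]
        raise ValueError("alpha outside [0.01, 0.99]")
    z, theta, alpha = mpmath.mpf(z), mpmath.mpf(theta), mpmath.mpf(alpha)
    # Eq. (4) for theta <= 0, and Eq. (6) in modulus form
    kappa_theta = (alpha - 1) / alpha * (theta / (alpha - 1))**alpha
    B = abs((alpha - 1) / alpha) * abs(1 / (z * (alpha - 1)))**alpha
    if 1 / B <= Psi(alpha):                # rule 2: density is negligibly small
        return 0.0
    sumd = mpmath.mpf(0)
    for k in range(1, kstop + 1):
        bk = mpmath.exp(mpmath.loggamma(1 + alpha * k) + k * mpmath.log(B)
                        - mpmath.loggamma(1 + k))                  # Eq. (17)
        sumd += (-1)**k * bk * mpmath.sin(-k * mpmath.pi * alpha)  # Eqs. (8)-(9)
        if sumd > 0 and bk / sumd <= tol:                          # Eq. (15)
            fz = sumd / (mpmath.pi * z) * mpmath.exp(z * theta - kappa_theta)
            return float(fz)               # ordinary float returned to the caller
    raise ArithmeticError("series did not converge in %d terms" % kstop)
```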

Publication Dates

  • Publication in this collection
    02 Dec 2019
  • Date of issue
    2019

History

  • Received
    20 Mar 2018
  • Accepted
    4 Dec 2018