Rational Approximations for the Modified Bessel Function of the First Kind – I0(x) for Computations with Double Precision

by Pavel Holoborodko on November 11, 2015

In this post we study the properties of rational approximations for the modified Bessel function of the first kind, I_0(x), commonly used to compute the function values in double precision for real arguments. We compare the actual accuracy of each method, discuss ways to improve it, and propose a new approximation which delivers the lowest relative error.

[Figure: BesselI functions of the 1st kind, n = 0, 1, 2, 3]

We consider the following rational approximations: Blair & Edwards[1], Numerical Recipes[2], SPECFUN 2.5[3], and Boost 1.59.0[4]. All of these libraries use approximations of the same form, proposed in the report of Blair & Edwards[1]:

(1)    \begin{alignat*}{2} I_0(x) &\approx P_n(x^2)/Q_m(x^2), &\qquad |x| &\le 15.0 \\ x^{1/2} e^{-x} I_0(x) &\approx P_r(1/x)/Q_s(1/x), &\qquad x &\ge 15.0 \end{alignat*}

where P_*(x) and Q_*(x) are polynomials of the corresponding degrees.


Boost and SPECFUN use modifications of the original pair of polynomials provided in Tables 19 & 51 of the Blair & Edwards report. Numerical Recipes relies on a different approximation of lower degree, but claims it to be suitable for double precision.

All of the approximations are capable of providing relative error down to 10^{-23} when evaluated in extended precision, free of rounding and cancellation effects.

However, in practical situations, when the approximations are evaluated in double precision, such a level of accuracy cannot be reached. In general, when we deal with limited-precision floating-point arithmetic, there is no point in pushing the theoretical accuracy of an approximation beyond the precision we have at hand. Instead, the conditioning of the approximation (its robustness to cancellation and rounding errors) plays the most important role.

Below we present the actual accuracy of each approximation in terms of the machine epsilon, eps = 2^{-52} ≈ 2.22e-16.

All approximations are evaluated by the same C++ code using the Horner scheme (without FMA) and compiled by the Intel Compiler with the /fp:strict switch to avoid the compiler's speculative optimizations for floating-point operations. Relative accuracy is computed against true values of I_0(x) evaluated in quadruple precision by Maple.
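
A minimal sketch of such an evaluator (an illustration, not the exact benchmarking code used here): coefficients are stored in ascending-degree order, as in the tables below, and each step is a plain multiply followed by an add, so no FMA contraction occurs under /fp:strict.

    #include <cstddef>

    // Horner scheme: evaluates c[0] + c[1]*x + ... + c[n]*x^n.
    double horner(const double* c, std::size_t n, double x)
    {
        double r = c[n];
        for (std::size_t i = n; i-- > 0; )
            r = r * x + c[i];   // one rounding per multiply and per add
        return r;
    }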

Blair & Edwards[1]


Numerical Recipes, 3rd edition, 2007[2]


SPECFUN 2.5[3]


Boost 1.59.0[4]

GNU GSL 2.0[5]
(The GNU GSL library uses a Chebyshev expansion)

NAG Toolbox for MATLAB[7]
(Most likely the NAG library uses a Chebyshev expansion)

The NAG routine suffers serious accuracy degradation for x > 12. Accuracy plot for that range:

MATLAB 2015a[6]
(The shape of the scatter plot of relative error indicates that MATLAB most probably uses a Chebyshev expansion)

The red lines on the plots show the limits of approximation error we would naturally expect from double-precision computations.

Peak relative error in terms of machine epsilon – I_0

Method/Software                   x ∈ [0, 15)    x ∈ [15, ∞)
Minimax Rational Approximations:
  Blair & Edwards                     4.67           2.68
  Numerical Recipes, 2007             5.57           3.91
  SPECFUN 2.5                         4.67           7.02
  Boost 1.59.0                        5.78           2.68
  Refined – (3), (4)                  2.38           0.72
Chebyshev Expansions:
  GNU GSL 2.0                         2.37           3.25
  NAG Mark 24                         5.49         256.6
  MATLAB 2015a                        3.19           5.43

computed over 50K uniformly distributed random samples in each interval
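
For illustration, the measurement loop can be sketched as follows. This is not the article's harness: the reference values here come from the long double overload of C++17's std::cyl_bessel_i rather than from Maple's quadruple-precision values, so it only demonstrates the procedure, not the exact numbers in the table.

    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <random>

    int main()
    {
        std::mt19937_64 gen(42);
        std::uniform_real_distribution<double> dist(0.0, 15.0);  // first interval
        const double eps = std::ldexp(1.0, -52);                 // 2^-52
        double peak = 0.0;
        for (int i = 0; i < 50000; ++i) {
            double x = dist(gen);
            double approx = std::cyl_bessel_i(0.0, x);           // routine under test
            long double ref = std::cyl_bessel_il(0.0L, static_cast<long double>(x));
            double relerr = static_cast<double>(std::fabs(
                (static_cast<long double>(approx) - ref) / ref));
            peak = std::max(peak, relerr / eps);                 // peak error in eps units
        }
        std::printf("peak relative error: %.2f eps\n", peak);
    }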

Commonly used rational approximations are not able to provide relative accuracy close to the machine epsilon. Even GNU GSL and MATLAB, which are based on Chebyshev series expansions and are potentially more accurate, still cannot reach the target level of accuracy (most probably the last digits of the Chebyshev coefficients are not correctly rounded in their implementations).

Refined Approximations

If we look at the plot of I_0(x) and its series expansion, we notice that the curve exhibits strong x^{2k} behavior near zero. This suggests including the first terms of the power series in the approximation and using the following formulation:

(2)    \begin{alignat*}{2} I_0(x) &\approx 1+\left[\frac{x}{2}\right]^2\frac{P_n([x/2]^2)}{Q_m([x/2]^2)}, &\qquad |x| &\le A \\ x^{1/2} e^{-x} I_0(x) &\approx P_r(1/x)/Q_s(1/x), &\qquad x &\ge A \end{alignat*}
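
This form is natural because the power series of I_0(x) factors exactly this way (a standard identity, stated here for completeness):

    \begin{equation*} I_0(x) = \sum_{k=0}^{\infty}\frac{[x/2]^{2k}}{(k!)^2} = 1 + \left[\frac{x}{2}\right]^2\sum_{j=0}^{\infty}\frac{\left([x/2]^2\right)^j}{\left((j+1)!\right)^2} \end{equation*}

so P_n/Q_m only has to fit the factored tail, whose leading coefficients 1, 1/4, 1/36, 1/576, ... are exactly the first entries of P_{16} below.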

We ran a rigorous search for optimal approximations among all polynomials with n+m,\,r+s\le30 and split points A\le15. The search criteria include not only the optimal theoretical approximation order but also the minimization of relative error in double precision. In other words, we take into account the conditioning of the resulting polynomials in double precision.

The search revealed approximations with the following properties:

Double precision

The split point A=7.75 is chosen so that the sum of the errors is minimal among all possible solutions. The peak relative error is 2.38 eps on [0, 7.75) and 0.72 eps on [7.75,\infty).

Final expressions for optimal approximations:

  • For small arguments, x<7.75:

    (3)    \begin{equation*} I_0(x) \approx [x/2]^2P_{16}([x/2]^2)+1 \end{equation*}

  • For large arguments, x\ge7.75:

    (4)   \begin{equation*} x^{1/2} e^{-x} I_0(x) \approx P_{22}(1/x) \end{equation*}

Coefficients of the polynomials (accurate to 25 digits):

 n    P_{16}(x) for [0, 7.75)
 0     1.0000000000000000000000801e+00
 1     2.4999999999999999999629693e-01
 2     2.7777777777777777805664954e-02
 3     1.7361111111111110294015271e-03
 4     6.9444444444444568581891535e-05
 5     1.9290123456788994104574754e-06
 6     3.9367598891475388547279760e-08
 7     6.1511873265092916275099070e-10
 8     7.5940584360755226536109511e-12
 9     7.5940582595094190098755663e-14
10     6.2760839879536225394314453e-16
11     4.3583591008893599099577755e-18
12     2.5791926805873898803749321e-20
13     1.3141332422663039834197910e-22
14     5.9203280572170548134753422e-25
15     2.0732014503197852176921968e-27
16     1.1497640034400735733456400e-29
 n    P_{22}(x) for [7.75, \infty)
 0     3.9894228040143265335649948e-01
 1     4.9867785050353992900698488e-02
 2     2.8050628884163787533196746e-02
 3     2.9219501690198775910219311e-02
 4     4.4718622769244715693031735e-02
 5     9.4085204199017869159183831e-02
 6    -1.0699095472110916094973951e-01
 7     2.2725199603010833194037016e+01
 8    -1.0026890180180668595066918e+03
 9     3.1275740782277570164423916e+04
10    -5.9355022509673600842060002e+05
11     2.6092888649549172879282592e+06
12     2.3518420447411254516178388e+08
13    -8.9270060370015930749184222e+09
14     1.8592340458074104721496236e+11
15    -2.6632742974569782078420204e+12
16     2.7752144774934763122129261e+13
17    -2.1323049786724612220362154e+14
18     1.1989242681178569338129044e+15
19    -4.8049082153027457378879746e+15
20     1.3012646806421079076251950e+16
21    -2.1363029690365351606041265e+16
22     1.6069467093441596329340754e+16

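For completeness, here is a minimal sketch of how (3) and (4) assemble into a complete double-precision routine (an illustration, not the author's reference code; the literals are the 25-digit values above, which the compiler rounds to the nearest double). Following J. Maddock's suggestion in the comments below, e^x is split into two e^{x/2} factors to delay spurious overflow of the exponential for large x:

    #include <cmath>
    #include <cstddef>

    // P_16 for [0, 7.75), ascending degree, applied to t = (x/2)^2 -- see (3).
    static const double P16[17] = {
        1.0000000000000000000000801e+00, 2.4999999999999999999629693e-01,
        2.7777777777777777805664954e-02, 1.7361111111111110294015271e-03,
        6.9444444444444568581891535e-05, 1.9290123456788994104574754e-06,
        3.9367598891475388547279760e-08, 6.1511873265092916275099070e-10,
        7.5940584360755226536109511e-12, 7.5940582595094190098755663e-14,
        6.2760839879536225394314453e-16, 4.3583591008893599099577755e-18,
        2.5791926805873898803749321e-20, 1.3141332422663039834197910e-22,
        5.9203280572170548134753422e-25, 2.0732014503197852176921968e-27,
        1.1497640034400735733456400e-29
    };

    // P_22 for [7.75, inf), ascending degree, applied to 1/x -- see (4).
    static const double P22[23] = {
        3.9894228040143265335649948e-01,  4.9867785050353992900698488e-02,
        2.8050628884163787533196746e-02,  2.9219501690198775910219311e-02,
        4.4718622769244715693031735e-02,  9.4085204199017869159183831e-02,
       -1.0699095472110916094973951e-01,  2.2725199603010833194037016e+01,
       -1.0026890180180668595066918e+03,  3.1275740782277570164423916e+04,
       -5.9355022509673600842060002e+05,  2.6092888649549172879282592e+06,
        2.3518420447411254516178388e+08, -8.9270060370015930749184222e+09,
        1.8592340458074104721496236e+11, -2.6632742974569782078420204e+12,
        2.7752144774934763122129261e+13, -2.1323049786724612220362154e+14,
        1.1989242681178569338129044e+15, -4.8049082153027457378879746e+15,
        1.3012646806421079076251950e+16, -2.1363029690365351606041265e+16,
        1.6069467093441596329340754e+16
    };

    // Horner scheme: c[0] + c[1]*x + ... + c[n]*x^n.
    static double horner(const double* c, std::size_t n, double x)
    {
        double r = c[n];
        for (std::size_t i = n; i-- > 0; )
            r = r * x + c[i];
        return r;
    }

    // I_0(x) in double precision via (3) and (4).
    double besseli0(double x)
    {
        x = std::fabs(x);                       // I_0 is even
        if (x < 7.75) {
            double t = 0.25 * x * x;            // t = (x/2)^2
            return t * horner(P16, 16, t) + 1.0;
        }
        // (4): I_0(x) = e^x * P_22(1/x) / sqrt(x); split e^x = e^{x/2} * e^{x/2}
        double ex = std::exp(0.5 * x);
        return ex * (horner(P22, 22, 1.0 / x) / std::sqrt(x)) * ex;
    }

The quadruple-precision variant (5)/(6) below follows the same structure, with a rational branch for large arguments.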

Interestingly, the optimal approximations are pure polynomials.

We believe the reason is that the power series of I_0(x) (which can itself be considered a polynomial) converges very fast and in general delivers good accuracy with a small number of terms. Minimax approximations simply try to fit the series, and the denominator naturally becomes Q_0(x)=1. The same happens for large arguments, where the asymptotic series reaches high accuracy quickly and our approximation simply tends to reproduce its coefficients.

Overall, the minimax approximations (3) and (4) can be considered "accelerated" power and asymptotic series of I_0(x), tailored for double-precision computations.
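
Indeed, the standard asymptotic expansion (quoted here for comparison; see, e.g., Abramowitz & Stegun) is

    \begin{equation*} x^{1/2} e^{-x} I_0(x) \sim \frac{1}{\sqrt{2\pi}}\left(1 + \frac{1}{8x} + \frac{9}{128\,x^2} + \cdots\right), \qquad x\to\infty \end{equation*}

and the leading coefficients of P_{22} above, 3.98942...e-01 \approx 1/\sqrt{2\pi} and 4.98678...e-02 \approx 1/(8\sqrt{2\pi}), reproduce its first terms.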

Quadruple precision

The split point is A = 15.5. The peak relative error is 4.6 qeps on [0, 15.5) and 3 qeps on [15.5,\infty), where the machine epsilon for quadruple precision is qeps = 2^{-112} ≈ 1.93e-34.

Final expressions for optimal approximations:

  • For small arguments, x<15.5:

    (5)    \begin{equation*} I_0(x) \approx [x/2]^2P_{32}([x/2]^2)+1 \end{equation*}

  • For large arguments, x\ge15.5:

    (6)   \begin{equation*} x^{1/2} e^{-x} I_0(x) \approx P_{45}(1/x)/Q_{3}(1/x) \end{equation*}

Coefficients of the polynomials (correctly rounded to 35 digits):

 n    P_{32}(x) for [0, 15.5)
 0     1.00000000000000000000000000000000000e+00
 1     2.50000000000000000000000000000000000e-01
 2     2.77777777777777777777777777777777779e-02
 3     1.73611111111111111111111111111111083e-03
 4     6.94444444444444444444444444444449108e-05
 5     1.92901234567901234567901234567853905e-06
 6     3.93675988914084152179390274631572236e-08
 7     6.15118732678256487780297303953307469e-10
 8     7.59405842812662330592959640146193002e-12
 9     7.59405842812662330592959486994771411e-14
10     6.27608134555919281481787961562157257e-16
11     4.35838982330499501028963901609513258e-18
12     2.57892888952958284633290216759752759e-20
13     1.31578004567835859498063743807352147e-22
14     5.84791131412603820814035808275571168e-25
15     2.28434035708048361004361404800451548e-27
16     7.90429189301205834423041758728181758e-30
17     2.43959626327530221779537803521876900e-32
18     6.75788438580533320595854894826703327e-35
19     1.68947109644654239602153081596915483e-37
20     3.83100021886675733871724163689480203e-40
21     7.91528970342866904594413272561413611e-43
22     1.49627405863018698139365798878928887e-45
23     2.59769774665516078740718557214698854e-48
24     4.15632134882805917464119034520347550e-51
25     6.14832848022015178960667680429552612e-54
26     8.43488878141065428604958554806851059e-57
27     1.07486432616861166276174172563459685e-59
28     1.28666328289305198822024236233461582e-62
29     1.37245729295774874628038022579554422e-65
30     1.71642649655915754818217188454585521e-68
31     6.42133551177881248636744515323368258e-72
32     2.93990539325117515154720726204799338e-74
 n    P_{45}(x) for [15.5, \infty)
 0     3.33094095073755755465308169515679754e-01
 1    -4.87305575020532173309650536123692992e+00
 2     1.16765891025166570372673307538545807e+00
 3     2.64107028015824651013530450189957215e+03
 4     3.29947497422711739169516563125826777e+02
 5     1.85360637074230573336349486640079127e+02
 6     1.92716946160432511352351625380961688e+02
 7     2.94374157084426440123892033050844719e+02
 8     5.94369974672125230816192833403509948e+02
 9     1.49338188597066557028520330783876602e+03
10     4.49070212230053277828526595070904928e+03
11     1.57364639902573996182985842231707641e+04
12     6.09475654654342361774769290134171804e+04
13     6.74843325301108470040298999048635726e+05
14    -6.57083582192662169318674677284026011e+07
15     9.83172447953771800483159480192563859e+09
16    -1.23905750827828029147738416146512108e+12
17     1.35560883142284024977500631878243972e+14
18    -1.29343718391927120913858066893326175e+16
19     1.08130315540350608880516332890283373e+18
20    -7.95087530860069987411846664771880366e+19
21     5.15861021545066736428201155306431096e+21
22    -2.96090425993185982903689108288073850e+23
23     1.50651836855858836083071762402341231e+25
24    -6.80526037590684842951323289698851799e+26
25     2.73203132967909651102779478756247727e+28
26    -9.75309923194275182518137234542338189e+29
27     3.09640722254712498146426101828630546e+31
28    -8.73913695514222641997435352487537101e+32
29     2.19076945578257225990373580600588103e+34
30    -4.87120899281572328851130765475319009e+35
31     9.58810466892032483163229870526807875e+36
32    -1.66627420459640895942341335296132587e+38
33     2.54807715779107313134088724774425453e+39
34    -3.41409756541742494848545251263381364e+40
35     3.98660860345465130692055711924224814e+41
36    -4.02958459088507064758116007151695612e+42
37     3.49565245578103887077950778697885925e+43
38    -2.57418585602707614088803903097321313e+44
39     1.58622500396250248773886354120038387e+45
40    -8.02304482873909934532998282613480285e+45
41     3.24266546042741906420052016688464442e+46
42    -1.00665425805861133618962888863945150e+47
43     2.25310991683789163051367847410612888e+47
44    -3.23574657445200742953438267415045330e+47
45     2.23881303334352384470482998051927380e+47
 n    Q_{3}(x) for [15.5, \infty)
 0     8.34943076824502833362836053862403964e-01
 1    -1.23193072119209042422963284443456931e+01
 2     4.40809330596253724239279903622218481e+00
 3     6.62043547609998683454530303968680120e+03


Conclusion

The proposed polynomial approximations for I_0(x) in the form of (2) deliver the lowest relative error compared to all commonly used approximations in mainstream software libraries: Blair & Edwards[1], Numerical Recipes[2], SPECFUN 2.5[3], Boost 1.59.0[4], and even GNU GSL[5], which is frequently used as a measuring stick to assess the accuracy of other libraries.

Please provide attribution to this page if you use the approximations in your work.

References

  1. Blair J.M., Edwards C.A., Stable rational minimax approximations to the modified Bessel functions I_0(x) and I_1(x). AECL-4928, Chalk River Nuclear Laboratories, Chalk River, Ontario, October 1974.
  2. Press, William H. and Teukolsky, Saul A. and Vetterling, William T. and Flannery, Brian P. Numerical Recipes 3rd Edition: The Art of Scientific Computing, 2007, Cambridge University Press, New York, USA.
  3. Cody W.J., Algorithm 715: SPECFUN – A Portable FORTRAN Package of Special Function Routines and Test Drivers, ACM Transactions on Mathematical Software, Volume 19, Number 1, March 1993, pages 22-32. SPECFUN: Modified Bessel function of the first kind.
  4. Boost – free peer-reviewed portable C++ source libraries. Version 1.59.0. Boost: Modified Bessel function of the first kind.
  5. M. Galassi et al., GNU Scientific Library Reference Manual, 3rd Edition (January 2009), ISBN 0954612078.
  6. MATLAB 2015a, The MathWorks, Inc., Natick, Massachusetts, United States.
  7. The NAG Toolbox for MATLAB, Mark 24 (the newest version).

Comments

John Maddock January 16, 2017 at 7:06 pm

Just to let you know that I’m in the process of implementing this for (I hope) Boost-1.64, some general comments:

* Evaluating your (4) as given may result in spurious overflow – which is to say the exponent term overflows before the result does – it's easily avoided by multiplying by exp(x/2) twice in succession. But as this adds a little complexity it makes it worthwhile to split the interval over which (4) is used.
* I only needed 14 terms over [0, 7.75] to achieve double precision (max error found 2.1eps).
* Your approximations extend to 80- and 128-bit reals fairly easily by dividing up the region of (4) into several sections; I needed to use (3) over [7.75, 15] though, as I couldn't get much past double precision from (4) there: it seems to become ill-conditioned if there are too many terms in that region. As a result the max error creeps up to 4eps over the middle range, which is a shame, but it is what it is for now.
* Reducing precision to float results in a very compact implementation with 9 and 6 terms respectively for (3) and (4): something CUDA users may appreciate.

The actual coefficients used will appear in the Boost git repo shortly.


Pavel Holoborodko January 17, 2017 at 8:32 am

Thank you very much for your observations and for using the approximations in Boost!

I have just added approximations for quadruple precision (128 bit) to the post. The polynomials become long, but this combination delivers 4.6 qeps and 3 qeps peak relative error (estimated using 1M random arguments on each interval). This is the overall minimum among all possible polynomials of degree < 50.

I confirm that the middle range is quite problematic to build an approximation for. Two-interval splitting seems to be the best.

Chebyshev expansions might be more efficient for higher levels of precision.


John Maddock January 18, 2017 at 7:05 pm

The code I used to derive the coefficients generates both Chebyshev and polynomial forms and tests both (since the two forms are inter-convertible); in general the Chebyshev forms are no more, and sometimes less, accurate. I also went for more intervals and hence lower-order polynomials – it seems to me that the extra comparisons are more than made up for by many fewer multiplications. Error rates were in line with yours.

For the record the code can be seen at:

https://github.com/boostorg/math/blob/develop/include/boost/math/special_functions/detail/bessel_i0.hpp

and

https://github.com/boostorg/math/blob/develop/include/boost/math/special_functions/detail/bessel_i1.hpp


Marcin April 28, 2021 at 8:26 pm

Thanks for this research and interesting article. I came here while looking for an approximation to the ratio of modified Bessel functions I1(x)/I0(x).
Is there any toolkit that helps find polynomial or rational approximations like the ones used here or in Boost special functions?


John Maddock April 29, 2021 at 1:46 am

The code I used in Boost is all open source: https://www.boost.org/doc/libs/1_76_0/libs/math/doc/html/math_toolkit/internals/minimax.html You should consider it experimental and prone to exploding – basically anyone devising rational/polynomial approximations is using “one shot” code that's used once and then discarded. ’tis the nature of the beast! That said, if you dig into the code you'll find lots of examples to follow. The equations being solved are also well known to be “stiff” if you're looking for a full minimax approximation, which is the main cause of instability.

With regard to I_1[x]/I_0[x], this at least looks smooth on https://www.wolframalpha.com/input/?i=BesselI%5B1%2Cx%5D%2FBesselI%5B0%2C+x%5D so you do stand a chance. For small x, x/2 looks to be a good starting point, e^2x for large x, and heaven only knows what in the middle! Given that I_1 and I_0 are both really cheap to compute, can you not just compute the ratio?


Marcin April 29, 2021 at 3:34 am

John, thanks for the pointers. I probably could just use I1 and I0, but I started overthinking it. Well, I'd need to somehow handle big values – I1 and I0 quickly go to +inf while the ratio should converge to 1.

That's interesting reading material. One thing: the outgoing link to the online course at fullerton.edu is a 404.

When looking at the boost.math code I was wondering what is the purpose of the initializers such as bessel_i0_initializer? I guess it’s to instantiate data arrays before the program runs, but what’s the advantage?


John Maddock April 29, 2021 at 4:06 am

Yes, the initializers ensure that everything is initialized prior to the start of main(); in pre-C++11 land, this ensured that all initialization is thread-safe when the number type is a UDT. Since we've now dropped C++03 support, these initializers will gradually go away, albeit C++11 threading support is still remarkably buggy and full of dark corners with many of today's current compilers 🙁

With regard to large x, can you not just return 1 for x greater than some limit? Not sure what that limit should be, but a bit of experimenting suggests that 1/eps looks to be a reasonable choice.


Marcin April 29, 2021 at 8:16 am

Thanks for the explanation. For now, to get the ratio implemented quickly, I used the single-precision polynomials from Boost, with the common term exp(x)/sqrt(x) reduced. It works fine.

But when I compared it with code based on much older approximations of I0 and I1 (the ones listed on p. 378 of Abramowitz and Stegun), this older code, which has similar complexity, gives a more accurate I1/I0 for large x. I think it's because it has the same a_0 value in both the I0 and I1 polynomials: 0.39894228.
In Boost the constant values differ slightly:
3.98942651588301770e-01f
3.98942115977513013e-01f
and the ratio doesn’t converge so neatly to 1.


