Feature request: Owen's t function #99

maximerischard · 2020-06-29T21:34:21Z

Owen's T function currently does not seem to have a native julia implementation. Most existing implementations in Fortran or C are under GPL, which makes this a little tricky.

This would be useful for the Skew Normal distribution in Distributions.jl.

azev77 · 2020-10-25T23:11:28Z

@maximerischard @johnczito
There is a PR here: JuliaMath/SpecialFunctions.jl#242
Distributions.jl calls both StatsFuns.jl & SpecialFunctions.jl

johnczito · 2020-10-26T14:32:48Z

Oh, what do you know. Thanks for the heads up. If that gets merged and released, we can close this and you can open the PR in Distributions.

blackeneth · 2021-07-10T06:20:10Z

Looks like JuliaMath/SpecialFunctions.jl#242 flamed out due to licensing issues -- the code it was derived from was LGPL.

I can work on this next, basing it on Fast and Accurate Calculation of Owen's T Function, which won't have a license issue (The Journal of Statistical Software uses the Creative Commons License).

I'm working on getting the MVN cdf (qsmivnv.jl) over the finish line, and the Johnson Distributions for Distributions.jl right now. Next I can turn my attention to Owen's T.

azev77 · 2021-07-14T22:02:17Z

@blackeneth that would be awesome!
Looking forward to it!

blackeneth · 2021-08-18T04:26:07Z

@azev77 and others ...

History
In the past 20 or so years, most implementations of Owen's T function have followed the algorithms given in "Fast and accurate Calculation of Owen's T-Function", by M. Patefield and D. Tandy, Journal of Statistical Software, 5 (5), 1 - 25 (2000)

Six algorithms were given, and which one used depended on the values of (h,a) in owent(h,a)

T1: first m terms of series expansion of Owen (1956)
T2: approximates 1/(1+x^2) by power series expansion up to order 2m
T3: approximates 1/(1+x^2) by chebyshev polynomials of degree 2m in x
T4: new expression for zi from T2
T5: Gauss 2m-point quadrature; 30 figures accuracy with m=48 (p. 18)
T6: For when a is very close to 1, use formula derived from T(h,1) = 1/2 Φ(h)[1-Φ(h)]

They developed code for these algorithms on a DEC VAX 750. The VAX 750 came out in 1980 and had a processor clock speed of 3.125 MHz.

The reason for 6 algorithms was to speed up the function when possible, with T1 being faster than T2, T2 faster than T3, etc.

TODAY

We have faster and better computers today.

If we skip T1 through T4, we can go straight to 48 point Gauss-Legendre quadrature for a <= 0.999999. For a > 0.999999, we can fall back to quadrature using QuadGK. This is sufficient for 15 digit accuracy for Float64. Gauss-Legendre quadrature takes about 3 microseconds, and QuadGK takes about 20 microseconds.

Questions:

Any objection to this?
do we need BigFloat support?

Code

I call this function "owensteasy(h,a)":

# owenteasy
#    Written by Andy Gough; August 2021 
#    Rev 1.02
Using QuadGK
Using Distributions


function owenteasy(h::T1,a::T1; calc_method=nothing, m=nothing) where {T1 <: Real}

    #*********************
    # shortcut evaluations 
    #*********************

    if h < zero(h) 
        return owenteasy(abs(h),a)
    end 

    if h==zero(h) 
        return arctan(a)*inv2π
    end

    if a < zero(a) 
        return -owenteasy(h,abs(a))
    end 

    if a==zero(a)
        return zero(a)
    end 

    if a==one(a) 
        return 0.125*erfc(-h*invsqrt2)*erfc(h*invsqrt2)
    end 

    if a==Inf 
        return 0.25*erfc(sqrt(h^2)*invsqrt2)
    end 

    # below reduces the range from -inf < h,a < +inf to h ≥ 0, 0 ≤ a ≤ 1
    unitnorm = Normal()
    if a > one(a) 
        return 0.5*cdf(unitnorm,h) + 0.5*cdf(unitnorm,a*h) - cdf(unitnorm,h)*cdf(unitnorm,a*h) - owenteasy(a*h,one(a)/a)
    end 

    # calculate Owen's T 

    if a ≤ 0.999999 

        t2(h,a,x) = inv2π*(a/2)*exp(-0.5*(h^2)*(1+(a*x)^2))/(1+(a*x)^2)

        x = [-0.9987710072524261, -0.9935301722663508, -0.9841245837228269, -0.9705915925462473, -0.9529877031604309, -0.9313866907065543, -0.9058791367155696
        , -0.8765720202742479, -0.8435882616243935, -0.8070662040294426, -0.7671590325157404, -0.7240341309238146, -0.6778723796326639, -0.6288673967765136
        , -0.5772247260839727, -0.5231609747222331, -0.4669029047509584, -0.4086864819907167, -0.34875588629216075, -0.28736248735545555, -0.22476379039468905
        , -0.1612223560688917, -0.0970046992094627, -0.03238017096286937, 0.03238017096286937, 0.0970046992094627, 0.1612223560688917, 0.22476379039468905
        , 0.28736248735545555, 0.34875588629216075, 0.4086864819907167, 0.4669029047509584, 0.5231609747222331, 0.5772247260839727, 0.6288673967765136
        , 0.6778723796326639, 0.7240341309238146, 0.7671590325157404, 0.8070662040294426, 0.8435882616243935, 0.8765720202742479, 0.9058791367155696
        , 0.9313866907065543, 0.9529877031604309, 0.9705915925462473, 0.9841245837228269, 0.9935301722663508, 0.9987710072524261]

        w = [0.0031533460523059122, 0.0073275539012762885, 0.011477234579234613, 0.015579315722943824, 0.01961616045735561, 0.023570760839324363
        , 0.027426509708356944, 0.031167227832798003, 0.03477722256477052, 0.038241351065830737, 0.04154508294346467, 0.0446745608566943, 0.04761665849249045
        , 0.05035903555385445, 0.05289018948519363, 0.055199503699984116, 0.05727729210040322, 0.05911483969839564, 0.06070443916589387, 0.06203942315989268
        , 0.06311419228625402, 0.06392423858464813, 0.06446616443594998, 0.06473769681268386, 0.06473769681268386, 0.06446616443594998, 0.06392423858464813
        , 0.06311419228625402, 0.06203942315989268, 0.06070443916589387, 0.05911483969839564, 0.05727729210040322, 0.055199503699984116
        , 0.05289018948519363, 0.05035903555385445, 0.04761665849249045, 0.0446745608566943, 0.04154508294346467, 0.038241351065830737, 0.03477722256477052
        , 0.031167227832798003, 0.027426509708356944, 0.023570760839324363, 0.01961616045735561, 0.015579315722943824, 0.011477234579234613, 0.0073275539012762885
        , 0.0031533460523059122]

        TOwen = dot(w, t2.(h,a,x))

        return TOwen
    else
        # a > 0.999999, use quadrature 

        tt(h,x) = inv2π*exp(-0.5*(h^2)*(1.0+x^2))/(1.0+x^2)

        TOwen, err = quadgk(x -> tt(h,x),0.0,a, atol=eps())

        return TOwen
    end 
end

You can test it with this code:

# owenteasy test values
using Random
using Test 

hvec = [0.0625, 6.5, 7.0, 4.78125, 2.0, 1.0, 0.0625, 1, 1, 1, 1, 0.5, 0.5, 0.5, 0.5, 0.25, 0.25, 0.25, 0.25, 0.125, 0.125, 0.125, 0.125, 0.0078125
, 0.0078125, 0.0078125, 0.0078125, 0.0078125, 0.0078125, 0.0625, 0.5, 0.9]
avec = [0.25, 0.4375, 0.96875, 0.0625, 0.5, 0.9999975, 0.999999125, 0.5, 1, 2, 3, 0.5, 1, 2, 3, 0.5, 1, 2, 3, 0.5, 1, 2, 3, 0.5, 1, 2, 3, 10, 100
, 0.999999999999999, 0.999999999999999, 0.999999999999999]
cvec = [big"0.0389119302347013668966224771378", big"2.00057730485083154100907167685e-11", big"6.399062719389853083219914429e-13"
, big"1.06329748046874638058307112826e-7", big"0.00862507798552150713113488319155", big"0.0667418089782285927715589822405"
, big"0.1246894855262192" 
, big"0.04306469112078537", big"0.06674188216570097", big"0.0784681869930841", big"0.0792995047488726", big"0.06448860284750375", big"0.1066710629614485"
, big"0.1415806036539784", big"0.1510840430760184", big"0.07134663382271778", big"0.1201285306350883", big"0.1666128410939293", big"0.1847501847929859"
, big"0.07317273327500386", big"0.1237630544953746", big"0.1737438887583106", big"0.1951190307092811", big"0.07378938035365545"
, big"0.1249951430754052", big"0.1761984774738108", big"0.1987772386442824", big"0.2340886964802671", big"0.2479460829231492", big"0.1246895548850744"
, big"0.1066710629614484", big"0.0750909978020473015080760056431386286348318447478899039422181015"]

mvbck = "\033[1A\033[58G"
mva = "\033[72G"
mve = "\033[90G"

warmup = owenteasy(0.0625,0.025)

for i in 1:size(hvec,1)
    if i == 1
        println("\t\tExecution Time","\033[58G","h",mva,"a",mve,"error\t\t","log10 error")
    end 
    h = hvec[i]
    a = avec[i]
    c = cvec[i]
    print(i,"\t")
    @time tx = owenteasy(h,a)
    err = Float64(round(tx - c,sigdigits=4))
    logerr = Float64(round(log10(abs(tx-c)),sigdigits=3))
    println(mvbck,h,mva,a,mve,err,"\t",logerr)
end

Here are the results I get from the above testing:

                 Execution Time                           h             a                 error          log10 error
1         0.000013 seconds (4 allocations: 1.469 KiB)    0.0625        0.25              -9.243e-20     -19.0
2         0.000007 seconds (4 allocations: 1.469 KiB)    6.5           0.4375            -1.521e-27     -26.8
3         0.000003 seconds (4 allocations: 1.469 KiB)    7.0           0.96875           1.365e-28      -27.9
4         0.000003 seconds (4 allocations: 1.469 KiB)    4.78125       0.0625            -3.432e-23     -22.5
5         0.000003 seconds (4 allocations: 1.469 KiB)    2.0           0.5               -6.239e-19     -18.2
6         0.000003 seconds (4 allocations: 1.469 KiB)    1.0           0.9999975         -2.549e-17     -16.6
7         0.000040 seconds (78 allocations: 1.688 KiB)   0.0625        0.999999125       -4.573e-17     -16.3
8         0.000003 seconds (4 allocations: 1.469 KiB)    1.0           0.5               -1.39e-17      -16.9
9         0.000001 seconds (1 allocation: 16 bytes)      1.0           1.0               -1.467e-17     -16.8
10        0.000005 seconds (6 allocations: 1.500 KiB)    1.0           2.0               -1.749e-17     -16.8
11        0.000004 seconds (6 allocations: 1.500 KiB)    1.0           3.0               -4.593e-17     -16.3
12        0.000003 seconds (4 allocations: 1.469 KiB)    0.5           0.5               -1.017e-17     -17.0
13        0.000001 seconds (1 allocation: 16 bytes)      0.5           1.0               1.507e-17      -16.8
14        0.000005 seconds (6 allocations: 1.500 KiB)    0.5           2.0               -1.125e-16     -15.9
15        0.000003 seconds (6 allocations: 1.500 KiB)    0.5           3.0               -4.053e-17     -16.4
16        0.000003 seconds (4 allocations: 1.469 KiB)    0.25          0.5               -4.569e-18     -17.3
17        0.000001 seconds (1 allocation: 16 bytes)      0.25          1.0               -1.35e-18      -17.9
18        0.000003 seconds (6 allocations: 1.500 KiB)    0.25          2.0               1.313e-16      -15.9
19        0.000003 seconds (6 allocations: 1.500 KiB)    0.25          3.0               -4.461e-17     -16.4
20        0.000003 seconds (4 allocations: 1.469 KiB)    0.125         0.5               -1.717e-17     -16.8
21        0.000001 seconds (1 allocation: 16 bytes)      0.125         1.0               -2.796e-17     -16.6
22        0.000005 seconds (6 allocations: 1.500 KiB)    0.125         2.0               -6.153e-17     -16.2
23        0.000004 seconds (6 allocations: 1.500 KiB)    0.125         3.0               4.929e-18      -17.3
24        0.000003 seconds (4 allocations: 1.469 KiB)    0.0078125     0.5               3.781e-18      -17.4
25        0.000001 seconds (1 allocation: 16 bytes)      0.0078125     1.0               -1.026e-18     -18.0
26        0.000003 seconds (6 allocations: 1.500 KiB)    0.0078125     2.0               1.615e-17      -16.8
27        0.000003 seconds (6 allocations: 1.500 KiB)    0.0078125     3.0               2.795e-17      -16.6
28        0.000003 seconds (6 allocations: 1.500 KiB)    0.0078125     10.0              7.214e-17      -16.1
29        0.000003 seconds (6 allocations: 1.500 KiB)    0.0078125     100.0             -4.746e-17     -16.3
30        0.000027 seconds (78 allocations: 1.688 KiB)   0.0625        0.999999999999999 -4.486e-17     -16.3
31        0.000014 seconds (78 allocations: 1.688 KiB)   0.5           0.999999999999999 5.956e-17      -16.2
32        0.000013 seconds (78 allocations: 1.688 KiB)   0.9           0.999999999999999 -5.269e-18     -17.3

azev77 · 2021-08-19T03:07:14Z

@johnczito @andreasnoack @mschauer
@stevengj
This would be useful for Distributions.jl
Including for the skewed Normal/T that I PRed…

stevengj · 2021-08-19T13:01:19Z

Whether using generic quadrature schemes is "sufficient" depends on the use case. It is typically orders of magnitude slower than specialized expansions like continued fractions, so you will probably be losing compared to other languages.

That being said, I agree that the optimal choice of algorithm these days is probably different than that old paper.

mschauer · 2021-08-19T17:25:47Z

Does it make sense to discuss this in the larger context of the package? Are our other implementations typically "as fast as reasonably can be" or is there a continuous work on adding correct but perhaps not optimal implementations and later performance improvements?

azev77 · 2021-08-19T18:18:42Z

@mschauer I agree.
It looks like @blackeneth's implementation is already better than the current stuff that is used.
Users can refer back here to Steven's comments & think about further optimizations in the future.

stevengj · 2021-08-19T18:18:46Z

True, but it's also nice to avoid "embarrassingly slow compared to R or Python", and it would be good to check whether the quadrature approach falls into this category.

blackeneth · 2021-08-22T06:46:02Z

After some micro-optimizations, the speed for calculating Owen's T when a ≤ 0.999999 is about 1 μs:

When 0.999999 < a < 1.0, QuadGK is called and the execution time is about 7.5 μs:

Worst-case error is about 1e-16:

With some additional type stability work and test development, we should be good to go.

Note the function will have dependencies on:

LinearAlgebra
Distributions
QuadGK
IrrationalConstants
SpecialFunctions

devmotion · 2021-08-22T10:25:24Z

As discussed in #114, we should not add a circular dependency on Distributions to StatsFuns. It's seems it is not actually needed and you could just call normcdf etc.

I am also worried that a dependency on QuadGK will make the package slower to load. In the past this has been a reason to not merge PRs (even though I think #106 should be fine 😛 - it only adds a lightweight dependency and IMO the timings are increased only to a small extent).

azev77 · 2021-08-22T17:57:58Z

@blackeneth can you compare your functions performance w implementations in other languages?

blackeneth · 2021-08-23T07:22:34Z

I can replace the Distributions.jl call to Normal() with normcdf(x) = 0.5*erfc(-x*invsqrt2)

As for R & Python, I can't say I'm an expert at benchmarking those, but my initial checks have R at 4.74 μs and Python at 2.5 μs

I will look at using "T6" for 0.999999 < a < 1.0, which could eliminate the QuadGK dependency.

devmotion · 2021-08-23T07:32:20Z

I can replace the Distributions.jl call to Normal() with normcdf(x) = 0.5*erfc(-x*invsqrt2)

That's precisely what I was referring to since normcdf is already defined in StatsFuns (

StatsFuns.jl/src/distrs/norm.jl

Line 45 in 36e48bb

normcdf(z::Number) = erfc(-z * invsqrt2)/2

).

blackeneth · 2021-08-23T08:54:31Z

Using method "T6" for a > 0.999999 has similar accuracy to using QuadGK:

#             time                                       h                   a            error   log10(abs(error))
30        0.000001 seconds (1 allocation: 16 bytes)      0.0625        0.999999999999999 1.362e-18      -17.9
31        0.000001 seconds (1 allocation: 16 bytes)      0.5           0.999999999999999 5.242e-18      -17.3
32        0.000001 seconds (1 allocation: 16 bytes)      0.9           0.999999999999999 8.609e-18      -17.1
33        0.000002 seconds (1 allocation: 16 bytes)      2.5           0.999999999999999 1.22e-17       -16.9
34        0.000002 seconds (1 allocation: 16 bytes)      7.33          0.999999999999999 2.688e-17      -16.6
35        0.000001 seconds (1 allocation: 16 bytes)      0.6           0.999999999999999 -1.346e-17     -16.9
36        0.000002 seconds (1 allocation: 16 bytes)      1.6           0.999999999999999 -5.124e-17     -16.3
37        0.000002 seconds (1 allocation: 16 bytes)      2.33          0.999999999999999 1.616e-18      -17.8
38        0.000007 seconds (4 allocations: 1.469 KiB)    2.33          0.99999           -2.807e-18     -17.6

The performance is quite good for 0.999999 < a < 1.0

Some additional checks for type stability, then a few more tests, and it should be good for a PR.

blackeneth · 2022-05-12T06:31:44Z

The function has been ready for some time. The barrier is git. My VS Code is somehow setup wrong, so it wants to release the wrong files. I rarely use git, so I forget how to use it after every use. But now I'm moving, so won't have time to figure it out and release it for quite a while.

Attached is the code for the function, owent(h,a). Also attached is the code for the tests.

If someone wants to take them and release them, please do so.

owent.txt
OwenT_tests.txt

azev77 mentioned this issue Jul 1, 2020

Porting to Distributions.jl? STOR-i/SkewDist.jl#3

Open

johnczito mentioned this issue Jul 9, 2020

Add SkewNormal distribution JuliaStats/Distributions.jl#1104

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature request: Owen's t function #99

Feature request: Owen's t function #99

maximerischard commented Jun 29, 2020

azev77 commented Oct 25, 2020

johnczito commented Oct 26, 2020

blackeneth commented Jul 10, 2021

azev77 commented Jul 14, 2021

blackeneth commented Aug 18, 2021

azev77 commented Aug 19, 2021

stevengj commented Aug 19, 2021 •

edited

Loading

mschauer commented Aug 19, 2021

azev77 commented Aug 19, 2021

stevengj commented Aug 19, 2021 •

edited

Loading

blackeneth commented Aug 22, 2021

devmotion commented Aug 22, 2021

azev77 commented Aug 22, 2021 •

edited

Loading

blackeneth commented Aug 23, 2021

devmotion commented Aug 23, 2021

blackeneth commented Aug 23, 2021 •

edited

Loading

blackeneth commented May 12, 2022

Feature request: Owen's t function #99

Feature request: Owen's t function #99

Comments

maximerischard commented Jun 29, 2020

azev77 commented Oct 25, 2020

johnczito commented Oct 26, 2020

blackeneth commented Jul 10, 2021

azev77 commented Jul 14, 2021

blackeneth commented Aug 18, 2021

azev77 commented Aug 19, 2021

stevengj commented Aug 19, 2021 • edited Loading

mschauer commented Aug 19, 2021

azev77 commented Aug 19, 2021

stevengj commented Aug 19, 2021 • edited Loading

blackeneth commented Aug 22, 2021

devmotion commented Aug 22, 2021

azev77 commented Aug 22, 2021 • edited Loading

blackeneth commented Aug 23, 2021

devmotion commented Aug 23, 2021

blackeneth commented Aug 23, 2021 • edited Loading

blackeneth commented May 12, 2022

stevengj commented Aug 19, 2021 •

edited

Loading

stevengj commented Aug 19, 2021 •

edited

Loading

azev77 commented Aug 22, 2021 •

edited

Loading

blackeneth commented Aug 23, 2021 •

edited

Loading