Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Impl AMOS.zairy in pure julia #463

Draft
wants to merge 34 commits into
base: master
Choose a base branch
from

Conversation

inkydragon
Copy link
Member

@inkydragon inkydragon commented Jan 22, 2024

The pr has not yet reached a state where it is ready for review.

Since there's a lot of additions to the code that make it difficult to review, and the airy function currently passes some tests.
I opened up this draft pr, so that you can play around with the early version and give some feedback.


This pr impl those function in pure julia, as a part of #461

+ airy
  + acai
    + mlri
    + seri
    + asyi
    + bknu
        + uchk
        + kscl
        + gamln

NOTE: I chose to rewrite the airy function first because it relies on the fewest other functions in the call graph.

airy will replace _airy

function _airy(z::Complex{Float64}, id::Int32, kode::Int32)
ai1, ai2 = Ref{Float64}(), Ref{Float64}()
ae1, ae2 = Ref{Int32}(), Ref{Int32}()
ccall((:zairy_,libopenspecfun), Cvoid,

TODO

  • Improve code coverage:
     ┌─────────────────────────────────┬───────┬──────┬──────┬──────┐
     │ Filename                        │ Lines │  Hit │ Miss │    % │
     ├─────────────────────────────────┼───────┼──────┼──────┼──────┤
     │ src/amos/AMOS.jl                │     0 │    0 │    0 │    - │
     │ src/amos/airy.jl                │   162 │  130 │   32 │  80% │
     │ src/amos/const.jl               │     0 │    0 │    0 │    - │
     │ src/amos/subroutines/acai.jl    │    46 │   40 │    6 │  87% │
     │ src/amos/subroutines/asyi.jl    │    97 │   82 │   15 │  85% │
     │ src/amos/subroutines/bknu.jl    │   358 │  263 │   95 │  73% │
     │ src/amos/subroutines/gammaln.jl │    46 │   46 │    0 │ 100% │
     │ src/amos/subroutines/kscl.jl    │    84 │   83 │    1 │  99% │
     │ src/amos/subroutines/mlri.jl    │   128 │  126 │    2 │  98% │
     │ src/amos/subroutines/s1s2.jl    │    28 │   28 │    0 │ 100% │
     │ src/amos/subroutines/seri.jl    │   138 │   85 │   53 │  62% │
     │ src/amos/subroutines/uchk.jl    │    11 │   11 │    0 │ 100% │
     │ src/amos/warp.jl                │    87 │   87 │    0 │ 100% │
      ...
     ├─────────────────────────────────┼───────┼──────┼──────┼──────┤
     │ TOTAL                           │  3962 │ 3534 │  428 │  89% │
     └─────────────────────────────────┴───────┴──────┴──────┴──────┘
    
  • Reduce test cases
     Test Summary:     |     Pass  Fail  Broken     Total   Time
     amos/helper       | 23300804     1      62  23300867  16.7s
       AMOS.uchk       |  5346000                 5346000   3.0s
       AMOS.gammaln    |      166             4       170   0.2s
       AMOS.kscl!      |    12470                   12470   0.9s
       AMOS.s1s2!      | 17915904                17915904  10.0s
       AMOS.asyi!      |     5520                    5520   0.4s
       AMOS.mlri!      |     5104                    5104   0.3s
       AMOS.seri!      |     5104                    5104   0.3s
       AMOS.bknu!      |     5104                    5104   0.4s
       AMOS.acai!      |     5220            58      5278   0.2s
       AMOS.airy       |      212                     212   0.1s
       Show test count |              1                 1   0.6s
    
  • benchmark
  • Clean code, remove debug output (@show, @info, ...)
    • Too many copy & paste in test/amos/helper.jl, and this test file need rename.
  • Improve doc

AMOS call graph

amos

@inkydragon
Copy link
Member Author

Some simple benchmark based on c18e324

tldr:

with: z = ℯ*1e3 + π*1e2*im

fname fortran pure julia
airyai 171.215 ns 42.953 ns
airyaiprime 170.120 ns 42.807 ns
airyaix 359.095 ns 204.269 ns
airyaiprimex 385.369 ns 228.239 ns
full test log

Note we have:

airyai(z::Complex{Float64}) 		= _airy(z, Int32(0), Int32(1))
airyaiprime(z::Complex{Float64}) 	= _airy(z, Int32(1), Int32(1))
airyaix(z::Complex{Float64}) 		= _airy(z, Int32(0), Int32(2))
airyaiprimex(z::Complex{Float64}) 	= _airy(z, Int32(1), Int32(2))

Compare AMOS.airy with SpecialFunctions._airy

airyai: 171.215 ns --> 42.953 ns

julia> z =*1e3 + π*1e2*im
2718.2818284590453 + 314.1592653589793im

julia> @benchmark SpecialFunctions._airy(z, Int32(0), Int32(1))
BenchmarkTools.Trial: 10000 samples with 796 evaluations.
 Range (min  max):  158.040 ns   1.056 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     164.070 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   171.215 ns ± 39.243 ns  ┊ GC (mean ± σ):  0.08% ± 1.11%

  ██▇▇▅▃▁▁                                                   ▁ ▂
  ████████▇▇▇▇▇▆▆▆▇▅▆▆▅▅▅▅▄▅▄▃▄▄▄▄▃▄▃▅▁▁▅▅▅▅▅▄▁▄▁▄▃▁▅▄▄▃▃▃▃▆██ █
  158 ns        Histogram: log(frequency) by time       356 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.

julia> @benchmark AMOS.airy(z, 0, 1)
BenchmarkTools.Trial: 10000 samples with 991 evaluations.
 Range (min  max):  40.666 ns  348.133 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     41.171 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   42.953 ns ±  10.719 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▂▂▆▅▂ ▃                                                     ▁
  █████████▇▇▆▆▆▆▅▆▅▅▁▃▃▄▅▅▁▁▄▁▄▃▅▃▁▁▁▃▄▁▄▄▃▁▃▁▁▄▁▃▄▁▃▃▃▄▃▃▁▁▄ █
  40.7 ns       Histogram: log(frequency) by time      78.3 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

airyaiprime: 170.120 ns --> 42.807 ns

julia> @benchmark SpecialFunctions._airy(z, Int32(1), Int32(1))
BenchmarkTools.Trial: 10000 samples with 792 evaluations.
 Range (min  max):  158.081 ns   1.024 μs  ┊ GC (min  max): 0.00%  81.94%
 Time  (median):     166.288 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   170.120 ns ± 30.089 ns  ┊ GC (mean ± σ):  0.10% ±  1.17%

  ▂▄█▃
  ████▅▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▁▂▂▂▂▂▂▂▁▂▂▂▂ ▂
  158 ns          Histogram: frequency by time          349 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.

julia> @benchmark AMOS.airy(z, 1, 1)
BenchmarkTools.Trial: 10000 samples with 992 evaluations.
 Range (min  max):  40.524 ns  309.073 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     42.339 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   42.807 ns ±   9.549 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▆█    ▅█▃ ▄▂▁▁▁ ▂▄▁                                          ▂
  ██▇▇▆████▇█████▇███▇▆▆▅▆▇▅▆▄▅▄▆▆▅▆▅▆▅▅▄▁▄▃▄▁▅▃▃▄▄▃▁▃▄▃▁▃▃▁▄▃ █
  40.5 ns       Histogram: log(frequency) by time        56 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

airyaix: 359.095 ns --> 204.269 ns

julia> @benchmark SpecialFunctions._airy(z, Int32(0), Int32(2))
BenchmarkTools.Trial: 10000 samples with 227 evaluations.
 Range (min  max):  328.194 ns   2.172 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     346.256 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   359.095 ns ± 77.896 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▇█▅▅▃▂▂▁▁                                             ▁  ▁ ▂
  █████████████▇▇▇▇▇▆▅▄▆▅▄▅▅▆▆▅▆▄▅▆▅▅▄▆▃▃▃▃▅▄▆▄▄▁▄▃▅▅▄▁▄▄▇██▆█ █
  328 ns        Histogram: log(frequency) by time       693 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.

julia> @benchmark AMOS.airy(z, 0, 2)
BenchmarkTools.Trial: 10000 samples with 763 evaluations.
 Range (min  max):  163.827 ns    3.247 μs  ┊ GC (min  max): 0.00%  87.28%
 Time  (median):     177.195 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   204.269 ns ± 172.904 ns  ┊ GC (mean ± σ):  6.69% ±  7.60%

  ▆█▇▅▄▂▁▁  ▁▁                 ▂▁                               ▂
  █████████████▇▇▆▆▆▆▆▆▅▆▅▆▆▅▅████▆▇▆▅▄▃▅▄▅▄▃▅▃▄▄▁▄▃▁▁▄▃▁▄▁▁▃▃▃ █
  164 ns        Histogram: log(frequency) by time        604 ns <

 Memory estimate: 448 bytes, allocs estimate: 5.

airyaiprimex: 385.369 ns --> 228.239 ns

julia> @benchmark SpecialFunctions._airy(z, Int32(1), Int32(2))
BenchmarkTools.Trial: 10000 samples with 212 evaluations.
 Range (min  max):  355.189 ns   2.411 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     371.226 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   385.369 ns ± 93.473 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▆█▇▅▅▂▁▂▂▁                                             ▂  ▁ ▂
  █████████████▇▆▆▆▄▆▅▄▅▄▆▃▄▄▅▃▄▃▄▅▄▃▅▅▃▃▃▁▁▄▁▁▄▅▃▄▄▄▄▁▄▄▄█▇▅█ █
  355 ns        Histogram: log(frequency) by time       729 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.

julia> @benchmark AMOS.airy(z, 1, 2)
BenchmarkTools.Trial: 10000 samples with 681 evaluations.
 Range (min  max):  183.113 ns    3.848 μs  ┊ GC (min  max): 0.00%  88.07%
 Time  (median):     196.256 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   228.239 ns ± 180.213 ns  ┊ GC (mean ± σ):  5.84% ±  7.25%

  ▇█▇▄▃▁                   ▁▃▃▁                                 ▂
  ████████████▇▇▆▆▆▅▅▄▆▅▅▅▆█████▇▆▅▅▄▁▄▄▄▃▅▁▃▁▁▃▃▁▃▄▃▁▃▁▄▁▁▁▃▄▄ █
  183 ns        Histogram: log(frequency) by time        676 ns <

 Memory estimate: 448 bytes, allocs estimate: 5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant