Added erf(x) Float64 and Float32 Julia implementations #491

AhmedYKadah · 2025-03-31T15:47:05Z

Faster than the current wrapper function call (including the Float32 function call).
Uses an algorithm based on https://github.com/ARM-software/optimized-routines/blob/master/math/erf.c

codecov · 2025-03-31T16:29:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.16%. Comparing base (46a2874) to head (462a3cf).

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #491      +/-   ##
==========================================
+ Coverage   94.02%   94.16%   +0.13%     
==========================================
  Files          14       14              
  Lines        2897     2965      +68     
==========================================
+ Hits         2724     2792      +68     
  Misses        173      173

Flag	Coverage Δ
unittests	`94.16% <100.00%> (+0.13%)`	⬆️

AhmedYKadah · 2025-03-31T17:28:21Z

Old:
Float64
@benchmark SpecialFunctions.erf(data) setup=(data=6*rand(Float64)-3) samples=1000000

BenchmarkTools.Trial: 217729 samples with 1000 evaluations per sample.
Range (min … max): 6.300 ns … 283.700 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 29.500 ns ┊ GC (median): 0.00%
Time (mean ± σ): 21.993 ns ± 13.209 ns ┊ GC (mean ± σ): 0.00% ± 0.00%

Float32
@benchmark SpecialFunctions.erf(data) setup=(data=6*rand(Float32)-3) samples=1000000

BenchmarkTools.Trial: 312732 samples with 1000 evaluations per sample.
Range (min … max): 4.300 ns … 125.100 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 19.900 ns ┊ GC (median): 0.00%
Time (mean ± σ): 15.035 ns ± 7.951 ns ┊ GC (mean ± σ): 0.00% ± 0.00%

New:
Float64
@benchmark erf(data) setup=(data=6*rand(Float64)-3) samples=1000000

BenchmarkTools.Trial: 507504 samples with 1000 evaluations per sample.
Range (min … max): 5.400 ns … 4.890 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 8.700 ns ┊ GC (median): 0.00%
Time (mean ± σ): 8.775 ns ± 9.855 ns ┊ GC (mean ± σ): 0.00% ± 0.00%

Float32
@benchmark Float32(erf(data)) setup=(data=6*rand(Float64)-3) samples=1000000

BenchmarkTools.Trial: 526797 samples with 1000 evaluations per sample.
Range (min … max): 5.400 ns … 195.500 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 8.800 ns ┊ GC (median): 0.00%
Time (mean ± σ): 8.521 ns ± 2.236 ns ┊ GC (mean ± σ): 0.00% ± 0.00%

AhmedYKadah · 2025-03-31T17:30:33Z

A Float32 implementation is available, but it is not faster than the Float64 version because of an exp() call.
The Float64 version is still faster than the old Float32 one.

AhmedYKadah · 2025-04-05T13:09:02Z

Need to clean up the polynomial evaluations.
The code could also use more organization.

AhmedYKadah · 2025-08-02T08:07:59Z

Remaining: erfc Float64 and Float32 implementations, and the erf Float32 implementation

src/erf.jl

…necessary whitespace, and removed explicit copysigns

src/erf.jl

oscardssmith · 2025-09-14T00:59:31Z

src/erf.jl

-
 end

 _erf(x::Float16)=Float16(_erf(Float32(x)))


if you wanted to do a Float16 impl, it should be easier than the others. Specifically, the domain is only to 2, and the accuracy required is much reduced.

100% could wait for a followup PR.

I'm thinking that too, to be honest.
This and the polynomial regeneration.

oscardssmith · 2025-09-15T03:47:22Z

Given that this is faster and accurate, seems good to merge to me!

mschauer · 2025-09-16T06:30:08Z

Are there any tests for edge cases/ULP in the C version that we do not do ourselves?

devmotion

The implementation does not handle NaN32 and NaN16 correctly:

julia> erf(NaN32)
1.0f0

julia> erf(NaN16)
Float16(1.0)
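
For illustration, here is one way a result like this can arise. This is an assumption, not the code from src/erf.jl: the names `saturating_unguarded` and `saturating_guarded` are hypothetical, and `tanh` only stands in for the real kernel. Every ordered comparison with NaN is false, so a range test can fall through to the saturation branch unless NaN is checked first.

```julia
# Hypothetical stand-in, NOT the implementation in src/erf.jl.
# tanh plays the role of the polynomial kernel so the example runs on its own.
function saturating_unguarded(x::Float32)
    if abs(x) < 4.0f0          # false for NaN: ordered comparisons with NaN are false
        return tanh(x)
    end
    return copysign(1.0f0, x)  # NaN falls through to the saturation branch
end

function saturating_guarded(x::Float32)
    isnan(x) && return x       # propagate NaN before any range comparison
    return saturating_unguarded(x)
end

saturating_unguarded(NaN32)  # 1.0f0
saturating_guarded(NaN32)    # NaN32
```

Putting the `isnan` check ahead of the range tests keeps the fast path unchanged for ordinary inputs.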

src/erf.jl

mschauer · 2025-09-16T07:17:24Z

Then we should also add a test for these
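
A minimal sketch of what such a test could look like, assuming the public `erf` from SpecialFunctions and the standard `Test` library. The testset name is made up, and this is not necessarily the test that was added to the PR.

```julia
using Test
using SpecialFunctions

# NaN should propagate, and the result should keep the input's float type.
@testset "erf NaN propagation (sketch)" begin
    for T in (Float16, Float32, Float64)
        y = erf(T(NaN))
        @test isnan(y)
        @test y isa T
    end
end
```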

Co-authored-by: David Müller-Widmann <[email protected]>

src/erf.jl

Co-authored-by: David Müller-Widmann <[email protected]>

AhmedYKadah · 2025-09-19T09:44:51Z

There aren't any tests for erfc. Is that expected?

AhmedYKadah · 2025-09-19T10:56:44Z

Any other changes needed?

oscardssmith · 2025-09-19T12:57:28Z

We should probably test erfc.

test/erf.jl

oscardssmith · 2025-11-14T13:21:19Z

Other than missing tests for Inf, looks good to me. @devmotion any further suggestions?
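
As a rough sketch of the Inf tests mentioned here, assuming the public `erf` and `erfc` from SpecialFunctions: the expected values are the mathematical limits (erf(±Inf) = ±1, erfc(Inf) = 0, erfc(-Inf) = 2). The testset name is hypothetical, and the tests in the PR may be organized differently.

```julia
using Test
using SpecialFunctions

# Limits at infinity, checked for the two widths this PR reimplements.
@testset "erf/erfc at Inf (sketch)" begin
    for T in (Float32, Float64)
        @test erf(T(Inf)) === one(T)
        @test erf(T(-Inf)) === -one(T)
        @test erfc(T(Inf)) == zero(T)
        @test erfc(T(-Inf)) === T(2)
    end
end
```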

src/erf.jl

AhmedYKadah added 5 commits March 31, 2025 17:16

Added erf(x) Float64/Float32 Julia implementation

6f554ef

changed erf to _erf, got rid of unnecessary branch

7f4fd2d

fixed syntax error in ccall

e784c9f

fixed syntax error in ccall 2

da16cb1

NaN edge case for erf(x)

0a755b6

added test cases for erf(x)

6efcec8

AhmedYKadah and others added 2 commits August 2, 2025 10:19

Merge branch 'master' into erf(x)-implementation

0fc6d4d

cleaned up erf(Float64)

3cee8ce

AhmedYKadah added 4 commits September 14, 2025 02:40

added erf(x::Float32) implementation

b819c58

added NaN edge case to erf(x::Float32)

57bbaf2

Merge branch 'master' into erf(x)-implementation

5ad8278

reversed NaN check

26b3b1f

AhmedYKadah changed the title ~~Added erf(x) Float64 Julia implementation~~ Added erf(x) Float64 and Float32 Julia implementations Sep 14, 2025