Disney BSSRDF, sample scattering profile, upper

Percentage Accurate: 95.8% → 97.9%
Time: 44.5s
Alternatives: 9
Speedup: 1.4×

Specification

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* (* 3.0 s) (log (/ 1.0 (- 1.0 (/ (- u 0.25) 0.75))))))
float code(float s, float u) {
	return (3.0f * s) * logf((1.0f / (1.0f - ((u - 0.25f) / 0.75f))));
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = (3.0e0 * s) * log((1.0e0 / (1.0e0 - ((u - 0.25e0) / 0.75e0))))
end function
function code(s, u)
	return Float32(Float32(Float32(3.0) * s) * log(Float32(Float32(1.0) / Float32(Float32(1.0) - Float32(Float32(u - Float32(0.25)) / Float32(0.75))))))
end
function tmp = code(s, u)
	tmp = (single(3.0) * s) * log((single(1.0) / (single(1.0) - ((u - single(0.25)) / single(0.75)))));
end
\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right)

Local Percentage Accuracy

The average percentage accuracy by input value. The horizontal axis shows the value of an input variable (the variable is chosen in the title); the vertical axis is accuracy, and higher is better. Red represents the original program, while blue represents Herbie's suggestion; these can be toggled with the buttons below the plot. The line is an average, while the dots represent individual samples.

Accuracy vs Speed

Herbie found 9 alternatives:

The accuracy (vertical axis) and speed (horizontal axis) of each alternative. Up and to the right is better. The red square shows the initial program, and each blue circle shows an alternative. The line shows the best available speed-accuracy tradeoffs.

Initial Program: 95.8% accurate, 1.0× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* (* 3.0 s) (log (/ 1.0 (- 1.0 (/ (- u 0.25) 0.75))))))
float code(float s, float u) {
	return (3.0f * s) * logf((1.0f / (1.0f - ((u - 0.25f) / 0.75f))));
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = (3.0e0 * s) * log((1.0e0 / (1.0e0 - ((u - 0.25e0) / 0.75e0))))
end function
function code(s, u)
	return Float32(Float32(Float32(3.0) * s) * log(Float32(Float32(1.0) / Float32(Float32(1.0) - Float32(Float32(u - Float32(0.25)) / Float32(0.75))))))
end
function tmp = code(s, u)
	tmp = (single(3.0) * s) * log((single(1.0) / (single(1.0) - ((u - single(0.25)) / single(0.75)))));
end
\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right)

Alternative 1: 97.9% accurate, 1.2× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[s \cdot \left(-3 \cdot \mathsf{log1p}\left(\mathsf{fma}\left(u, -1.3333333333333333, 0.3333333333333333\right)\right)\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* s (* -3.0 (log1p (fma u -1.3333333333333333 0.3333333333333333)))))
float code(float s, float u) {
	return s * (-3.0f * log1pf(fmaf(u, -1.3333333333333333f, 0.3333333333333333f)));
}
function code(s, u)
	return Float32(s * Float32(Float32(-3.0) * log1p(fma(u, Float32(-1.3333333333333333), Float32(0.3333333333333333)))))
end
s \cdot \left(-3 \cdot \mathsf{log1p}\left(\mathsf{fma}\left(u, -1.3333333333333333, 0.3333333333333333\right)\right)\right)
Derivation
  1. Initial program 95.8%

    \[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
  2. Applied rewrites 96.2%

    \[\leadsto s \cdot \left(-3 \cdot \log \left(\left(u - 1\right) \cdot -1.3333333333333333\right)\right) \]
  3. Applied rewrites 95.4%

    \[\leadsto s \cdot \left(-3 \cdot \left(\log 1.3333333333333333 + \log \left(\left|1 - u\right|\right)\right)\right) \]
  4. Applied rewrites 95.6%

    \[\leadsto s \cdot \left(-3 \cdot \left(\log 1.3333333333333333 + \log \left(\left|1 - 2.6666666666666665 \cdot \left(0.375 \cdot u\right)\right|\right)\right)\right) \]
  5. Applied rewrites 97.9%

    \[\leadsto s \cdot \left(-3 \cdot \mathsf{log1p}\left(\mathsf{fma}\left(u, -1.3333333333333333, 0.3333333333333333\right)\right)\right) \]
  6. Add Preprocessing
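
In exact arithmetic, Alternative 1 is an algebraic rewrite of the original expression: since (u − 0.25)/0.75 = (4u − 1)/3 and log(1/x) = −log x, the whole expression equals −3s · log1p(u·(−4/3) + 1/3). A double-precision sketch confirms the two forms agree closely (a plain multiply-add stands in for fma here, which only changes the last few bits):

```python
import math

def original(s, u):
    # (3*s) * log(1 / (1 - (u - 0.25)/0.75))
    return (3.0 * s) * math.log(1.0 / (1.0 - (u - 0.25) / 0.75))

def alternative1(s, u):
    # s * (-3 * log1p(fma(u, -4/3, 1/3))), with the fma replaced
    # by a plain multiply-add for portability
    return s * (-3.0 * math.log1p(u * (-4.0 / 3.0) + 1.0 / 3.0))

for u in (0.3, 0.5, 0.9, 0.99):
    a, b = original(1.0, u), alternative1(1.0, u)
    assert abs(a - b) <= 1e-12 * abs(a)
```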

Alternative 2: 96.7% accurate, 1.4× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[-3 \cdot \left(\log \left(\mathsf{fma}\left(u, -1.3333333333333333, 1.3333333333333333\right)\right) \cdot s\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* -3.0 (* (log (fma u -1.3333333333333333 1.3333333333333333)) s)))
float code(float s, float u) {
	return -3.0f * (logf(fmaf(u, -1.3333333333333333f, 1.3333333333333333f)) * s);
}
function code(s, u)
	return Float32(Float32(-3.0) * Float32(log(fma(u, Float32(-1.3333333333333333), Float32(1.3333333333333333))) * s))
end
-3 \cdot \left(\log \left(\mathsf{fma}\left(u, -1.3333333333333333, 1.3333333333333333\right)\right) \cdot s\right)
Derivation
  1. Initial program 95.8%

    \[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
  2. Applied rewrites 96.2%

    \[\leadsto s \cdot \left(-3 \cdot \log \left(\left(u - 1\right) \cdot -1.3333333333333333\right)\right) \]
  3. Applied rewrites 96.7%

    \[\leadsto -3 \cdot \left(\log \left(\mathsf{fma}\left(u, -1.3333333333333333, 1.3333333333333333\right)\right) \cdot s\right) \]
  4. Add Preprocessing

Alternative 3: 36.6% accurate, 1.4× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[s \cdot \left(u \cdot \left(3 + u \cdot \left(1.5 + u\right)\right) - 0.8630462288856506\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* s (- (* u (+ 3.0 (* u (+ 1.5 u)))) 0.8630462288856506)))
float code(float s, float u) {
	return s * ((u * (3.0f + (u * (1.5f + u)))) - 0.8630462288856506f);
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = s * ((u * (3.0e0 + (u * (1.5e0 + u)))) - 0.8630462288856506e0)
end function
function code(s, u)
	return Float32(s * Float32(Float32(u * Float32(Float32(3.0) + Float32(u * Float32(Float32(1.5) + u)))) - Float32(0.8630462288856506)))
end
function tmp = code(s, u)
	tmp = s * ((u * (single(3.0) + (u * (single(1.5) + u)))) - single(0.8630462288856506));
end
s \cdot \left(u \cdot \left(3 + u \cdot \left(1.5 + u\right)\right) - 0.8630462288856506\right)
Derivation
  1. Initial program 95.8%

    \[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
  2. Applied rewrites 96.2%

    \[\leadsto s \cdot \left(-3 \cdot \log \left(\left(u - 1\right) \cdot -1.3333333333333333\right)\right) \]
  3. Applied rewrites 95.5%

    \[\leadsto s \cdot \mathsf{fma}\left(-3, \log \left(\left|1 - u\right|\right), -3 \cdot \log 1.3333333333333333\right) \]
  4. Evaluated real constant 96.7%

    \[\leadsto s \cdot \mathsf{fma}\left(-3, \log \left(\left|1 - u\right|\right), -0.8630462288856506\right) \]
  5. Taylor expanded in u around 0

    \[\leadsto s \cdot \left(u \cdot \left(3 + u \cdot \left(\frac{3}{2} + u\right)\right) - \frac{14479513}{16777216}\right) \]
  6. Applied rewrites 36.6%

    \[\leadsto s \cdot \left(u \cdot \left(3 + u \cdot \left(1.5 + u\right)\right) - 0.8630462288856506\right) \]
  7. Add Preprocessing
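
The sharp accuracy drop at step 6 (to 36.6%) is expected: the Taylor polynomial from step 5 is expanded around u = 0, but the precondition restricts u to [0.25, 1], and the exact function diverges as u → 1 while the cubic stays bounded. A double-precision sketch of the growing error:

```python
import math

def exact(s, u):
    # the initial program, evaluated in double precision
    return (3.0 * s) * math.log(1.0 / (1.0 - (u - 0.25) / 0.75))

def alt3(s, u):
    # Alternative 3: cubic Taylor polynomial in u with the folded constant
    return s * (u * (3.0 + u * (1.5 + u)) - 0.8630462288856506)

# absolute error grows toward u = 1, where the true value blows up
errs = [abs(exact(1.0, u) - alt3(1.0, u)) for u in (0.3, 0.6, 0.9)]
assert errs[0] < errs[1] < errs[2]
```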

Alternative 4: 32.0% accurate, 1.7× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[s \cdot \left(u \cdot \left(3 + 1.5 \cdot u\right) - 0.8630462288856506\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* s (- (* u (+ 3.0 (* 1.5 u))) 0.8630462288856506)))
float code(float s, float u) {
	return s * ((u * (3.0f + (1.5f * u))) - 0.8630462288856506f);
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = s * ((u * (3.0e0 + (1.5e0 * u))) - 0.8630462288856506e0)
end function
function code(s, u)
	return Float32(s * Float32(Float32(u * Float32(Float32(3.0) + Float32(Float32(1.5) * u))) - Float32(0.8630462288856506)))
end
function tmp = code(s, u)
	tmp = s * ((u * (single(3.0) + (single(1.5) * u))) - single(0.8630462288856506));
end
s \cdot \left(u \cdot \left(3 + 1.5 \cdot u\right) - 0.8630462288856506\right)
Derivation
  1. Initial program 95.8%

    \[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
  2. Applied rewrites 96.2%

    \[\leadsto s \cdot \left(-3 \cdot \log \left(\left(u - 1\right) \cdot -1.3333333333333333\right)\right) \]
  3. Applied rewrites 95.5%

    \[\leadsto s \cdot \mathsf{fma}\left(-3, \log \left(\left|1 - u\right|\right), -3 \cdot \log 1.3333333333333333\right) \]
  4. Evaluated real constant 96.7%

    \[\leadsto s \cdot \mathsf{fma}\left(-3, \log \left(\left|1 - u\right|\right), -0.8630462288856506\right) \]
  5. Taylor expanded in u around 0

    \[\leadsto s \cdot \left(u \cdot \left(3 + \frac{3}{2} \cdot u\right) - \frac{14479513}{16777216}\right) \]
  6. Applied rewrites 32.0%

    \[\leadsto s \cdot \left(u \cdot \left(3 + 1.5 \cdot u\right) - 0.8630462288856506\right) \]
  7. Add Preprocessing

Alternative 5: 25.6% accurate, 2.7× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[s \cdot \left(3 \cdot u - 0.8630462288856506\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* s (- (* 3.0 u) 0.8630462288856506)))
float code(float s, float u) {
	return s * ((3.0f * u) - 0.8630462288856506f);
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = s * ((3.0e0 * u) - 0.8630462288856506e0)
end function
function code(s, u)
	return Float32(s * Float32(Float32(Float32(3.0) * u) - Float32(0.8630462288856506)))
end
function tmp = code(s, u)
	tmp = s * ((single(3.0) * u) - single(0.8630462288856506));
end
s \cdot \left(3 \cdot u - 0.8630462288856506\right)
Derivation
  1. Initial program 95.8%

    \[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
  2. Applied rewrites 96.2%

    \[\leadsto s \cdot \left(-3 \cdot \log \left(\left(u - 1\right) \cdot -1.3333333333333333\right)\right) \]
  3. Applied rewrites 95.5%

    \[\leadsto s \cdot \mathsf{fma}\left(-3, \log \left(\left|1 - u\right|\right), -3 \cdot \log 1.3333333333333333\right) \]
  4. Evaluated real constant 96.7%

    \[\leadsto s \cdot \mathsf{fma}\left(-3, \log \left(\left|1 - u\right|\right), -0.8630462288856506\right) \]
  5. Taylor expanded in u around 0

    \[\leadsto s \cdot \left(3 \cdot u - \frac{14479513}{16777216}\right) \]
  6. Applied rewrites 25.6%

    \[\leadsto s \cdot \left(3 \cdot u - 0.8630462288856506\right) \]
  7. Add Preprocessing

Alternative 6: 25.6% accurate, 2.7× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[3 \cdot \left(s \cdot \left(u + -0.28768208622932434\right)\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* 3.0 (* s (+ u -0.28768208622932434))))
float code(float s, float u) {
	return 3.0f * (s * (u + -0.28768208622932434f));
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = 3.0e0 * (s * (u + (-0.28768208622932434e0)))
end function
function code(s, u)
	return Float32(Float32(3.0) * Float32(s * Float32(u + Float32(-0.28768208622932434))))
end
function tmp = code(s, u)
	tmp = single(3.0) * (s * (u + single(-0.28768208622932434)));
end
3 \cdot \left(s \cdot \left(u + -0.28768208622932434\right)\right)
Derivation
  1. Initial program 95.8%

    \[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
  2. Taylor expanded in s around 0

    \[\leadsto 3 \cdot \left(s \cdot \log \left(\frac{1}{1 - \frac{4}{3} \cdot \left(u - \frac{1}{4}\right)}\right)\right) \]
  3. Applied rewrites 95.6%

    \[\leadsto 3 \cdot \left(s \cdot \log \left(\frac{1}{1 - 1.3333333333333333 \cdot \left(u - 0.25\right)}\right)\right) \]
  4. Taylor expanded in u around 0

    \[\leadsto 3 \cdot \left(s \cdot \left(u + \log \frac{3}{4}\right)\right) \]
  5. Applied rewrites 25.6%

    \[\leadsto 3 \cdot \left(s \cdot \left(u + \log 0.75\right)\right) \]
  6. Evaluated real constant 25.6%

    \[\leadsto 3 \cdot \left(s \cdot \left(u + -0.28768208622932434\right)\right) \]
  7. Add Preprocessing

Alternative 7: 10.6% accurate, 3.7× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[0 \cdot \left(s \cdot -0.28768208622932434\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* 0.0 (* s -0.28768208622932434)))
float code(float s, float u) {
	return 0.0f * (s * -0.28768208622932434f);
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = 0.0e0 * (s * (-0.28768208622932434e0))
end function
function code(s, u)
	return Float32(Float32(0.0) * Float32(s * Float32(-0.28768208622932434)))
end
function tmp = code(s, u)
	tmp = single(0.0) * (s * single(-0.28768208622932434));
end
0 \cdot \left(s \cdot -0.28768208622932434\right)
Derivation
  1. Initial program 95.8%

    \[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
  2. Taylor expanded in u around 0

    \[\leadsto 3 \cdot \left(s \cdot \log \frac{3}{4}\right) \]
  3. Applied rewrites 7.5%

    \[\leadsto 3 \cdot \left(s \cdot \log 0.75\right) \]
  4. Evaluated real constant 7.5%

    \[\leadsto 3 \cdot \left(s \cdot -0.28768208622932434\right) \]
  5. Taylor expanded in undef-var around zero

    \[\leadsto 0 \cdot \left(s \cdot -0.28768208622932434\right) \]
  6. Applied rewrites 10.6%

    \[\leadsto 0 \cdot \left(s \cdot -0.28768208622932434\right) \]
  7. Add Preprocessing

Alternative 8: 7.5% accurate, 6.4× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[s \cdot -0.8630462288856506 \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* s -0.8630462288856506))
float code(float s, float u) {
	return s * -0.8630462288856506f;
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = s * (-0.8630462288856506e0)
end function
function code(s, u)
	return Float32(s * Float32(-0.8630462288856506))
end
function tmp = code(s, u)
	tmp = s * single(-0.8630462288856506);
end
s \cdot -0.8630462288856506
Derivation
  1. Initial program 95.8%

    \[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
  2. Applied rewrites 96.2%

    \[\leadsto s \cdot \left(-3 \cdot \log \left(\left(u - 1\right) \cdot -1.3333333333333333\right)\right) \]
  3. Applied rewrites 95.5%

    \[\leadsto s \cdot \mathsf{fma}\left(-3, \log \left(\left|1 - u\right|\right), -3 \cdot \log 1.3333333333333333\right) \]
  4. Evaluated real constant 96.7%

    \[\leadsto s \cdot \mathsf{fma}\left(-3, \log \left(\left|1 - u\right|\right), -0.8630462288856506\right) \]
  5. Taylor expanded in u around 0

    \[\leadsto s \cdot \frac{-14479513}{16777216} \]
  6. Applied rewrites 7.5%

    \[\leadsto s \cdot -0.8630462288856506 \]
  7. Add Preprocessing

Alternative 9: 7.5% accurate, 6.4× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(0.25 \leq u \land u \leq 1\right)\]
\[-0.863046258687973 \cdot s \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* -0.863046258687973 s))
float code(float s, float u) {
	return -0.863046258687973f * s;
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = (-0.863046258687973e0) * s
end function
function code(s, u)
	return Float32(Float32(-0.863046258687973) * s)
end
function tmp = code(s, u)
	tmp = single(-0.863046258687973) * s;
end
-0.863046258687973 \cdot s
Derivation
  1. Initial program 95.8%

    \[\left(3 \cdot s\right) \cdot \log \left(\frac{1}{1 - \frac{u - 0.25}{0.75}}\right) \]
  2. Taylor expanded in u around 0

    \[\leadsto 3 \cdot \left(s \cdot \log \frac{3}{4}\right) \]
  3. Applied rewrites 7.5%

    \[\leadsto 3 \cdot \left(s \cdot \log 0.75\right) \]
  4. Evaluated real constant 7.5%

    \[\leadsto 3 \cdot \left(s \cdot -0.28768208622932434\right) \]
  5. Applied rewrites 7.5%

    \[\leadsto -0.863046258687973 \cdot s \]
  6. Add Preprocessing
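
Alternatives 8 and 9 have the same form but slightly different constants, because the two derivations round at different points: Alternative 8 evaluates the real constant 3 · log(4/3) and rounds once (to 14479513/16777216, as in step 5 of its derivation), while Alternative 9 first rounds log(0.75) to the Alternative 6 constant −0.28768208622932434 and then multiplies by 3. A quick double-precision check of these relationships:

```python
import math

# Alternative 6's constant is log(0.75) rounded to binary32,
# so it matches the real value only to about one float32 ulp
assert abs(math.log(0.75) - (-0.28768208622932434)) < 6e-8

# Alternative 8's constant rounds the real value 3*log(4/3) once
assert abs(3.0 * math.log(4.0 / 3.0) - 0.8630462288856506) < 6e-8

# Alternative 9's constant is 3 times the already-rounded constant,
# which is why it differs from Alternative 8's in the 8th digit
assert abs(3.0 * -0.28768208622932434 - (-0.863046258687973)) < 1e-12
```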

Reproduce

herbie shell --seed 2026089 +o generate:egglog
(FPCore (s u)
  :name "Disney BSSRDF, sample scattering profile, upper"
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 0.25 u) (<= u 1.0)))
  (* (* 3.0 s) (log (/ 1.0 (- 1.0 (/ (- u 0.25) 0.75))))))