Disney BSSRDF, sample scattering profile, lower

Percentage Accurate: 61.2% → 99.3%
Time: 4.4s
Alternatives: 10
Speedup: 2.8×

Specification

\[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
\[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0))
     (and (<= 2.328306437e-10 u) (<= u 0.25)))
  (* s (log (/ 1.0 (- 1.0 (* 4.0 u))))))
float code(float s, float u) {
	return s * logf((1.0f / (1.0f - (4.0f * u))));
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = s * log((1.0e0 / (1.0e0 - (4.0e0 * u))))
end function
function code(s, u)
	return Float32(s * log(Float32(Float32(1.0) / Float32(Float32(1.0) - Float32(Float32(4.0) * u)))))
end
function tmp = code(s, u)
	tmp = s * log((single(1.0) / (single(1.0) - (single(4.0) * u))));
end
s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right)

Local Percentage Accuracy vs Input Value

The average percentage accuracy by input value. The horizontal axis shows the value of an input variable; the variable is chosen in the title. The vertical axis is accuracy; higher is better. Red represents the original program, while blue represents Herbie's suggestion. These can be toggled with the buttons below the plot. The line is an average, while the dots represent individual samples.

Accuracy vs Speed

Herbie found 10 alternatives:

Alternative    Accuracy    Speedup
The accuracy (vertical axis) and speed (horizontal axis) of each alternative. Up and to the right is better. The red square shows the initial program, and each blue circle shows an alternative. The line shows the best available speed-accuracy tradeoffs.
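The "best available tradeoffs" line can be read as a Pareto frontier: a program is on it when no other program is at least as accurate and at least as fast while being strictly better on one of the two axes. The sketch below applies that rule to the accuracy and speedup figures listed in this report for the initial program and Alternatives 1 to 9 (the tenth alternative is left out because its figures are not reproduced here). The `Point` type and `is_dominated` helper are illustrative names, and the dominance rule is an assumption about how the line is drawn, not Herbie's documented algorithm.

```c
#include <stddef.h>

typedef struct {
    const char *name;
    double accuracy;  /* percent, higher is better */
    double speedup;   /* relative to the initial program, higher is better */
} Point;

static const Point POINTS[] = {
    {"Initial", 61.2, 1.0}, {"Alt 1", 99.3, 0.9}, {"Alt 2", 98.4, 0.7},
    {"Alt 3", 98.1, 0.8},   {"Alt 4", 98.1, 0.8}, {"Alt 5", 97.2, 0.9},
    {"Alt 6", 88.5, 1.4},   {"Alt 7", 86.6, 1.6}, {"Alt 8", 86.6, 1.6},
    {"Alt 9", 74.0, 2.8},
};
static const size_t N_POINTS = sizeof POINTS / sizeof POINTS[0];

/* Returns 1 if some other point is no worse on both axes and strictly
   better on at least one; such a point lies off the frontier. */
int is_dominated(size_t i) {
    for (size_t j = 0; j < N_POINTS; j++) {
        if (j == i) continue;
        int no_worse = POINTS[j].accuracy >= POINTS[i].accuracy &&
                       POINTS[j].speedup  >= POINTS[i].speedup;
        int better   = POINTS[j].accuracy >  POINTS[i].accuracy ||
                       POINTS[j].speedup  >  POINTS[i].speedup;
        if (no_worse && better) return 1;
    }
    return 0;
}
```

Under this rule the undominated programs are Alternatives 1, 6, 7, 8 and 9; the initial program is dominated by Alternative 6, which is both faster and more accurate.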

Initial Program: 61.2% accurate, 1.0× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
\[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0))
     (and (<= 2.328306437e-10 u) (<= u 0.25)))
  (* s (log (/ 1.0 (- 1.0 (* 4.0 u))))))
float code(float s, float u) {
	return s * logf((1.0f / (1.0f - (4.0f * u))));
}
real(4) function code(s, u)
use fmin_fmax_functions
    real(4), intent (in) :: s
    real(4), intent (in) :: u
    code = s * log((1.0e0 / (1.0e0 - (4.0e0 * u))))
end function
function code(s, u)
	return Float32(s * log(Float32(Float32(1.0) / Float32(Float32(1.0) - Float32(Float32(4.0) * u)))))
end
function tmp = code(s, u)
	tmp = s * log((single(1.0) / (single(1.0) - (single(4.0) * u))));
end
s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right)

Alternative 1: 99.3% accurate, 0.9× speedup

\[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
\[\left(-s\right) \cdot \mathsf{log1p}\left(\left(-4 \cdot u\right) \cdot 1\right) \]
(FPCore (s u)
  :precision binary32
  :pre (and (and (<= 0.0 s) (<= s 256.0))
     (and (<= 2.328306437e-10 u) (<= u 0.25)))
  (* (- s) (log1p (* (* -4.0 u) 1.0))))
float code(float s, float u) {
	return -s * log1pf(((-4.0f * u) * 1.0f));
}
function code(s, u)
	return Float32(Float32(-s) * log1p(Float32(Float32(Float32(-4.0) * u) * Float32(1.0))))
end
\left(-s\right) \cdot \mathsf{log1p}\left(\left(-4 \cdot u\right) \cdot 1\right)
Derivation
  1. Initial program 61.2%

    \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
  2. Step-by-step derivation
    1. Applied rewrites 60.7%

      \[\leadsto s \cdot \log \left(\frac{1}{\frac{\mathsf{fma}\left(-4 \cdot u, \sqrt{2}, \sqrt{2}\right)}{\sqrt{2}}}\right) \]
    2. Taylor expanded in s around 0

      \[\leadsto s \cdot \log \left(\frac{\sqrt{2}}{\sqrt{2} + -4 \cdot \left(u \cdot \sqrt{2}\right)}\right) \]
    3. Step-by-step derivation
      1. Applied rewrites 61.0%

        \[\leadsto s \cdot \log \left(\frac{\sqrt{2}}{\sqrt{2} + -4 \cdot \left(u \cdot \sqrt{2}\right)}\right) \]
      2. Applied rewrites 63.6%

        \[\leadsto \left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right) \]
      3. Step-by-step derivation
        1. Applied rewrites 99.3%

          \[\leadsto \left(-s\right) \cdot \mathsf{log1p}\left(\left(-4 \cdot u\right) \cdot 1\right) \]
        2. Add Preprocessing
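
The rewrite steps above are machine-generated; the idea behind the final form can be summarized by the following identity, which holds for 0 ≤ u < 1/4. It is an explanatory reconstruction, not a step emitted by Herbie.

\[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) = -s \cdot \log \left(1 - 4 \cdot u\right) = \left(-s\right) \cdot \mathsf{log1p}\left(-4 \cdot u\right) \]

Because log1p receives -4u directly, the subtraction 1 - 4u is never rounded to binary32, so the low-order bits of a small u are not discarded before the logarithm is taken.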

        Alternative 2: 98.4% accurate, 0.7× speedup

        \[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
        \[\begin{array}{l} \mathbf{if}\;4 \cdot u \leq 0.01600000075995922:\\ \;\;\;\;s \cdot \mathsf{fma}\left(u \cdot u, \mathsf{fma}\left(21.333333333333332, u, 8\right), u \cdot 4\right)\\ \mathbf{else}:\\ \;\;\;\;\left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right)\\ \end{array} \]
        (FPCore (s u)
          :precision binary32
          :pre (and (and (<= 0.0 s) (<= s 256.0))
             (and (<= 2.328306437e-10 u) (<= u 0.25)))
          (if (<= (* 4.0 u) 0.01600000075995922)
          (* s (fma (* u u) (fma 21.333333333333332 u 8.0) (* u 4.0)))
          (* (- s) (log (fma -4.0 u 1.0)))))
        float code(float s, float u) {
        	float tmp;
        	if ((4.0f * u) <= 0.01600000075995922f) {
        		tmp = s * fmaf((u * u), fmaf(21.333333333333332f, u, 8.0f), (u * 4.0f));
        	} else {
        		tmp = -s * logf(fmaf(-4.0f, u, 1.0f));
        	}
        	return tmp;
        }
        
        function code(s, u)
        	tmp = Float32(0.0)
        	if (Float32(Float32(4.0) * u) <= Float32(0.01600000075995922))
        		tmp = Float32(s * fma(Float32(u * u), fma(Float32(21.333333333333332), u, Float32(8.0)), Float32(u * Float32(4.0))));
        	else
        		tmp = Float32(Float32(-s) * log(fma(Float32(-4.0), u, Float32(1.0))));
        	end
        	return tmp
        end
        
        \begin{array}{l}
        \mathbf{if}\;4 \cdot u \leq 0.01600000075995922:\\
        \;\;\;\;s \cdot \mathsf{fma}\left(u \cdot u, \mathsf{fma}\left(21.333333333333332, u, 8\right), u \cdot 4\right)\\
        
        \mathbf{else}:\\
        \;\;\;\;\left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right)\\
        
        
        \end{array}
        
        Derivation
        1. Split input into 2 regimes
        2. if (*.f32 #s(literal 4 binary32) u) < 0.0160000008

          1. Initial program 61.2%

            \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
          2. Taylor expanded in u around 0

            \[\leadsto s \cdot \left(u \cdot \left(4 + u \cdot \left(8 + \frac{64}{3} \cdot u\right)\right)\right) \]
          3. Step-by-step derivation
            1. Applied rewrites 90.9%

              \[\leadsto s \cdot \left(u \cdot \left(4 + u \cdot \left(8 + 21.333333333333332 \cdot u\right)\right)\right) \]
            2. Step-by-step derivation
              1. Applied rewrites 91.1%

                \[\leadsto s \cdot \mathsf{fma}\left(u \cdot u, \mathsf{fma}\left(21.333333333333332, u, 8\right), u \cdot 4\right) \]

              if 0.0160000008 < (*.f32 #s(literal 4 binary32) u)

              1. Initial program 61.2%

                \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
              2. Step-by-step derivation
                1. Applied rewrites 60.7%

                  \[\leadsto s \cdot \log \left(\frac{1}{\frac{\mathsf{fma}\left(-4 \cdot u, \sqrt{2}, \sqrt{2}\right)}{\sqrt{2}}}\right) \]
                2. Taylor expanded in s around 0

                  \[\leadsto s \cdot \log \left(\frac{\sqrt{2}}{\sqrt{2} + -4 \cdot \left(u \cdot \sqrt{2}\right)}\right) \]
                3. Step-by-step derivation
                  1. Applied rewrites 61.0%

                    \[\leadsto s \cdot \log \left(\frac{\sqrt{2}}{\sqrt{2} + -4 \cdot \left(u \cdot \sqrt{2}\right)}\right) \]
                  2. Applied rewrites 63.6%

                    \[\leadsto \left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right) \]
                4. Recombined 2 regimes into one program.
                5. Add Preprocessing
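
The small-u branch comes from the Maclaurin series of the logarithm. Writing x = 4u, the first three terms reproduce the polynomial in the derivation above; the error estimate that follows is a back-of-the-envelope reconstruction, not a figure reported by Herbie.

\[\log \left(\frac{1}{1 - x}\right) = x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \cdots \quad\Longrightarrow\quad \log \left(\frac{1}{1 - 4 \cdot u}\right) = 4 \cdot u + 8 \cdot u^2 + \frac{64}{3} \cdot u^3 + 64 \cdot u^4 + \cdots \]

Dropping the terms from x^4/4 onward leaves a relative error of roughly x^3/4, which is about 1e-6 at the branch threshold x = 0.016. On the other branch, rounding 1 - 4u to binary32 can perturb it by up to about 3e-8, which is roughly 2e-6 relative to a result near 0.016. The two branches therefore have comparable error where they meet, which is plausibly why the split lands near this value.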

                Alternative 3: 98.1% accurate, 0.8× speedup

                \[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
                \[\begin{array}{l} \mathbf{if}\;4 \cdot u \leq 0.01600000075995922:\\ \;\;\;\;s \cdot \left(\mathsf{fma}\left(\mathsf{fma}\left(21.333333333333332, u, 8\right), u, 4\right) \cdot u\right)\\ \mathbf{else}:\\ \;\;\;\;\left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right)\\ \end{array} \]
                (FPCore (s u)
                  :precision binary32
                  :pre (and (and (<= 0.0 s) (<= s 256.0))
                     (and (<= 2.328306437e-10 u) (<= u 0.25)))
                  (if (<= (* 4.0 u) 0.01600000075995922)
                  (* s (* (fma (fma 21.333333333333332 u 8.0) u 4.0) u))
                  (* (- s) (log (fma -4.0 u 1.0)))))
                float code(float s, float u) {
                	float tmp;
                	if ((4.0f * u) <= 0.01600000075995922f) {
                		tmp = s * (fmaf(fmaf(21.333333333333332f, u, 8.0f), u, 4.0f) * u);
                	} else {
                		tmp = -s * logf(fmaf(-4.0f, u, 1.0f));
                	}
                	return tmp;
                }
                
                function code(s, u)
                	tmp = Float32(0.0)
                	if (Float32(Float32(4.0) * u) <= Float32(0.01600000075995922))
                		tmp = Float32(s * Float32(fma(fma(Float32(21.333333333333332), u, Float32(8.0)), u, Float32(4.0)) * u));
                	else
                		tmp = Float32(Float32(-s) * log(fma(Float32(-4.0), u, Float32(1.0))));
                	end
                	return tmp
                end
                
                \begin{array}{l}
                \mathbf{if}\;4 \cdot u \leq 0.01600000075995922:\\
                \;\;\;\;s \cdot \left(\mathsf{fma}\left(\mathsf{fma}\left(21.333333333333332, u, 8\right), u, 4\right) \cdot u\right)\\
                
                \mathbf{else}:\\
                \;\;\;\;\left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right)\\
                
                
                \end{array}
                
                Derivation
                1. Split input into 2 regimes
                2. if (*.f32 #s(literal 4 binary32) u) < 0.0160000008

                  1. Initial program 61.2%

                    \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                  2. Taylor expanded in u around 0

                    \[\leadsto s \cdot \left(u \cdot \left(4 + u \cdot \left(8 + \frac{64}{3} \cdot u\right)\right)\right) \]
                  3. Step-by-step derivation
                    1. Applied rewrites 90.9%

                      \[\leadsto s \cdot \left(u \cdot \left(4 + u \cdot \left(8 + 21.333333333333332 \cdot u\right)\right)\right) \]
                    2. Applied rewrites 90.9%

                      \[\leadsto s \cdot \left(\mathsf{fma}\left(\mathsf{fma}\left(21.333333333333332, u, 8\right), u, 4\right) \cdot u\right) \]

                    if 0.0160000008 < (*.f32 #s(literal 4 binary32) u)

                    1. Initial program 61.2%

                      \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                    2. Step-by-step derivation
                      1. Applied rewrites 60.7%

                        \[\leadsto s \cdot \log \left(\frac{1}{\frac{\mathsf{fma}\left(-4 \cdot u, \sqrt{2}, \sqrt{2}\right)}{\sqrt{2}}}\right) \]
                      2. Taylor expanded in s around 0

                        \[\leadsto s \cdot \log \left(\frac{\sqrt{2}}{\sqrt{2} + -4 \cdot \left(u \cdot \sqrt{2}\right)}\right) \]
                      3. Step-by-step derivation
                        1. Applied rewrites 61.0%

                          \[\leadsto s \cdot \log \left(\frac{\sqrt{2}}{\sqrt{2} + -4 \cdot \left(u \cdot \sqrt{2}\right)}\right) \]
                        2. Applied rewrites 63.6%

                          \[\leadsto \left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right) \]
                      4. Recombined 2 regimes into one program.
                      5. Add Preprocessing

                      Alternative 4: 98.1% accurate, 0.8× speedup

                      \[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
                      \[\begin{array}{l} \mathbf{if}\;4 \cdot u \leq 0.01600000075995922:\\ \;\;\;\;\left(\mathsf{fma}\left(\mathsf{fma}\left(21.333333333333332, u, 8\right), u, 4\right) \cdot s\right) \cdot u\\ \mathbf{else}:\\ \;\;\;\;\left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right)\\ \end{array} \]
                      (FPCore (s u)
                        :precision binary32
                        :pre (and (and (<= 0.0 s) (<= s 256.0))
                           (and (<= 2.328306437e-10 u) (<= u 0.25)))
                        (if (<= (* 4.0 u) 0.01600000075995922)
                        (* (* (fma (fma 21.333333333333332 u 8.0) u 4.0) s) u)
                        (* (- s) (log (fma -4.0 u 1.0)))))
                      float code(float s, float u) {
                      	float tmp;
                      	if ((4.0f * u) <= 0.01600000075995922f) {
                      		tmp = (fmaf(fmaf(21.333333333333332f, u, 8.0f), u, 4.0f) * s) * u;
                      	} else {
                      		tmp = -s * logf(fmaf(-4.0f, u, 1.0f));
                      	}
                      	return tmp;
                      }
                      
                      function code(s, u)
                      	tmp = Float32(0.0)
                      	if (Float32(Float32(4.0) * u) <= Float32(0.01600000075995922))
                      		tmp = Float32(Float32(fma(fma(Float32(21.333333333333332), u, Float32(8.0)), u, Float32(4.0)) * s) * u);
                      	else
                      		tmp = Float32(Float32(-s) * log(fma(Float32(-4.0), u, Float32(1.0))));
                      	end
                      	return tmp
                      end
                      
                      \begin{array}{l}
                      \mathbf{if}\;4 \cdot u \leq 0.01600000075995922:\\
                      \;\;\;\;\left(\mathsf{fma}\left(\mathsf{fma}\left(21.333333333333332, u, 8\right), u, 4\right) \cdot s\right) \cdot u\\
                      
                      \mathbf{else}:\\
                      \;\;\;\;\left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right)\\
                      
                      
                      \end{array}
                      
                      Derivation
                      1. Split input into 2 regimes
                      2. if (*.f32 #s(literal 4 binary32) u) < 0.0160000008

                        1. Initial program 61.2%

                          \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                        2. Taylor expanded in u around 0

                          \[\leadsto u \cdot \left(4 \cdot s + u \cdot \left(8 \cdot s + \frac{64}{3} \cdot \left(s \cdot u\right)\right)\right) \]
                        3. Step-by-step derivation
                          1. Applied rewrites 91.2%

                            \[\leadsto u \cdot \mathsf{fma}\left(4, s, u \cdot \mathsf{fma}\left(8, s, 21.333333333333332 \cdot \left(s \cdot u\right)\right)\right) \]
                          2. Taylor expanded in s around 0

                            \[\leadsto u \cdot \left(s \cdot \left(4 + u \cdot \left(8 + \frac{64}{3} \cdot u\right)\right)\right) \]
                          3. Step-by-step derivation
                            1. Applied rewrites 90.9%

                              \[\leadsto u \cdot \left(s \cdot \left(4 + u \cdot \left(8 + 21.333333333333332 \cdot u\right)\right)\right) \]
                            2. Applied rewrites 90.9%

                              \[\leadsto \left(\mathsf{fma}\left(\mathsf{fma}\left(21.333333333333332, u, 8\right), u, 4\right) \cdot s\right) \cdot u \]

                            if 0.0160000008 < (*.f32 #s(literal 4 binary32) u)

                            1. Initial program 61.2%

                              \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                            2. Step-by-step derivation
                              1. Applied rewrites 60.7%

                                \[\leadsto s \cdot \log \left(\frac{1}{\frac{\mathsf{fma}\left(-4 \cdot u, \sqrt{2}, \sqrt{2}\right)}{\sqrt{2}}}\right) \]
                              2. Taylor expanded in s around 0

                                \[\leadsto s \cdot \log \left(\frac{\sqrt{2}}{\sqrt{2} + -4 \cdot \left(u \cdot \sqrt{2}\right)}\right) \]
                              3. Step-by-step derivation
                                1. Applied rewrites 61.0%

                                  \[\leadsto s \cdot \log \left(\frac{\sqrt{2}}{\sqrt{2} + -4 \cdot \left(u \cdot \sqrt{2}\right)}\right) \]
                                2. Applied rewrites 63.6%

                                  \[\leadsto \left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right) \]
                              4. Recombined 2 regimes into one program.
                              5. Add Preprocessing

                              Alternative 5: 97.2% accurate, 0.9× speedup

                              \[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
                              \[\begin{array}{l} \mathbf{if}\;4 \cdot u \leq 0.005200000014156103:\\ \;\;\;\;s \cdot \frac{1}{-0.5 + \frac{0.25}{u}}\\ \mathbf{else}:\\ \;\;\;\;\left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right)\\ \end{array} \]
                              (FPCore (s u)
                                :precision binary32
                                :pre (and (and (<= 0.0 s) (<= s 256.0))
                                   (and (<= 2.328306437e-10 u) (<= u 0.25)))
                                (if (<= (* 4.0 u) 0.005200000014156103)
                                (* s (/ 1.0 (+ -0.5 (/ 0.25 u))))
                                (* (- s) (log (fma -4.0 u 1.0)))))
                              float code(float s, float u) {
                              	float tmp;
                              	if ((4.0f * u) <= 0.005200000014156103f) {
                              		tmp = s * (1.0f / (-0.5f + (0.25f / u)));
                              	} else {
                              		tmp = -s * logf(fmaf(-4.0f, u, 1.0f));
                              	}
                              	return tmp;
                              }
                              
                              function code(s, u)
                              	tmp = Float32(0.0)
                              	if (Float32(Float32(4.0) * u) <= Float32(0.005200000014156103))
                              		tmp = Float32(s * Float32(Float32(1.0) / Float32(Float32(-0.5) + Float32(Float32(0.25) / u))));
                              	else
                              		tmp = Float32(Float32(-s) * log(fma(Float32(-4.0), u, Float32(1.0))));
                              	end
                              	return tmp
                              end
                              
                              \begin{array}{l}
                              \mathbf{if}\;4 \cdot u \leq 0.005200000014156103:\\
                              \;\;\;\;s \cdot \frac{1}{-0.5 + \frac{0.25}{u}}\\
                              
                              \mathbf{else}:\\
                              \;\;\;\;\left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right)\\
                              
                              
                              \end{array}
                              
                              Derivation
                              1. Split input into 2 regimes
                              2. if (*.f32 #s(literal 4 binary32) u) < 0.00520000001

                                1. Initial program 61.2%

                                  \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                                2. Step-by-step derivation
                                  1. Applied rewrites 63.6%

                                    \[\leadsto s \cdot \frac{1}{\frac{2}{-2 \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right)}} \]
                                  2. Taylor expanded in u around 0

                                    \[\leadsto s \cdot \frac{1}{\frac{\frac{1}{4} + \frac{-1}{2} \cdot u}{u}} \]
                                  3. Step-by-step derivation
                                    1. Applied rewrites 88.5%

                                      \[\leadsto s \cdot \frac{1}{\frac{0.25 + -0.5 \cdot u}{u}} \]
                                    2. Step-by-step derivation
                                      1. Applied rewrites 88.5%

                                        \[\leadsto s \cdot \frac{1}{-0.5 + \frac{0.25}{u}} \]

                                      if 0.00520000001 < (*.f32 #s(literal 4 binary32) u)

                                      1. Initial program 61.2%

                                        \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                                      2. Step-by-step derivation
                                        1. Applied rewrites 60.7%

                                          \[\leadsto s \cdot \log \left(\frac{1}{\frac{\mathsf{fma}\left(-4 \cdot u, \sqrt{2}, \sqrt{2}\right)}{\sqrt{2}}}\right) \]
                                        2. Taylor expanded in s around 0

                                          \[\leadsto s \cdot \log \left(\frac{\sqrt{2}}{\sqrt{2} + -4 \cdot \left(u \cdot \sqrt{2}\right)}\right) \]
                                        3. Step-by-step derivation
                                          1. Applied rewrites 61.0%

                                            \[\leadsto s \cdot \log \left(\frac{\sqrt{2}}{\sqrt{2} + -4 \cdot \left(u \cdot \sqrt{2}\right)}\right) \]
                                          2. Applied rewrites 63.6%

                                            \[\leadsto \left(-s\right) \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right) \]
                                        4. Recombined 2 regimes into one program.
                                        5. Add Preprocessing
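
The rational small-u branch is easier to interpret after simplifying it and expanding in u. The algebra below is an explanatory reconstruction rather than output from Herbie.

\[\frac{1}{-0.5 + \frac{0.25}{u}} = \frac{u}{0.25 - 0.5 \cdot u} = \frac{4 \cdot u}{1 - 2 \cdot u} = 4 \cdot u + 8 \cdot u^2 + 16 \cdot u^3 + \cdots \]

This agrees with the series 4u + 8u^2 + (64/3)u^3 + ... of the original expression through the quadratic term, and its cubic coefficient (16) partly accounts for the true one (64/3). The first mismatch is therefore of order u^3, which may explain why this branch is only used below the tighter threshold 4u ≤ 0.0052.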

                                        Alternative 6: 88.5% accurate, 1.4× speedup

                                        \[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
                                        \[s \cdot \frac{1}{-0.5 + \frac{0.25}{u}} \]
                                        (FPCore (s u)
                                          :precision binary32
                                          :pre (and (and (<= 0.0 s) (<= s 256.0))
                                             (and (<= 2.328306437e-10 u) (<= u 0.25)))
                                          (* s (/ 1.0 (+ -0.5 (/ 0.25 u)))))
                                        float code(float s, float u) {
                                        	return s * (1.0f / (-0.5f + (0.25f / u)));
                                        }
                                        
                                        real(4) function code(s, u)
                                        use fmin_fmax_functions
                                            real(4), intent (in) :: s
                                            real(4), intent (in) :: u
                                            code = s * (1.0e0 / ((-0.5e0) + (0.25e0 / u)))
                                        end function
                                        
                                        function code(s, u)
                                        	return Float32(s * Float32(Float32(1.0) / Float32(Float32(-0.5) + Float32(Float32(0.25) / u))))
                                        end
                                        
                                        function tmp = code(s, u)
                                        	tmp = s * (single(1.0) / (single(-0.5) + (single(0.25) / u)));
                                        end
                                        
                                        s \cdot \frac{1}{-0.5 + \frac{0.25}{u}}
                                        
                                        Derivation
                                        1. Initial program 61.2%

                                          \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                                        2. Step-by-step derivation
                                          1. Applied rewrites 63.6%

                                            \[\leadsto s \cdot \frac{1}{\frac{2}{-2 \cdot \log \left(\mathsf{fma}\left(-4, u, 1\right)\right)}} \]
                                          2. Taylor expanded in u around 0

                                            \[\leadsto s \cdot \frac{1}{\frac{\frac{1}{4} + \frac{-1}{2} \cdot u}{u}} \]
                                          3. Step-by-step derivation
                                            1. Applied rewrites 88.5%

                                              \[\leadsto s \cdot \frac{1}{\frac{0.25 + -0.5 \cdot u}{u}} \]
                                            2. Step-by-step derivation
                                              1. Applied rewrites 88.5%

                                                \[\leadsto s \cdot \frac{1}{-0.5 + \frac{0.25}{u}} \]
                                              2. Add Preprocessing

                                              Alternative 7: 86.6% accurate, 1.6× speedup

                                              \[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
                                              \[s \cdot \left(\mathsf{fma}\left(8, u, 4\right) \cdot u\right) \]
                                              (FPCore (s u)
                                                :precision binary32
                                                :pre (and (and (<= 0.0 s) (<= s 256.0))
                                                   (and (<= 2.328306437e-10 u) (<= u 0.25)))
                                                (* s (* (fma 8.0 u 4.0) u)))
                                              float code(float s, float u) {
                                              	return s * (fmaf(8.0f, u, 4.0f) * u);
                                              }
                                              
                                              function code(s, u)
                                              	return Float32(s * Float32(fma(Float32(8.0), u, Float32(4.0)) * u))
                                              end
                                              
                                              s \cdot \left(\mathsf{fma}\left(8, u, 4\right) \cdot u\right)
                                              
                                              Derivation
                                              1. Initial program 61.2%

                                                \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                                              2. Taylor expanded in u around 0

                                                \[\leadsto s \cdot \left(u \cdot \left(4 + 8 \cdot u\right)\right) \]
                                              3. Step-by-step derivation
                                                1. Applied rewrites 86.6%

                                                  \[\leadsto s \cdot \left(u \cdot \left(4 + 8 \cdot u\right)\right) \]
                                                2. Step-by-step derivation
                                                  1. Applied rewrites 86.6%

                                                    \[\leadsto s \cdot \left(\mathsf{fma}\left(8, u, 4\right) \cdot u\right) \]
                                                  2. Add Preprocessing

                                                  Alternative 8: 86.6% accurate, 1.6× speedup

                                                  \[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
                                                  \[\left(\mathsf{fma}\left(8, u, 4\right) \cdot s\right) \cdot u \]
                                                  (FPCore (s u)
                                                    :precision binary32
                                                    :pre (and (and (<= 0.0 s) (<= s 256.0))
                                                       (and (<= 2.328306437e-10 u) (<= u 0.25)))
                                                    (* (* (fma 8.0 u 4.0) s) u))
                                                  float code(float s, float u) {
                                                  	return (fmaf(8.0f, u, 4.0f) * s) * u;
                                                  }
                                                  
                                                  function code(s, u)
                                                  	return Float32(Float32(fma(Float32(8.0), u, Float32(4.0)) * s) * u)
                                                  end
                                                  
                                                  \left(\mathsf{fma}\left(8, u, 4\right) \cdot s\right) \cdot u
                                                  
                                                  Derivation
                                                  1. Initial program 61.2%

                                                    \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                                                  2. Taylor expanded in u around 0

                                                    \[\leadsto u \cdot \left(4 \cdot s + 8 \cdot \left(s \cdot u\right)\right) \]
                                                  3. Step-by-step derivation
                                                    1. Applied rewrites 86.8%

                                                      \[\leadsto u \cdot \mathsf{fma}\left(4, s, 8 \cdot \left(s \cdot u\right)\right) \]
                                                    2. Taylor expanded in s around 0

                                                      \[\leadsto u \cdot \left(s \cdot \left(4 + 8 \cdot u\right)\right) \]
                                                    3. Step-by-step derivation
                                                      1. Applied rewrites 86.6%

                                                        \[\leadsto u \cdot \left(s \cdot \left(4 + 8 \cdot u\right)\right) \]
                                                      2. Applied rewrites 86.6%

                                                        \[\leadsto \left(\mathsf{fma}\left(8, u, 4\right) \cdot s\right) \cdot u \]
                                                      3. Add Preprocessing

                                                      Alternative 9: 74.0% accurate, 2.8× speedup

                                                      \[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
                                                      \[s \cdot \left(u \cdot 4\right) \]
                                                      (FPCore (s u)
                                                        :precision binary32
                                                        :pre (and (and (<= 0.0 s) (<= s 256.0))
                                                           (and (<= 2.328306437e-10 u) (<= u 0.25)))
                                                        (* s (* u 4.0)))
                                                      float code(float s, float u) {
                                                      	return s * (u * 4.0f);
                                                      }
                                                      
                                                      real(4) function code(s, u)
                                                      use fmin_fmax_functions
                                                          real(4), intent (in) :: s
                                                          real(4), intent (in) :: u
                                                          code = s * (u * 4.0e0)
                                                      end function
                                                      
                                                      function code(s, u)
                                                      	return Float32(s * Float32(u * Float32(4.0)))
                                                      end
                                                      
                                                      function tmp = code(s, u)
                                                      	tmp = s * (u * single(4.0));
                                                      end
                                                      
                                                      s \cdot \left(u \cdot 4\right)
                                                      
                                                      Derivation
                                                      1. Initial program 61.2%

                                                        \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                                                      2. Taylor expanded in u around 0

                                                        \[\leadsto s \cdot \left(u \cdot \left(4 + 8 \cdot u\right)\right) \]
                                                      3. Step-by-step derivation
                                                        1. Applied rewrites: 86.6%

                                                          \[\leadsto s \cdot \left(u \cdot \left(4 + 8 \cdot u\right)\right) \]
                                                        2. Taylor expanded in u around 0

                                                          \[\leadsto s \cdot \left(u \cdot 4\right) \]
                                                        3. Step-by-step derivation
                                                          1. Applied rewrites: 74.0%

                                                            \[\leadsto s \cdot \left(u \cdot 4\right) \]
                                                          2. Add Preprocessing
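The two Taylor steps above can be read off the series for the logarithm. As a worked check (an added derivation, not part of Herbie's output), for \(0 \leq 4 \cdot u < 1\):

\[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) = -s \cdot \log \left(1 - 4 \cdot u\right) = s \cdot \left(4 \cdot u + 8 \cdot u^{2} + \frac{64}{3} \cdot u^{3} + \cdots\right) \]

Keeping two terms gives \(s \cdot \left(u \cdot \left(4 + 8 \cdot u\right)\right)\). Keeping one gives \(s \cdot \left(u \cdot 4\right)\), whose relative truncation error is roughly \(2 \cdot u\) for small \(u\). This is consistent with the shorter form scoring lower than the two-term form, although the reported percentages also reflect rounding and how the inputs were sampled.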

                                                          Alternative 10: 73.8% accurate, 2.8× speedup

                                                          \[\left(0 \leq s \land s \leq 256\right) \land \left(2.328306437 \cdot 10^{-10} \leq u \land u \leq 0.25\right)\]
                                                          \[4 \cdot \left(s \cdot u\right) \]
                                                          (FPCore (s u)
                                                            :precision binary32
                                                            :pre (and (and (<= 0.0 s) (<= s 256.0))
                                                               (and (<= 2.328306437e-10 u) (<= u 0.25)))
                                                            (* 4.0 (* s u)))
                                                          float code(float s, float u) {
                                                          	return 4.0f * (s * u);
                                                          }
                                                          
                                                          real(4) function code(s, u)
                                                          use fmin_fmax_functions
                                                              real(4), intent (in) :: s
                                                              real(4), intent (in) :: u
                                                              code = 4.0e0 * (s * u)
                                                          end function
                                                          
                                                          function code(s, u)
                                                          	return Float32(Float32(4.0) * Float32(s * u))
                                                          end
                                                          
                                                          function tmp = code(s, u)
                                                          	tmp = single(4.0) * (s * u);
                                                          end
                                                          
                                                          4 \cdot \left(s \cdot u\right)
                                                          
                                                          Derivation
                                                          1. Initial program 61.2%

                                                            \[s \cdot \log \left(\frac{1}{1 - 4 \cdot u}\right) \]
                                                          2. Taylor expanded in u around 0

                                                            \[\leadsto 4 \cdot \left(s \cdot u\right) \]
                                                          3. Step-by-step derivation
                                                            1. Applied rewrites: 73.8%

                                                              \[\leadsto 4 \cdot \left(s \cdot u\right) \]
                                                            2. Add Preprocessing

                                                            Reproduce

                                                            herbie shell --seed 2026070 
                                                            (FPCore (s u)
                                                              :name "Disney BSSRDF, sample scattering profile, lower"
                                                              :precision binary32
                                                              :pre (and (and (<= 0.0 s) (<= s 256.0)) (and (<= 2.328306437e-10 u) (<= u 0.25)))
                                                              (* s (log (/ 1.0 (- 1.0 (* 4.0 u))))))