Time bar (total: 5.1s)
| 1× | search |
| True | Other | False | Iter |
|---|---|---|---|
| 0% | 0.2% | 99.8% | 0 |
| 0% | 0.2% | 99.8% | 1 |
| 0% | 0.2% | 99.8% | 2 |
| 0.1% | 0.1% | 99.8% | 3 |
| 0.1% | 0.1% | 99.8% | 4 |
| 0.1% | 0% | 99.8% | 5 |
| 0.1% | 0% | 99.8% | 6 |
| 0.1% | 0% | 99.8% | 7 |
| 0.1% | 0% | 99.8% | 8 |
| 0.2% | 0% | 99.8% | 9 |
| 0.2% | 0% | 99.8% | 10 |
| 0.2% | 0% | 99.8% | 11 |
| 0.2% | 0% | 99.8% | 12 |
| 0.2% | 0% | 99.8% | 13 |
| 0.2% | 0% | 99.8% | 14 |
Compiled 26 to 19 computations (26.9% saved)
| 1.9s | 8256× | body | 128 | valid |
Compiled 63 to 46 computations (27% saved)
| 1× | egg-herbie |
| 566× | fma-def_binary32 |
| 230× | fma-neg_binary32 |
| 88× | cancel-sign-sub-inv_binary32 |
| 79× | distribute-rgt-in_binary32 |
| 67× | distribute-lft-in_binary32 |
Useful iterations: 1 (0.0ms)
| Iter | Nodes | Cost |
|---|---|---|
| 0 | 9 | 14 |
| 1 | 22 | 13 |
| 2 | 43 | 13 |
| 3 | 77 | 13 |
| 4 | 93 | 13 |
| 5 | 131 | 13 |
| 6 | 170 | 13 |
| 7 | 217 | 13 |
| 8 | 340 | 13 |
| 9 | 471 | 13 |
| 10 | 481 | 13 |
| 11 | 557 | 13 |
| 12 | 569 | 13 |
| 13 | 676 | 13 |
| 14 | 699 | 13 |
| 15 | 700 | 13 |
| 16 | 645 | 13 |
1 alts after pruning (1 fresh and 0 done)
| | Pruned | Kept | Total |
|---|---|---|---|
| New | 1 | 1 | 2 |
| Fresh | 1 | 0 | 1 |
| Picked | 0 | 0 | 0 |
| Done | 0 | 0 | 0 |
| Total | 2 | 1 | 3 |
| Status | Error | Program |
|---|---|---|
| ▶ | 0.3b | (*.f32 (*.f32 alpha (neg.f32 alpha)) (log1p.f32 (neg.f32 u0))) |
Compiled 39 to 26 computations (33.3% saved)
Found 1 expressions with local error:
| New | Error | Program |
|---|---|---|
| ✓ | 0.3b | (*.f32 (*.f32 alpha (neg.f32 alpha)) (log1p.f32 (neg.f32 u0))) |
1 calls:
| 216.0ms | (*.f32 (*.f32 alpha (neg.f32 alpha)) (log1p.f32 (neg.f32 u0))) |
| 1× | batch-egg-rewrite |
| 522× | expm1-log1p-u_binary32 |
| 521× | log1p-expm1-u_binary32 |
| 447× | unpow-prod-down_binary32 |
| 318× | log-prod_binary32 |
| 153× | pow2_binary32 |
1 calls:
| 40.0ms | (*.f32 (*.f32 alpha (neg.f32 alpha)) (log1p.f32 (neg.f32 u0))) |
Useful iterations: 1 (0.0ms)
| Iter | Nodes | Cost |
|---|---|---|
| 0 | 8 | 13 |
| 1 | 163 | 9 |
| 2 | 1403 | 9 |
| 3 | 5265 | 9 |
| 1× | egg-herbie |
| 352× | times-frac_binary32 |
| 297× | fma-def_binary32 |
| 250× | associate-/l*_binary32 |
| 219× | distribute-rgt-out_binary32 |
| 205× | unswap-sqr_binary32 |
Useful iterations: 4 (0.0ms)
| Iter | Nodes | Cost |
|---|---|---|
| 0 | 59 | 530 |
| 1 | 171 | 498 |
| 2 | 501 | 448 |
| 3 | 1692 | 434 |
| 4 | 4528 | 432 |
| 5 | 4729 | 432 |
| 6 | 5008 | 432 |
7 alts after pruning (6 fresh and 1 done)
| | Pruned | Kept | Total |
|---|---|---|---|
| New | 41 | 6 | 47 |
| Fresh | 0 | 0 | 0 |
| Picked | 0 | 1 | 1 |
| Done | 0 | 0 | 0 |
| Total | 41 | 7 | 48 |
| Status | Error | Program |
|---|---|---|
| | 8.8b | (pow.f32 (cbrt.f32 (*.f32 (*.f32 alpha alpha) (log1p.f32 u0))) 3) |
| ✓ | 0.3b | (*.f32 (*.f32 alpha (neg.f32 alpha)) (log1p.f32 (neg.f32 u0))) |
| ▶ | 2.2b | (*.f32 (*.f32 alpha alpha) (+.f32 (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 u0 (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2))))) |
| | 2.3b | (+.f32 (*.f32 1/3 (*.f32 (pow.f32 u0 3) (pow.f32 alpha 2))) (+.f32 (*.f32 1/2 (*.f32 (pow.f32 u0 2) (pow.f32 alpha 2))) (+.f32 (*.f32 1/4 (*.f32 (pow.f32 u0 4) (pow.f32 alpha 2))) (*.f32 u0 (pow.f32 alpha 2))))) |
| | 4.2b | (*.f32 alpha (*.f32 alpha (fma.f32 1/2 (*.f32 u0 u0) u0))) |
| | 9.4b | (exp.f32 (log.f32 (*.f32 (*.f32 alpha alpha) (log1p.f32 u0)))) |
| | 8.7b | (pow.f32 (*.f32 alpha (sqrt.f32 (log1p.f32 u0))) 2) |
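The ▶ alternative in the table above appears to be the degree-4 Taylor truncation of the ✓ program, since -log1p(-u0) = u0 + u0^2/2 + u0^3/3 + u0^4/4 + ... for |u0| < 1. The following is a minimal sketch comparing the two expressions. It is not part of the Herbie run: it uses Python double precision rather than binary32, writes the fma as a plain multiply-add, and the function names `reference`, `series4`, and `relative_gap` are hypothetical, so the numbers it prints are not the "b" error figures reported above.

```python
import math


def reference(alpha: float, u0: float) -> float:
    # (*.f32 (*.f32 alpha (neg.f32 alpha)) (log1p.f32 (neg.f32 u0)))
    return (alpha * -alpha) * math.log1p(-u0)


def series4(alpha: float, u0: float) -> float:
    # (*.f32 (*.f32 alpha alpha)
    #        (+.f32 (fma.f32 1/4 (pow.f32 u0 4) u0)
    #               (*.f32 u0 (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2)))))
    # Assumption: fma is evaluated as an ordinary multiply followed by an add.
    head = 0.25 * u0**4 + u0
    tail = u0 * (u0 * (u0 * (1.0 / 3.0) + 0.5))
    return (alpha * alpha) * (head + tail)


def relative_gap(alpha: float, u0: float) -> float:
    # Relative difference between the truncated series and the log1p form.
    ref = reference(alpha, u0)
    return abs(series4(alpha, u0) - ref) / abs(ref)


if __name__ == "__main__":
    for u0 in (0.01, 0.5, 0.9):
        print(u0, relative_gap(1.0, u0))
```

The gap grows as u0 approaches 1, where the dropped terms of the series are no longer negligible. This is consistent with the series alternative carrying more error than the log1p form in the table, though the sketch does not reproduce Herbie's sampling or error metric.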
Compiled 890 to 548 computations (38.4% saved)
Found 4 expressions with local error:
| New | Error | Program |
|---|---|---|
| ✓ | 0.2b | (*.f32 u0 (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2))) |
| ✓ | 0.3b | (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2)) |
| ✓ | 0.3b | (*.f32 (*.f32 alpha alpha) (+.f32 (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 u0 (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2))))) |
| ✓ | 0.4b | (*.f32 u0 1/3) |
4 calls:
| 35.0ms | (*.f32 (*.f32 alpha alpha) (+.f32 (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 u0 (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2))))) |
| 3.0ms | (*.f32 u0 (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2))) |
| 2.0ms | (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2)) |
| 2.0ms | (*.f32 u0 1/3) |
| 1× | batch-egg-rewrite |
| 300× | expm1-udef_binary32 |
| 300× | log1p-udef_binary32 |
| 171× | add-sqr-sqrt_binary32 |
| 170× | log1p-expm1-u_binary32 |
| 170× | expm1-log1p-u_binary32 |
4 calls:
| 75.0ms | (*.f32 u0 (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2))) |
| 75.0ms | (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2)) |
| 75.0ms | (*.f32 (*.f32 alpha alpha) (+.f32 (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 u0 (*.f32 u0 (+.f32 (*.f32 u0 1/3) 1/2))))) |
| 75.0ms | (*.f32 u0 1/3) |
Useful iterations: 1 (0.0ms)
| Iter | Nodes | Cost |
|---|---|---|
| 0 | 16 | 50 |
| 1 | 362 | 48 |
| 2 | 4325 | 48 |
| 3 | 5043 | 48 |
| 1× | egg-herbie |
| 777× | *-commutative_binary32 |
| 694× | unswap-sqr_binary32 |
| 675× | sqr-pow_binary32 |
| 585× | distribute-rgt-out_binary32 |
| 363× | associate-+l+_binary32 |
Useful iterations: 3 (0.0ms)
| Iter | Nodes | Cost |
|---|---|---|
| 0 | 40 | 978 |
| 1 | 107 | 946 |
| 2 | 302 | 890 |
| 3 | 992 | 871 |
| 4 | 1631 | 871 |
| 5 | 2341 | 871 |
| 6 | 3073 | 871 |
| 7 | 3059 | 871 |
| 8 | 3150 | 871 |
| 9 | 3263 | 871 |
| 10 | 3405 | 871 |
| 11 | 3242 | 871 |
| 12 | 3420 | 871 |
| 13 | 3726 | 871 |
| 14 | 4024 | 871 |
| 15 | 4436 | 871 |
| 16 | 4912 | 871 |
| 17 | 4908 | 871 |
6 alts after pruning (5 fresh and 1 done)
| | Pruned | Kept | Total |
|---|---|---|---|
| New | 112 | 5 | 117 |
| Fresh | 5 | 0 | 5 |
| Picked | 1 | 0 | 1 |
| Done | 0 | 1 | 1 |
| Total | 118 | 6 | 124 |
| Status | Error | Program |
|---|---|---|
| | 2.4b | (/.f32 (*.f32 (+.f32 (pow.f32 (fma.f32 1/4 (pow.f32 u0 4) u0) 3) (pow.f32 (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) 3)) (*.f32 alpha alpha)) (+.f32 (pow.f32 (fma.f32 1/4 (pow.f32 u0 4) u0) 2) (*.f32 (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) (-.f32 (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) (fma.f32 1/4 (pow.f32 u0 4) u0))))) |
| | 3.0b | (*.f32 u0 (*.f32 (*.f32 alpha alpha) (+.f32 (*.f32 u0 (+.f32 (*.f32 1/3 u0) 1/2)) 1))) |
| ✓ | 0.3b | (*.f32 (*.f32 alpha (neg.f32 alpha)) (log1p.f32 (neg.f32 u0))) |
| ▶ | 2.2b | (fma.f32 (*.f32 alpha alpha) (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 (*.f32 alpha alpha) (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))))) |
| | 2.5b | (pow.f32 (*.f32 alpha (sqrt.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)))) 2) |
| | 2.2b | (sqrt.f32 (*.f32 (pow.f32 alpha 4) (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2))) |
Compiled 3090 to 1951 computations (36.9% saved)
Found 4 expressions with local error:
| New | Error | Program |
|---|---|---|
| ✓ | 0.2b | (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) |
| ✓ | 0.3b | (fma.f32 (*.f32 alpha alpha) (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 (*.f32 alpha alpha) (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))))) |
| ✓ | 0.3b | (*.f32 u0 (fma.f32 u0 1/3 1/2)) |
| ✓ | 0.3b | (*.f32 (*.f32 alpha alpha) (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)))) |
4 calls:
| 33.0ms | (fma.f32 (*.f32 alpha alpha) (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 (*.f32 alpha alpha) (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))))) |
| 24.0ms | (*.f32 (*.f32 alpha alpha) (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)))) |
| 3.0ms | (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) |
| 2.0ms | (*.f32 u0 (fma.f32 u0 1/3 1/2)) |
| 1× | batch-egg-rewrite |
| 371× | prod-diff_binary32 |
| 311× | fma-udef_binary32 |
| 255× | expm1-udef_binary32 |
| 255× | log1p-udef_binary32 |
| 214× | fma-neg_binary32 |
4 calls:
| 126.0ms | (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) |
| 126.0ms | (fma.f32 (*.f32 alpha alpha) (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 (*.f32 alpha alpha) (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))))) |
| 126.0ms | (*.f32 u0 (fma.f32 u0 1/3 1/2)) |
| 126.0ms | (*.f32 (*.f32 alpha alpha) (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)))) |
Useful iterations: 1 (0.0ms)
| Iter | Nodes | Cost |
|---|---|---|
| 0 | 15 | 67 |
| 1 | 312 | 62 |
| 2 | 3344 | 62 |
| 3 | 4896 | 62 |
| 4 | 4909 | 62 |
| 5 | 4837 | 62 |
| 1× | egg-herbie |
| 694× | unswap-sqr_binary32 |
| 507× | associate-+l+_binary32 |
| 443× | associate-+r+_binary32 |
| 405× | distribute-rgt-out_binary32 |
| 342× | *-commutative_binary32 |
Useful iterations: 3 (0.0ms)
| Iter | Nodes | Cost |
|---|---|---|
| 0 | 44 | 1326 |
| 1 | 118 | 1247 |
| 2 | 338 | 1166 |
| 3 | 1121 | 1164 |
| 4 | 1844 | 1164 |
| 5 | 3516 | 1164 |
| 6 | 4633 | 1164 |
| 7 | 4747 | 1164 |
| 8 | 4836 | 1164 |
| 9 | 4949 | 1164 |
| 10 | 4920 | 1164 |
7 alts after pruning (5 fresh and 2 done)
| | Pruned | Kept | Total |
|---|---|---|---|
| New | 105 | 1 | 106 |
| Fresh | 0 | 4 | 4 |
| Picked | 0 | 1 | 1 |
| Done | 0 | 1 | 1 |
| Total | 105 | 7 | 112 |
| Status | Error | Program |
|---|---|---|
| | 2.4b | (/.f32 (*.f32 (+.f32 (pow.f32 (fma.f32 1/4 (pow.f32 u0 4) u0) 3) (pow.f32 (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) 3)) (*.f32 alpha alpha)) (+.f32 (pow.f32 (fma.f32 1/4 (pow.f32 u0 4) u0) 2) (*.f32 (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) (-.f32 (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) (fma.f32 1/4 (pow.f32 u0 4) u0))))) |
| | 3.0b | (*.f32 u0 (*.f32 (*.f32 alpha alpha) (+.f32 (*.f32 u0 (+.f32 (*.f32 1/3 u0) 1/2)) 1))) |
| ✓ | 0.3b | (*.f32 (*.f32 alpha (neg.f32 alpha)) (log1p.f32 (neg.f32 u0))) |
| ✓ | 2.2b | (fma.f32 (*.f32 alpha alpha) (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 (*.f32 alpha alpha) (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))))) |
| | 5.0b | (fma.f32 (*.f32 alpha alpha) (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 (*.f32 alpha alpha) (log.f32 (pow.f32 (exp.f32 u0) (*.f32 u0 (fma.f32 u0 1/3 1/2)))))) |
| | 2.5b | (pow.f32 (*.f32 alpha (sqrt.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)))) 2) |
| ▶ | 2.2b | (sqrt.f32 (*.f32 (pow.f32 alpha 4) (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2))) |
Compiled 2860 to 1752 computations (38.7% saved)
Found 4 expressions with local error:
| New | Error | Program |
|---|---|---|
| ✓ | 0.1b | (sqrt.f32 (*.f32 (pow.f32 alpha 4) (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2))) |
| | 0.3b | (*.f32 u0 (fma.f32 u0 1/3 1/2)) |
| ✓ | 0.4b | (*.f32 (pow.f32 alpha 4) (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2)) |
| ✓ | 0.4b | (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2) |
3 calls:
| 56.0ms | (*.f32 (pow.f32 alpha 4) (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2)) |
| 32.0ms | (sqrt.f32 (*.f32 (pow.f32 alpha 4) (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2))) |
| 3.0ms | (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2) |
| 1× | batch-egg-rewrite |
| 272× | expm1-udef_binary32 |
| 272× | log1p-udef_binary32 |
| 246× | log-pow_binary32 |
| 155× | add-sqr-sqrt_binary32 |
| 155× | log1p-expm1-u_binary32 |
3 calls:
| 105.0ms | (sqrt.f32 (*.f32 (pow.f32 alpha 4) (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2))) |
| 105.0ms | (*.f32 (pow.f32 alpha 4) (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2)) |
| 105.0ms | (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2) |
Useful iterations: 1 (0.0ms)
| Iter | Nodes | Cost |
|---|---|---|
| 0 | 17 | 80 |
| 1 | 350 | 75 |
| 2 | 3453 | 75 |
| 3 | 4922 | 75 |
| 4 | 4807 | 75 |
| 1× | egg-herbie |
| 828× | fma-def_binary32 |
| 601× | distribute-rgt-out_binary32 |
| 570× | associate-*l*_binary32 |
| 437× | associate-*r*_binary32 |
| 304× | associate-+l+_binary32 |
Useful iterations: 3 (0.0ms)
| Iter | Nodes | Cost |
|---|---|---|
| 0 | 89 | 1618 |
| 1 | 246 | 1530 |
| 2 | 790 | 1430 |
| 3 | 3327 | 1428 |
| 4 | 4965 | 1428 |
| 5 | 4955 | 1428 |
7 alts after pruning (4 fresh and 3 done)
| | Pruned | Kept | Total |
|---|---|---|---|
| New | 135 | 0 | 135 |
| Fresh | 0 | 4 | 4 |
| Picked | 0 | 1 | 1 |
| Done | 0 | 2 | 2 |
| Total | 135 | 7 | 142 |
| Status | Error | Program |
|---|---|---|
| | 2.4b | (/.f32 (*.f32 (+.f32 (pow.f32 (fma.f32 1/4 (pow.f32 u0 4) u0) 3) (pow.f32 (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) 3)) (*.f32 alpha alpha)) (+.f32 (pow.f32 (fma.f32 1/4 (pow.f32 u0 4) u0) 2) (*.f32 (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) (-.f32 (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))) (fma.f32 1/4 (pow.f32 u0 4) u0))))) |
| | 3.0b | (*.f32 u0 (*.f32 (*.f32 alpha alpha) (+.f32 (*.f32 u0 (+.f32 (*.f32 1/3 u0) 1/2)) 1))) |
| ✓ | 0.3b | (*.f32 (*.f32 alpha (neg.f32 alpha)) (log1p.f32 (neg.f32 u0))) |
| ✓ | 2.2b | (fma.f32 (*.f32 alpha alpha) (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 (*.f32 alpha alpha) (*.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2))))) |
| | 5.0b | (fma.f32 (*.f32 alpha alpha) (fma.f32 1/4 (pow.f32 u0 4) u0) (*.f32 (*.f32 alpha alpha) (log.f32 (pow.f32 (exp.f32 u0) (*.f32 u0 (fma.f32 u0 1/3 1/2)))))) |
| | 2.5b | (pow.f32 (*.f32 alpha (sqrt.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)))) 2) |
| ✓ | 2.2b | (sqrt.f32 (*.f32 (pow.f32 alpha 4) (pow.f32 (fma.f32 u0 (*.f32 u0 (fma.f32 u0 1/3 1/2)) (fma.f32 1/4 (pow.f32 u0 4) u0)) 2))) |
Compiled 3392 to 2427 computations (28.4% saved)
Total 0.2b remaining (75.3%)
Threshold costs 0.2b (75.3%)
Compiled 11786 to 8168 computations (30.7% saved)
| 1× | egg-herbie |
| 8× | *-commutative_binary32 |
| 6× | neg-sub0_binary32 |
| 6× | neg-mul-1_binary32 |
| 5× | +-commutative_binary32 |
| 5× | sub-neg_binary32 |
Useful iterations: 0 (0.0ms)
| Iter | Nodes | Cost |
|---|---|---|
| 0 | 8 | 13 |
| 1 | 18 | 13 |
| 2 | 28 | 13 |
| 3 | 36 | 13 |
| 4 | 41 | 13 |
| 5 | 44 | 13 |
| 6 | 45 | 13 |
| 7 | 45 | 13 |
Compiled 200 to 137 computations (31.5% saved)