Cranelift: remove de-morgan mid-end rules (speedup up to 39.64%) #12127
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This resolves #12106.
The following table shows the data backing this patch, where the removal of the canonicalizing de-morgan rules boosts execution and compilation performance for
keccakbenchmark. This is primarily because it confuses our e-graph based optimizer which selects the fastest expression among equivalent expressions collected by performing (bounded) equality saturation using the rules defined in ISLE. The selection happens during elaboration on each basic block where the cost model evaluates each expression by accumulating costs of instructions in each basic block. Each instruction has an assigned cost for its opcode. This can be confirmed by the following codes:wasmtime/cranelift/codegen/src/egraph/elaborate.rs
Lines 222 to 354 in 9dc6a7d
wasmtime/cranelift/codegen/src/egraph/cost.rs
Lines 73 to 173 in 9dc6a7d
However, in a large sequential function with only a single block, the cost can saturate to infinity, where our cost model cannot decide the fastest expression, ending up selecting suboptimal expressions.