Feature or enhancement
Currently we lack an effective peephole optimizer for uops traces.
We do some ad-hoc removal of LOAD/POP pairs, but many optimization opportunities are missed.
For example, this silly program:
def add(a, b):
    return a + b

def loop(n=100_000):
    t = 0
    for _ in range(n):
        t += add(1, 2)

if __name__ == "__main__":
    loop()
generates a trace that contains the sequence:
_LOAD_FAST_BORROW 0
_LOAD_FAST_BORROW_1
_LOAD_CONST_INLINE_BORROW K
_RROT_3
_POP_TOP_NOP
_POP_TOP_NOP
which should be reduced to
_LOAD_CONST_INLINE_BORROW K
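
To make the intended reduction concrete, here is a minimal sketch of a rule-based peephole pass in Python. It is not the CPython optimizer (which is written in C) and does not reflect its data structures: the `(name, operand)` tuple representation, the `PURE_PUSHES` and `POPS` sets, and the `peephole` function are all hypothetical. It also assumes that `_RROT_3` moves the top stack item below the next two, and that the listed load uops are side-effect-free pushes that can be reordered or dropped.

```python
# Hypothetical model of a uop trace: a list of (name, operand) tuples.
# Assumption: these uops only push a borrowed value and have no side effects.
PURE_PUSHES = {
    "_LOAD_FAST_BORROW",
    "_LOAD_FAST_BORROW_1",
    "_LOAD_CONST_INLINE_BORROW",
}
# Assumption: these uops discard the top of stack and do nothing else.
POPS = {"_POP_TOP_NOP"}


def is_pure_push(uop):
    return uop[0] in PURE_PUSHES


def peephole(trace):
    """Rewrite the trace using two local rules, applied until neither matches."""
    out = []
    for uop in trace:
        out.append(uop)
        changed = True
        while changed:
            changed = False
            if (
                len(out) >= 4
                and out[-1][0] == "_RROT_3"
                and all(is_pure_push(u) for u in out[-4:-1])
            ):
                # Rule 1: push a; push b; push c; _RROT_3
                #      -> push c; push a; push b
                # (assumes _RROT_3 turns stack [a, b, c] into [c, a, b])
                a, b, c = out[-4:-1]
                out[-4:] = [c, a, b]
                changed = True
            elif (
                len(out) >= 2
                and out[-1][0] in POPS
                and is_pure_push(out[-2])
            ):
                # Rule 2: a pure push immediately followed by a pop cancels out.
                del out[-2:]
                changed = True
    return out


trace = [
    ("_LOAD_FAST_BORROW", 0),
    ("_LOAD_FAST_BORROW_1", None),
    ("_LOAD_CONST_INLINE_BORROW", "K"),
    ("_RROT_3", None),
    ("_POP_TOP_NOP", None),
    ("_POP_TOP_NOP", None),
]
print(peephole(trace))
```

Under those assumptions, the rotation rule first moves the constant load to the front, after which each `_POP_TOP_NOP` cancels against the load directly before it, leaving only `_LOAD_CONST_INLINE_BORROW K`. A real implementation would need to derive which uops are pure from the uop definitions rather than from a hand-written set.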