Global linear-scan register allocation for ZJIT SSA values by Copilot · Pull Request #7 · tekknolagi/ruby

Copilot · 2026-02-07T03:17:01Z

ZJIT’s register allocator was block-local, so an SSA value defined in one block could not stay live into the blocks it dominates, even though the backend already has a CFG. This change makes allocation global by computing CFG liveness and live intervals, and it preserves the existing split and out-of-SSA passes.

Global liveness + intervals
- Add CFG live-in/live-out dataflow and derive VReg live ranges across block boundaries.
- Extend BitSet with union/subtract/iteration utilities to support the analysis.
Out-of-SSA parameter handling
- Materialize explicit VReg block parameters for non-entry blocks.
- Use per-block parameters in branch ParallelMov expansion.
Validation + tests
- Drop block-local SSA validation from validate() while keeping the helper for tests.
- Add dominance-friendly validation and a cross-block allocator smoke test.
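
As a rough illustration of the live-in/live-out dataflow described above, here is a minimal backward fixed-point sketch over a CFG. The `BitSet`, `Block`, and `liveness` names are hypothetical and are not taken from the ZJIT sources; the sketch assumes each block already knows which VRegs it defines and which it reads before defining.

```rust
/// Hypothetical bit set over VReg indices; not ZJIT's actual BitSet.
#[derive(Clone, PartialEq, Debug)]
struct BitSet {
    words: Vec<u64>,
}

impl BitSet {
    fn new(nbits: usize) -> Self {
        BitSet { words: vec![0; (nbits + 63) / 64] }
    }
    fn insert(&mut self, i: usize) {
        self.words[i / 64] |= 1u64 << (i % 64);
    }
    fn contains(&self, i: usize) -> bool {
        self.words[i / 64] & (1u64 << (i % 64)) != 0
    }
    /// self = self | other
    fn union_with(&mut self, other: &BitSet) {
        for (a, b) in self.words.iter_mut().zip(&other.words) {
            *a |= *b;
        }
    }
    /// self = self & !other
    fn subtract(&mut self, other: &BitSet) {
        for (a, b) in self.words.iter_mut().zip(&other.words) {
            *a &= !*b;
        }
    }
    /// Iterate over the indices of the set bits in increasing order.
    fn iter(&self) -> impl Iterator<Item = usize> + '_ {
        (0..self.words.len() * 64).filter(move |&i| self.contains(i))
    }
}

/// Hypothetical per-block summary used by the analysis.
struct Block {
    defs: BitSet,      // VRegs defined in this block
    uses: BitSet,      // VRegs read before any definition in this block
    succs: Vec<usize>, // indices of successor blocks
}

/// Returns (live_in, live_out) for every block.
fn liveness(blocks: &[Block], nvregs: usize) -> (Vec<BitSet>, Vec<BitSet>) {
    let mut live_in = vec![BitSet::new(nvregs); blocks.len()];
    let mut live_out = vec![BitSet::new(nvregs); blocks.len()];
    let mut changed = true;
    while changed {
        changed = false;
        // Visiting blocks in reverse order usually converges faster for a backward problem.
        for b in (0..blocks.len()).rev() {
            // live_out[b] = union of live_in[s] over all successors s
            let mut out = BitSet::new(nvregs);
            for &s in &blocks[b].succs {
                out.union_with(&live_in[s]);
            }
            // live_in[b] = uses[b] | (live_out[b] - defs[b])
            let mut inn = out.clone();
            inn.subtract(&blocks[b].defs);
            inn.union_with(&blocks[b].uses);
            if inn != live_in[b] || out != live_out[b] {
                changed = true;
            }
            live_in[b] = inn;
            live_out[b] = out;
        }
    }
    (live_in, live_out)
}
```

With three straight-line blocks where B0 defines v0 and v1, B1 uses v0 and defines v2, and B2 uses v1 and v2, v1 ends up in both the live-in and live-out sets of B1. That is the cross-block case a block-local allocator cannot represent.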

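```rust
// Generate the operand for parameter `idx` of the current block.
let param_opnd = gen_param(&mut asm, idx, is_entry_block);
// Non-entry blocks record the operand as an explicit block parameter.
if !is_entry_block {
    asm.current_block().add_parameter(param_opnd);
}
// Remember which operand holds this instruction's value.
jit.opnds[insn_id.0] = Some(param_opnd);
```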
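
When a branch passes arguments to a target block's parameters, the moves from argument locations to parameter locations have to behave as if they happened simultaneously. The sketch below shows one common way to turn such a parallel move into an ordinary sequence using a scratch location to break cycles. It is an illustration only: `Loc` and `sequentialize` are hypothetical names, it assumes destinations are distinct and never the scratch location, and it is not ZJIT's actual ParallelMov expansion.

```rust
/// Hypothetical location type for the sketch.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Loc {
    Reg(u8),
    Scratch, // assumed to be reserved and never used as a parameter location
}

/// Turn a parallel move, given as (dst, src) pairs, into a sequence of
/// ordinary moves that has the same effect when executed in order.
fn sequentialize(mut pending: Vec<(Loc, Loc)>) -> Vec<(Loc, Loc)> {
    let mut out = Vec::new();
    // Moves from a location to itself are no-ops.
    pending.retain(|(dst, src)| dst != src);
    while !pending.is_empty() {
        // A move is safe to emit when no pending move still reads its destination.
        let safe = (0..pending.len()).find(|&i| {
            let dst = pending[i].0;
            pending.iter().all(|&(_, src)| src != dst)
        });
        if let Some(i) = safe {
            out.push(pending.remove(i));
        } else {
            // Only cycles remain: save one source in the scratch location
            // and make its readers read the scratch location instead.
            let (_, src) = pending[0];
            out.push((Loc::Scratch, src));
            for m in pending.iter_mut() {
                if m.1 == src {
                    m.1 = Loc::Scratch;
                }
            }
        }
    }
    out
}
```

A swap of two registers, for example, comes out as three moves through the scratch location, while moves that form no cycle are simply reordered so that no value is overwritten before it is read.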


Co-authored-by: tekknolagi <401167+tekknolagi@users.noreply.github.com>

tekknolagi · 2026-02-07T05:00:22Z

gave you more internet access. have fun

Co-authored-by: tekknolagi <401167+tekknolagi@users.noreply.github.com>

Format differences from capstone:
- movz/movn shown instead of mov alias
- Immediates in hex (#0x7 vs #7)
- Branch targets as relative decimal (+8 vs #0x10)
- Condition on mnemonic (bne vs b.ne)
- ldr/str with explicit #0 offset instead of ldur/stur
- 32-bit ops use mnemonic suffix (addw) instead of w-prefix registers
- Embedded data bytes show as "unknown" instead of fake instructions

- Small values (< 10) use decimal: #7, #0
- Larger values use hex: #0x20, #0x1000
- Signed negatives: #-8, #-0x10
- Branch conditions use b.cond format: b.ne, b.eq
- Branch targets as absolute hex: #0x400
- Memory offsets use same decimal/hex convention
- movk shift uses comma separator: , lsl #16
- All immediates have # prefix
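
The immediate convention in the list above can be sketched as a small formatting helper. `fmt_imm` is a hypothetical name, not a function from the disassembler in question, and the sketch assumes that the "< 10" rule applies to the magnitude of the value, which is what the `#-8` and `#-0x10` examples suggest.

```rust
/// Format an immediate with a `#` prefix: magnitudes below 10 in decimal,
/// larger magnitudes in hex, and the sign placed before the digits.
fn fmt_imm(value: i64) -> String {
    let sign = if value < 0 { "-" } else { "" };
    // unsigned_abs avoids overflow for i64::MIN.
    let mag = value.unsigned_abs();
    if mag < 10 {
        format!("#{sign}{mag}")
    } else {
        // `#x` prints lowercase hex with a leading `0x`.
        format!("#{sign}{mag:#x}")
    }
}
```

Under that assumption, `fmt_imm(7)` gives `#7`, `fmt_imm(0x20)` gives `#0x20`, and `fmt_imm(-16)` gives `#-0x10`, matching the examples in the list.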

Move compilation steps from the heaviest jobs to the lightest to reduce the critical path of the Compilations workflow.

Before: jobs ranged from 13-41 min (compile#12 had 4 steps, compile#3 had 10 clang versions).
After: jobs range from 7-9 steps each (excluding compile#1 which has the LTO build), bringing the estimated critical path from ~41 min to ~30 min.

Moves:
- clang 23, 22, 21 from #3 to #12 and #10
- GCC 8, 7 from #2 to #12
- `OPT_THREADED_CODE=1`, `OPT_THREADED_CODE=2` from #7 to #10

…ruby#17479)

When we introduced the inliner we also added repeated passes of the optimization pipeline. The idea being that we want to optimize the results of inlining and, because we only inline one level deep, allow us to perform inlining on the result of the last inlining operation. The optimization loop would exit if we couldn't inline any more. If we could inline more, there's an upper bound that kicks us out of the loop so we don't try to inline the world. However, if we exited the loop by hitting that upper bound, we didn't end up specializing the results of the last inlining pass. This PR rectifies that.

This is immediately visible in the 30k_methods benchmark, where performance roughly doubles.

Before:

```
❯ WARMUP_ITRS=0 MIN_BENCH_ITRS=10 MIN_BENCH_TIME=0 ./run_benchmarks.rb --chruby 'ruby-master --zjit-inline-threshold=30' 30k_methods
Running benchmark "30k_methods" (1/1)
+ /Users/nirvdrum/.rubies/ruby-master/bin/ruby --zjit-inline-threshold\=30 -I harness /Users/nirvdrum/dev/worktrees/ruby-bench/main/benchmarks/30k_methods.rb
ruby 4.1.0dev (2026-06-23T13:29:36Z master 13fe77d) +ZJIT dev +PRISM [arm64-darwin25]
itr: time
#1: 2689ms
#2: 33ms
#3: 32ms
#4: 32ms
#5: 32ms
#6: 32ms
#7: 32ms
#8: 35ms
#9: 33ms
#10: 33ms
```

After:

```
❯ WARMUP_ITRS=0 MIN_BENCH_ITRS=10 MIN_BENCH_TIME=0 ./run_benchmarks.rb --chruby 'ruby-zjit-opt-last-inline --zjit-inline-threshold=30' 30k_methods
Running benchmark "30k_methods" (1/1)
+ /Users/nirvdrum/.rubies/ruby-zjit-opt-last-inline/bin/ruby --zjit-inline-threshold\=30 -I harness /Users/nirvdrum/dev/worktrees/ruby-bench/main/benchmarks/30k_methods.rb
ruby 4.1.0dev (2026-06-25T13:56:41Z zjit-opt-last-inline 18ce64d) +ZJIT dev +PRISM [arm64-darwin25]
itr: time
#1: 2700ms
#2: 17ms
#3: 16ms
#4: 16ms
#5: 17ms
#6: 16ms
#7: 16ms
#8: 17ms
#9: 16ms
#10: 16ms
```

Fixes Shopify#998.
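
The control-flow issue described in that commit message can be modelled with a toy loop. Everything here is hypothetical (`Func`, `inline`, `specialize`, `optimize`, and the round limit are stand-ins, not ZJIT's optimizer); the sketch only shows why leaving the loop through the upper bound needs one more specialization pass.

```rust
/// Toy stand-in for a function being optimized.
struct Func {
    pending_inlines: u32, // call sites that could still be inlined
    unspecialized: bool,  // true when the body contains code not yet specialized
}

/// Inline one level; returns true if anything was inlined.
fn inline(f: &mut Func) -> bool {
    if f.pending_inlines == 0 {
        return false;
    }
    f.pending_inlines -= 1;
    f.unspecialized = true; // freshly inlined code has not been specialized yet
    true
}

/// Stand-in for the specialization passes.
fn specialize(f: &mut Func) {
    f.unspecialized = false;
}

fn optimize(f: &mut Func, max_rounds: u32) {
    for _ in 0..max_rounds {
        specialize(f);
        if !inline(f) {
            return; // nothing new was inlined, so the body is already specialized
        }
    }
    // The loop ended by hitting the bound, so the result of the last
    // inlining pass still needs to be specialized.
    specialize(f);
}
```

Without the final `specialize` call, a function with more inlining candidates than `max_rounds` would leave `optimize` with `unspecialized` still set.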

Initial plan

9d94592

Copilot AI assigned Copilot and tekknolagi Feb 7, 2026

Copilot started work on behalf of tekknolagi February 7, 2026 03:17

Implement global register allocator for zjit

c1ae8c3

Co-authored-by: tekknolagi <401167+tekknolagi@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Update register allocator to support global allocation~~ Implement global liveness-based register allocation for ZJIT Feb 7, 2026

Copilot AI requested a review from tekknolagi February 7, 2026 04:10

Copilot finished work on behalf of tekknolagi February 7, 2026 04:10

Copilot started work on behalf of tekknolagi February 7, 2026 05:00

Expose live range computation to backends

24a44d7

Co-authored-by: tekknolagi <401167+tekknolagi@users.noreply.github.com>

Copilot AI changed the title ~~Implement global liveness-based register allocation for ZJIT~~ Global linear-scan register allocation for ZJIT Feb 7, 2026

Copilot finished work on behalf of tekknolagi February 7, 2026 05:06

Copilot started work on behalf of tekknolagi February 7, 2026 05:35

Fix zjit test label usage

10116a7

Co-authored-by: tekknolagi <401167+tekknolagi@users.noreply.github.com>

Copilot AI changed the title ~~Global linear-scan register allocation for ZJIT~~ Global linear-scan register allocation for ZJIT SSA values Feb 7, 2026

Copilot finished work on behalf of tekknolagi February 7, 2026 05:50

Copilot wants to merge 4 commits into master from copilot/global-register-allocator
