Skip to content

Conversation

@ShaharNaveh
Copy link
Collaborator

@ShaharNaveh ShaharNaveh commented Nov 28, 2025

Implementing python/cpython#30902

Summary by CodeRabbit

  • Refactor
    • Optimized internal bytecode instruction set by consolidating stack manipulation operations, replacing redundant instruction variants with more efficient alternatives. These changes streamline compiler code generation and virtual machine execution while maintaining identical external behavior.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 28, 2025

Walkthrough

This PR removes four deprecated bytecode instruction variants (Rotate2, Rotate3, Duplicate, Duplicate2) from the instruction set and replaces all usages with newer equivalents (CopyItem and Swap). Changes span the code generator, bytecode definition, JIT testing, and VM frame execution layers.

Changes

Cohort / File(s) Summary
Bytecode instruction set consolidation
crates/compiler-core/src/bytecode.rs
Removed four instruction enum variants: Rotate2, Rotate3, Duplicate, Duplicate2. Updated stack effects and disassembly logic to remove handling for removed opcodes.
Code generator instruction replacement
crates/codegen/src/compile.rs
Replaced all occurrences of Instruction::Duplicate with Instruction::CopyItem (various indices) and Rotate2/Rotate3 with Swap instructions across function definitions, type parameters, pattern matching, class/closure setup, and try/except branches.
JIT and VM execution cleanup
crates/jit/tests/common.rs, crates/vm/src/frame.rs
Removed StackMachine handlers for Duplicate, Rotate2, Rotate3 (now unimplemented); commented out ExecutingFrame dispatch block for ToBool, Duplicate, Duplicate2; removed execute_rotate helper function.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45–75 minutes

  • Public API removal: Four instruction enum variants are no longer available; verify all downstream consumers and external references are accounted for.
  • Widespread internal replacement: Code generator changes span multiple areas (function compilation, pattern matching, class/type-parameter handling); each cluster requires verification that the CopyItem/Swap substitutions maintain equivalent stack semantics.
  • Runtime implications: Removed handlers in JIT test harness and VM frame will now panic on these opcodes; confirm no execution path can emit the old instructions.
  • Cross-crate consistency: Changes affect four separate crates; ensure bytecode definition, code generation, and execution all remain in sync.

Possibly related PRs

  • Refactor compile_function #6001: Modifies similar code paths in compiler/codegen/src/compile.rs for function/class/type-parameter compilation and stack-manipulation instruction emission sequences, likely a precursor or related refactoring of instruction semantics.

Poem

🐰 Four old opcodes hop away,
CopyItem and Swap take their place today!
Stack semantics, lean and bright,
The bytecode dances—what a sight! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 69.23% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title directly and concisely describes the main change: removing Rotate* and Duplicate* instructions from the codebase, which is reflected across all modified files (bytecode.rs, compile.rs, frame.rs, jit/tests/common.rs).
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e6e256 and bce1c66.

⛔ Files ignored due to path filters (1)
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (4)
  • crates/codegen/src/compile.rs (14 hunks)
  • crates/compiler-core/src/bytecode.rs (0 hunks)
  • crates/jit/tests/common.rs (0 hunks)
  • crates/vm/src/frame.rs (1 hunks)
💤 Files with no reviewable changes (2)
  • crates/compiler-core/src/bytecode.rs
  • crates/jit/tests/common.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmt to format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • crates/codegen/src/compile.rs
  • crates/vm/src/frame.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/codegen/src/compile.rs
  • crates/vm/src/frame.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
🔇 Additional comments (9)
crates/vm/src/frame.rs (2)

1080-1085: Good: ToBool handler correctly relocated.

The ToBool instruction handler is now active at this location with proper implementation (pop value, convert to bool, push result). The logic is sound.


722-731: I'll verify the review comment by checking the bytecode enum definition and frame.rs instruction handlers to confirm the concerns about removed instruction handlers.
<function_calls>

#!/bin/bash

Description: Verify bytecode enum variants and frame.rs handler consistency

echo "=== 1. Checking bytecode enum for Duplicate/Rotate variants ==="
fd bytecode.rs | head -5


</function_calls>

Let me locate the bytecode file first:
<function_calls>

#!/bin/bash

Find bytecode.rs file

fd -t f bytecode.rs


</function_calls>

Now let me search for the bytecode enum definition and check for the instruction variants:
<function_calls>

#!/bin/bash

Check bytecode enum for Duplicate, Duplicate2, Rotate variants

echo "=== Checking for Duplicate/Rotate variants in bytecode enum ==="
rg "enum Instruction" -A 100 | grep -E "(Duplicate|Rotate)" || echo "No matches for Duplicate/Rotate in enum"

echo -e "\n=== Frame.rs handler verification ==="

Check for CopyItem and Swap handlers

echo "Checking for CopyItem and Swap instruction handlers:"
rg "bytecode::Instruction::CopyItem|bytecode::Instruction::Swap" crates/vm/src/frame.rs

echo -e "\n=== Checking for catch-all match arm ==="
rg "match instruction" crates/vm/src/frame.rs -A 5 | head -20


</function_calls>

I need to execute the verification scripts to check the bytecode enum definition and frame.rs handlers. Let me do that:
<function_calls>

#!/bin/bash

First, find all bytecode-related files

echo "=== Finding bytecode files ==="
fd bytecode -t f

echo -e "\n=== Searching for enum Instruction definition ==="
rg "pub enum Instruction" -A 150 | head -200


</function_calls>

Let me try a more focused search:
<function_calls>

#!/bin/bash

echo "=== Search 1: Looking for instruction enum ==="
rg "enum Instruction" -t rust -A 2

echo -e "\n=== Search 2: Specific search for Duplicate variants ==="
rg "Duplicate|Rotate" -t rust -C 2

echo -e "\n=== Search 3: Check frame.rs around lines 722-731 ==="
sed -n '710,740p' crates/vm/src/frame.rs

echo -e "\n=== Search 4: Check frame.rs around lines 1080-1085 ==="
sed -n '1075,1090p' crates/vm/src/frame.rs


</function_calls>

Let me also verify the CopyItem and Swap handlers exist:
<function_calls>

#!/bin/bash

echo "=== Checking for CopyItem instruction handler ==="
rg "bytecode::Instruction::CopyItem" crates/vm/src/frame.rs -B 2 -A 15

echo -e "\n=== Checking for Swap instruction handler ==="
rg "bytecode::Instruction::Swap" crates/vm/src/frame.rs -B 2 -A 15

echo -e "\n=== Checking entire instruction match statement structure ==="
rg "match.*instruction" crates/vm/src/frame.rs -A 3


</function_calls>

crates/codegen/src/compile.rs (7)

1066-1067: LGTM: Correct replacement of Duplicate with CopyItem { index: 1 }.

The CopyItem { index: 1 } instruction correctly copies the top of stack, which matches the semantics of the removed Duplicate instruction.


4325-4326: Verify Duplicate2 replacement for augmented subscript assignment.

The pattern CopyItem { index: 2 } twice correctly replaces Duplicate2:

  • Stack before: [value, slice]
  • After first CopyItem { index: 2 }: [value, slice, value]
  • After second CopyItem { index: 2 }: [value, slice, value, slice]

This duplicates both operands for the subscript load while preserving the originals for the subsequent store operation.


4354-4360: LGTM: Correct Rotate3 replacement for augmented subscript store.

The combination of Swap { index: 3 } + Swap { index: 2 } correctly rearranges the stack:

  • Stack: [CONTAINER, SLICE, RESULT]
  • After Swap { index: 3 }: [RESULT, SLICE, CONTAINER]
  • After Swap { index: 2 }: [RESULT, CONTAINER, SLICE]

This correctly positions the operands for StoreSubscript. Similarly, the attribute case (line 4360) with Swap { index: 2 } correctly swaps [CONTAINER, RESULT] to [RESULT, CONTAINER] for StoreAttr.


2254-2260: LGTM: Docstring handling with CopyItem and Swap.

The pattern correctly duplicates the function object, loads the docstring, then swaps to position the docstring value for the StoreAttr operation:

  • After CopyItem { index: 1 }: [func, func]
  • After loading docstring: [func, func, doc]
  • After Swap { index: 2 }: [func, doc, func]
  • StoreAttr stores doc to func.__doc__, leaving [func]

1920-1921: LGTM: Type parameter compilation correctly uses CopyItem.

All three type parameter variants (TypeVar, ParamSpec, TypeVarTuple) consistently use CopyItem { index: 1 } to duplicate the type parameter value before storing it by name, ensuring it remains on the stack for building the final tuple.

Also applies to: 1946-1947, 1973-1974


3300-3303: Note: Pattern matching helper also uses the new CopyItem instruction.

This usage within pattern_helper_sequence_subscr is consistent with the rest of the changes and correctly duplicates the subject for element extraction.


4124-4125: I'll help you verify the chained comparison stack manipulation. Let me start by examining the instruction definitions and the context around the code in question.
<function_calls>


#!/bin/bash

Search for Swap and CopyItem instruction definitions

rg -n "CopyItem|Swap" crates/compiler-core/src/bytecode.rs -A 2 -B 1 | head -60


</function_calls>


#!/bin/bash

Get context around lines 4124-4125 in compile.rs

sed -n '4110,4140p' crates/codegen/src/compile.rs | cat -n


</function_calls>


#!/bin/bash

Search for Rotate2 and Duplicate instructions to understand the original operations

rg -n "Rotate2|Duplicate" crates/compiler-core/src/bytecode.rs -A 2 -B 1 | head -60


</function_calls>


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ShaharNaveh
Copy link
Collaborator Author

@youknowone When running this on my local machine I had to cleanup all *.pyc files before running some tests, as otherwise the VM crashes.
CPython doesn't have this issue as they are changing opcode IDs only for a version change (i.e. 3.12 -> 3.13). How do you think we should address this? Or it's not something worth investing time on as we don't change the opcodes often

@ShaharNaveh ShaharNaveh marked this pull request as ready for review November 28, 2025 15:18
Copy link
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@youknowone
Copy link
Member

CPython doesn't have this issue as they are changing opcode IDs only for a version change (i.e. 3.12 -> 3.13). How do you think we should address this? Or it's not something worth investing time on as we don't change the opcodes often

We'd better not to change it on same magic number, yep. But we didn't address it anywhere yet.

@youknowone youknowone merged commit fef9de2 into RustPython:main Nov 28, 2025
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants