-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Align f-string related bytecodes with 3.13
#6321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Align f-string related bytecodes with 3.13
#6321
Conversation
WalkthroughThis PR separates f-string conversion from formatting by introducing Changes
Sequence Diagram(s)sequenceDiagram
participant Compiler as Compiler (codegen)
participant Bytecode as Bytecode (instructions)
participant VM as VM (frame exec)
Note over Compiler,Bytecode: Compilation emits new instruction sequence for f-string
Compiler->>Bytecode: emit inner expression
Compiler->>Bytecode: emit ConvertValue { oparg }
alt format_spec present
Compiler->>Bytecode: emit FormatWithSpec (and spec elements)
else no format_spec
Compiler->>Bytecode: emit FormatSimple
end
Note over Bytecode,VM: Runtime executes emitted instructions
VM->>VM: evaluate inner expression (push value)
VM->>VM: ConvertValue (apply None/Str/Repr/Ascii) -> push converted value
VM->>VM: FormatSimple / FormatWithSpec -> format value (with or without spec) and push result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode.rs (1)
15-51: ConvertValueOparg and OpArgType mapping are coherent and safeThe new
ConvertValueOpargenum plus itsOpArgTypeimpl give an explicit, closed mapping 0→None, 1→Str, 2→Repr, 3→Ascii, which aligns with the intended FVC_* flags. This keeps decoding simple and matches howruff::ConversionFlagis bridged on the AST side.One minor polish:
fmt::Displaycurrently rendersNoneas an empty string, which makesCONVERT_VALUElines look a bit odd if aNoneoparg ever slipped through. Using something like"0 (none)"would be more self-describing, or asserting thatNoneis unreachable to catch bugs earlier.Also applies to: 53-65, 517-533
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snapis excluded by!**/*.snap
📒 Files selected for processing (4)
crates/codegen/src/compile.rs(3 hunks)crates/compiler-core/src/bytecode.rs(5 hunks)crates/vm/src/frame.rs(2 hunks)crates/vm/src/stdlib/ast/other.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ast/other.rscrates/compiler-core/src/bytecode.rscrates/codegen/src/compile.rscrates/vm/src/frame.rs
🧬 Code graph analysis (3)
crates/vm/src/stdlib/ast/other.rs (1)
crates/compiler-core/src/bytecode.rs (8)
from_op_arg(379-379)from_op_arg(386-388)from_op_arg(398-400)from_op_arg(501-503)from_op_arg(519-527)from_op_arg(979-981)from_op_arg(1343-1349)from_op_arg(1378-1381)
crates/codegen/src/compile.rs (1)
crates/stdlib/src/pystruct.rs (1)
format_spec(47-49)
crates/vm/src/frame.rs (1)
crates/vm/src/stdlib/builtins.rs (1)
ascii(68-72)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run rust tests (windows-latest)
🔇 Additional comments (6)
crates/vm/src/frame.rs (2)
1311-1327: New f-string formatting opcodes follow correct stack and formatting semantics
ConvertValuecleanly delegates toconvert_value, and bothFormatSimpleandFormatWithSpecusevm.formatin the expected way:
FormatSimplepops a single value and formats with an empty spec, so[..., value] -> [..., formatted].FormatWithSpecpops spec then value, giving[..., value, spec] -> [..., formatted], which aligns with CPython’s f-string stack behaviour.The stack effects and error propagation look consistent with surrounding opcode handlers; no changes needed from a correctness or style perspective.
2255-2271: Conversion-only step for f-strings matches !s/!r/!a semanticsThe refactored
convert_valuecorrectly:
- Pops the value, applies the selected
ConvertValueOparg(Str,Repr,Ascii,None), and- Pushes the converted result back on the stack.
Using
value.str(vm)?,value.repr(vm)?, andbuiltins::ascii(value, vm)?mirrors Python’s!s,!r, and!aconversions, and theNonecase preserves the original object for later formatting. This is a faithful and clear separation of conversion from formatting.crates/vm/src/stdlib/ast/other.rs (1)
15-25: ConversionFlag deserialization correctly delegates to ConvertValueOpargThe new
ast_from_objectpath cleanly reusesConvertValueOparg::from_op_argand then maps its variants back toruff::ConversionFlag, preserving the 0–3 encoding and existing error behavior for invalid values. No issues here.crates/compiler-core/src/bytecode.rs (1)
818-850: New ConvertValue/Format instructions have consistent semantics*The added
ConvertValue,FormatSimple, andFormatWithSpecvariants are wired up consistently:
- Stack effects match the documented behavior:
ConvertValue(pop value, push converted) → net 0.FormatSimple(pop value, push formatted) → net 0.FormatWithSpec(pop spec+value, push formatted) → net -1.- Disassembly prints
CONVERT_VALUEwith theConvertValueOpargdisplay andFORMAT_SIMPLE/FORMAT_WITH_SPECmnemonics, which are clear and match the new bytecode split.This all looks correct and in line with the instruction docs.
Also applies to: 1727-1730, 1964-1966
crates/codegen/src/compile.rs (2)
35-41: ConvertValueOparg import usage is appropriateBringing
ConvertValueOparginto scope alongsideInstructionand related bytecode types is consistent with its later use in f-string emission; no issues here.
5613-5685: F-string compilation now cleanly separates ConvertValue and Format steps*The new
compile_fstring_elementsflow looks sound:
ConversionFlagis mapped 1:1 toConvertValueOparg, andConvertValueis only emitted forStr/Repr/Ascii, leavingNoneto go straight into formatting.- Debug text is emitted as its own string element and bumps
element_count, so cases likef"{x=}"correctly produce two parts ("x="andrepr(x)).- The auto-
reprbehavior for debug text only kicks in when there is no explicit conversion and noformat_spec, which matches the intended=, debug-style semantics.- Formatting is always explicit:
- Without a spec:
FormatSimple(calls__format__("")).- With a spec: nested spec f-string is compiled to a string, then
FormatWithSpecis used, matching the new bytecode contract.element_countstill exactly matches the number of top-level string fragments contributed by literals/debug text/formatted expressions, so the finalBuildString { size: element_count }remains correct.This aligns with the new
ConvertValue/FormatSimple/FormatWithSpecinstruction set and keeps evaluation order and stack shape consistent with the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode.rs (1)
53-65: Consider making the unreachable None case more explicit.The
Displayimplementation returns an empty string forNonewith a comment indicating this case should never be reached. For better debugging and clarity, consider usingunreachable!()or a more descriptive message.impl fmt::Display for ConvertValueOparg { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { let out = match self { Self::Str => "1 (str)", Self::Repr => "2 (repr)", Self::Ascii => "3 (ascii)", - // We should never reach this. `FVC_NONE` are being handled by `Instruction::FormatSimple` - Self::None => "", + Self::None => { + // FVC_NONE is handled by Instruction::FormatSimple; this should never be displayed + unreachable!("ConvertValueOparg::None should not be displayed; use FormatSimple instead") + } }; write!(f, "{out}") } }
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
crates/compiler-core/src/bytecode.rs(4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/compiler-core/src/bytecode.rs
🔇 Additional comments (3)
crates/compiler-core/src/bytecode.rs (3)
820-852: LGTM! Clear documentation for the new instruction variants.The three new instructions (
ConvertValue,FormatSimple,FormatWithSpec) are well-documented with Python pseudocode showing their stack behavior. This separation of conversion and formatting logic aligns well with the PR objective of matching CPython 3.13's approach.
1729-1731: LGTM! Stack effects and display formatting are correct.The stack effects correctly reflect the behavior of each instruction:
ConvertValue: 0 (pops 1, pushes 1)FormatSimple: 0 (pops 1, pushes 1)FormatWithSpec: -1 (pops 2, pushes 1)The disassembly formatting appropriately uses the
ConvertValueOpargDisplay implementation which already provides readable output.Also applies to: 1966-1968
69-79: Clarify the purpose of accepting 255 infrom_op_argif it's specific to Ruff compatibility.CPython 3.13 encodes conversion flags as 0 (none), 1 (str), 2 (repr), and 3 (ascii)—no 255 mapping. The comment referencing "Ruff
ConversionFlag::Noneis-1i8" is accurate since RustPython and Ruff share parser/bytecode code (Ruff originated from RustPython's parser). However, clarify whether the0 | 255 => Self::Nonemapping is:
- Intentional cross-project compatibility for Ruff-generated bytecode, or
- Unnecessary defensive code that should be removed for CPython 3.13 alignment.
If intentional, add a clearer comment explaining the Ruff interop scenario. If unnecessary, simplify to
0 => Self::None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode.rs (1)
53-65: Consider clearer handling for the unreachable None variant.The comment states "We should never reach this" for
ConvertValueOparg::None, which makes sense sinceFormatSimplehandles cases without conversion. However, returning an empty string is confusing and could mask bugs. Consider usingunreachable!()or a debug assertion if this case truly should never occur during disassembly.impl fmt::Display for ConvertValueOparg { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { let out = match self { Self::Str => "1 (str)", Self::Repr => "2 (repr)", Self::Ascii => "3 (ascii)", - // We should never reach this. `FVC_NONE` are being handled by `Instruction::FormatSimple` - Self::None => "", + Self::None => { + debug_assert!(false, "ConvertValueOparg::None should not be displayed; FormatSimple handles FVC_NONE"); + "" + } }; write!(f, "{out}") } }
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
crates/compiler-core/src/bytecode.rs(6 hunks)crates/vm/src/frame.rs(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/frame.rscrates/compiler-core/src/bytecode.rs
🧬 Code graph analysis (1)
crates/vm/src/frame.rs (1)
crates/vm/src/stdlib/builtins.rs (1)
ascii(68-72)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (5)
crates/compiler-core/src/bytecode.rs (3)
663-723: LGTM! Clear documentation for new f-string instructions.The new instruction variants (
ConvertValue,FormatSimple,FormatWithSpec) are well-documented with clear stack operation descriptions and Python examples. The separation of conversion from formatting aligns with CPython 3.13's approach.
1723-1725: LGTM! Stack effects are correct.The stack effects for the new instructions are accurate:
ConvertValue: 0 (pop value, push converted value)FormatSimple: 0 (pop value, push formatted string)FormatWithSpec: -1 (pop spec and value, push formatted string)
67-85: Clarify whether dual-value support (0 and 255) is permanent or transitional.The implementation accepts both
0(CPython standard) and255(Ruff's -1i8 encoded as u32) asConvertValueOparg::None. The inline comment documents the Ruff encoding, but does not clarify whether this dual-value support is a permanent compatibility requirement or transitional migration code. If permanent, the current documentation is sufficient. If transitional, add a TODO comment with a target version or timeline for removal.crates/vm/src/frame.rs (2)
854-868: LGTM! Correct implementation of formatting instructions.The
FormatSimpleandFormatWithSpecinstruction handlers correctly implement Python's formatting semantics:
FormatSimpleformats with an empty format specFormatWithSpecpops and uses the provided format spec- Both properly call
vm.format()and manage the stack
2256-2272: Verify whenConvertValuewithConvertValueOparg::Noneis emitted.The
convert_valuemethod handlesConvertValueOparg::Noneby returning the value unchanged (line 2267). Based on the Display implementation comment in bytecode.rs,FormatSimpleshould handle cases without conversion, suggestingConvertValueshould only be emitted forStr,Repr, orAscii. Is theNonecase here defensive programming, or are there scenarios whereConvertValue { oparg: None }is actually emitted? IfConvertValueis never emitted withoparg: None, consider removing this match arm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
crates/vm/src/frame.rs(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/frame.rs
🧬 Code graph analysis (1)
crates/vm/src/frame.rs (1)
crates/vm/src/stdlib/builtins.rs (1)
ascii(68-72)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Check Rust code with clippy
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (2)
crates/vm/src/frame.rs (2)
712-715: LGTM! ConvertValue instruction handler correctly delegates to helper function.The instruction handler follows the standard pattern and properly extracts the oparg before delegating to the
convert_valuehelper function.
2256-2272: LGTM! Conversion logic correctly implements all four conversion types.The function signature has been properly updated to use
ConvertValueOparg, and all conversion types (Str,Repr,Ascii,None) are correctly handled with appropriate error propagation.
| bytecode::Instruction::FormatWithSpec => { | ||
| let spec = self.pop_value(); | ||
| let value = self.pop_value(); | ||
| let formatted = vm.format(&value, spec.downcast::<PyStr>().unwrap())?; | ||
| self.push_value(formatted.into()); | ||
|
|
||
| Ok(None) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace .unwrap() with proper error handling.
The .downcast::<PyStr>().unwrap() call on Line 864 will panic if the format spec is not a PyStr. While the compiler may guarantee this, using .unwrap() without a safety comment violates Rust best practices for error handling.
Consider one of these approaches:
Option 1: Return a proper error
bytecode::Instruction::FormatWithSpec => {
let spec = self.pop_value();
let value = self.pop_value();
- let formatted = vm.format(&value, spec.downcast::<PyStr>().unwrap())?;
+ let spec_str = spec.downcast::<PyStr>().map_err(|_| {
+ vm.new_type_error("format spec must be a string")
+ })?;
+ let formatted = vm.format(&value, spec_str)?;
self.push_value(formatted.into());
Ok(None)
}Option 2: Use expect() if compiler guarantees spec is always PyStr
bytecode::Instruction::FormatWithSpec => {
let spec = self.pop_value();
let value = self.pop_value();
- let formatted = vm.format(&value, spec.downcast::<PyStr>().unwrap())?;
+ // SAFETY: compiler guarantees spec is always PyStr
+ let formatted = vm.format(&value, spec.downcast::<PyStr>().expect("format spec must be PyStr"))?;
self.push_value(formatted.into());
Ok(None)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| bytecode::Instruction::FormatWithSpec => { | |
| let spec = self.pop_value(); | |
| let value = self.pop_value(); | |
| let formatted = vm.format(&value, spec.downcast::<PyStr>().unwrap())?; | |
| self.push_value(formatted.into()); | |
| Ok(None) | |
| } | |
| bytecode::Instruction::FormatWithSpec => { | |
| let spec = self.pop_value(); | |
| let value = self.pop_value(); | |
| let spec_str = spec.downcast::<PyStr>().map_err(|_| { | |
| vm.new_type_error("format spec must be a string") | |
| })?; | |
| let formatted = vm.format(&value, spec_str)?; | |
| self.push_value(formatted.into()); | |
| Ok(None) | |
| } |
🤖 Prompt for AI Agents
crates/vm/src/frame.rs around lines 861-868: the code calls
.downcast::<PyStr>().unwrap() which can panic; replace the unwrap with proper
handling — either attempt the downcast and on Err return a VM-level error (e.g.,
propagate a TypeError / conversion error with a clear message including the
actual type of spec) so the VM returns an error instead of panicking, or if you
truly have a compiler/logic guarantee that spec is always a PyStr, replace
unwrap() with expect("explain guarantee here") and add a short comment
documenting why the guarantee holds; implement the change consistently with
surrounding error propagation patterns in this file.
Implements python/cpython#6132
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.