-
Notifications
You must be signed in to change notification settings - Fork 1.4k
update _pyio, test_fileio from 3.13.11 and impl more io features #6560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughAdds BlockingIOError handling and character-count tracking (PyOSError.written), EINTR-aware retry logic across IO syscalls, exposes block-size fields on FileIO/TextIOWrapper with Representable repr changes, and appends Unix st_blksize/st_blocks to os.stat results. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant Py as Python IO wrapper
participant Sys as OS syscall
participant Err as PyOSError
participant VM as VM Error Handler
User->>Py: read/write call (may include buffer)
Py->>Sys: perform syscall
alt syscall returns data / bytes_written
Sys-->>Py: success
Py-->>User: return result
else syscall returns EINTR
Sys-->>VM: errno = EINTR
Note over VM,Py: Trap EINTR (PEP 475) and retry
VM->>Py: retry syscall
Py->>Sys: perform syscall (retry)
else syscall returns EAGAIN / would block
Sys-->>VM: errno = EAGAIN
VM->>Err: construct BlockingIOError(errno=EAGAIN)
Note over Err: set `written` from syscall partial result (if provided)
VM-->>User: raise BlockingIOError with characters_written
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yml Review profile: CHILL Plan: Pro ⛔ Files ignored due to path filters (6)
📒 Files selected for processing (5)
🧰 Additional context used📓 Path-based instructions (1)**/*.rs📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Files:
🧠 Learnings (4)📚 Learning: 2025-11-10T06:27:41.954ZApplied to files:
📚 Learning: 2025-08-26T05:20:54.540ZApplied to files:
📚 Learning: 2025-06-27T14:47:28.810ZApplied to files:
📚 Learning: 2025-06-27T14:47:28.810ZApplied to files:
🧬 Code graph analysis (2)crates/common/src/crt_fd.rs (2)
crates/vm/src/stdlib/io.rs (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
🔇 Additional comments (35)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
7d00ace to
fbb33c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
crates/vm/src/exceptions.rs (1)
1728-1744: Verify CPython behavior for numeric third argument detection.The logic correctly handles the case where BlockingIOError's 3rd argument is the characters written count rather than a filename. The condition chain properly:
- Checks if it's BlockingIOError
- Verifies the arg is not None and is numeric
- Converts to
isizeand stores it- Clears the filename that was prematurely set in
py_newHowever, using
PyNumber::checkmay be overly broad (includes floats). Consider whethertry_indexalone would suffice since it handles the integer conversion.🔎 Simplified condition
// BlockingIOError's 3rd argument can be the number of characters written if is_blocking_io_error && !vm.is_none(third_arg) - && crate::protocol::PyNumber::check(third_arg) && let Ok(written) = third_arg.try_index(vm) && let Ok(n) = written.try_to_primitive::<isize>(vm) {crates/vm/src/stdlib/io.rs (1)
3520-3558: Potential infinite recursion without ReprGuard.The
repr_strimplementation callsname.repr(vm)andmode.repr(vm)(lines 3536, 3544) without usingReprGuard. If thenameormodeobject has a__repr__that accesses thisTextIOWrapper(e.g., through a circular reference), this could cause infinite recursion.Compare with
repr_file_obj_name(line 1538) which usesReprGuard::enter(vm, obj)to prevent reentrant repr calls.🔎 Proposed fix to add ReprGuard
impl Representable for TextIOWrapper { #[inline] fn repr_str(zelf: &Py<Self>, vm: &VirtualMachine) -> PyResult<String> { let type_name = zelf.class().slot_name(); + let Some(_guard) = ReprGuard::enter(vm, zelf.as_object()) else { + return Ok(format!("<{type_name}>")); + }; let Some(data) = zelf.data.lock() else { // Reentrant call return Ok(format!("<{type_name}>")); };
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (6)
Lib/_pyio.pyis excluded by!Lib/**Lib/test/test_file_eintr.pyis excluded by!Lib/**Lib/test/test_fileio.pyis excluded by!Lib/**Lib/test/test_io.pyis excluded by!Lib/**Lib/test/test_subprocess.pyis excluded by!Lib/**Lib/test/test_tarfile.pyis excluded by!Lib/**
📒 Files selected for processing (3)
crates/vm/src/exceptions.rscrates/vm/src/stdlib/io.rscrates/vm/src/stdlib/os.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/os.rscrates/vm/src/exceptions.rscrates/vm/src/stdlib/io.rs
🧠 Learnings (3)
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/os.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/os.rs
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/stdlib/io.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run snippets and cpython tests (windows-latest)
🔇 Additional comments (23)
crates/vm/src/stdlib/os.rs (2)
896-902: LGTM!The Unix-specific stat fields
st_blksizeandst_blocksare correctly added with appropriate attributes. Using#[pystruct_sequence(skip)]ensures they're accessible as named attributes while not affecting tuple indexing.
955-959: LGTM!The platform-specific handling is correct:
- On Unix/WASI: reads actual values from the stat structure with appropriate conversion
- On Windows: defaults to
0as these fields aren't available- The
#[allow(clippy::useless_conversion)]is correctly placed for 32-bit platform compatibilitycrates/vm/src/exceptions.rs (3)
1595-1597: LGTM!Good design choice using
-1as a sentinel value for "not set". This allows the attribute to raiseAttributeErrorwhen accessed on exceptions where it wasn't initialized, matching CPython's behavior for BlockingIOError.
1967-1975: LGTM!The getter correctly returns the value when set, or raises
AttributeErrorwhen the sentinel value-1indicates the attribute was never initialized. This matches CPython's behavior for accessingcharacters_writtenon non-BlockingIOError exceptions.
1977-2006: The implementation correctly matches CPython behavior.The
-1value is CPython's internal sentinel indicating thecharacters_writtenattribute is unset. The getter raisesAttributeErrorwhen the value is-1, and deleting the attribute resets it to-1. CPython's setter allows storing-1(and any integer value) without validation—this is intentional design per Python issue 35504. The RustPython implementation here faithfully replicates this behavior, so no validation to reject-1is needed.crates/vm/src/stdlib/io.rs (18)
12-19: LGTM!The EAGAIN constant is correctly defined with platform-specific handling. The hardcoded value 11 is the standard POSIX value for EAGAIN, appropriate for WASM targets where libc may not be available.
216-243: LGTM!The
trap_eintrfunction correctly implements PEP 475 EINTR handling. The pattern of returningOk(None)to signal retry is clean, and the WASM fallback appropriately skips EINTR handling since signals aren't applicable there.
693-711: LGTM!Good addition of bounds validation for the
readintoreturn value. This ensures that invalid return values fromreadintoare caught early with a clear error message rather than causing silent data corruption or panics.
723-730: LGTM!Correct implementation of PEP 475 EINTR retry semantics for the
readallmethod.
944-957: LGTM!Correct BlockingIOError construction with the proper errno (EAGAIN), descriptive message, and characters_written field set to 0 since no bytes were written during this flush attempt.
1149-1158: LGTM!Correct BlockingIOError handling that properly tracks
characters_writtento include both the bytes successfully written to the raw stream and the bytes buffered. This allows callers to handle partial writes correctly.
1295-1329: LGTM!Good implementation of PEP 475 EINTR retry loops for both code paths in
raw_read. The buffer restoration on line 1311 ensures the buffer is properly returned even if an error occurs, preventing resource leaks.
1336-1348: LGTM!Excellent error chaining implementation. When the return value can't be converted to an integer, the code properly creates an OSError and chains the underlying TypeError as the
__cause__, providing better debugging context for developers.
1391-1398: LGTM!Consistent PEP 475 EINTR retry handling for the
read_allmethod.
1669-1672: LGTM!Good reordering to check writability before performing any work. This matches CPython's behavior and provides clearer error messages.
2506-2529: LGTM!Good defensive checks for embedded null characters in encoding and errors strings. These protect against potential C string issues when the values are passed to lower-level functions. Using
as_wtf8().as_bytes()for the errors check properly handles potential surrogate characters.
4674-4681: LGTM!Good addition of a runtime warning when a bool is used as a file descriptor. This matches CPython's behavior and helps users catch potential bugs where they accidentally pass a boolean instead of an actual file descriptor.
4732-4732: LGTM on fd ownership tracking.The
fd_is_ownvariable correctly tracks whether the fd was provided by the user or opened internally. This ensures user-provided fds are not inadvertently closed when errors occur during initialization (lines 4754-4755, 4768-4770, 4782-4784), which is the correct behavior matching CPython.Also applies to: 4753-4770
4866-4878: LGTM!Good improvement to check if the file is closed before returning the readable/writable status. This matches CPython behavior where these methods raise
ValueErroron closed files rather than returning stale mode information.
4921-4946: LGTM!Correct PEP 475 EINTR retry implementation for
FileIO::read. The loop properly callscheck_signals()on EINTR to allow Python signal handlers to run before retrying.
4966-4976: LGTM!Correct PEP 475 EINTR retry implementation for
FileIO::readinto, consistent with thereadmethod.
4796-4812: LGTM on FileIO repr implementation.The
repr_strmethod correctly usesslot_name()to get the dynamic type name, supporting subclasses properly. The representation format matches CPython's FileIO repr.
1224-1224: LGTM!Good defensive check to ensure the buffer is not empty before the modulo operation on line 1226, preventing a potential division by zero.
| // Check if this is BlockingIOError - need to handle characters_written | ||
| let is_blocking_io_error = | ||
| zelf.class() | ||
| .is(vm.ctx.exceptions.blocking_io_error.as_ref()); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for blocking_io_error usage in the codebase
rg -n "blocking_io_error" --type=rust -B2 -A2Repository: RustPython/RustPython
Length of output: 7432
🏁 Script executed:
#!/bin/bash
# Search for .is() method definition in the codebase
rg -n "\.is\(" --type=rust crates/vm/src | head -30Repository: RustPython/RustPython
Length of output: 2670
🏁 Script executed:
#!/bin/bash
# Search for fast_isinstance definition
rg -n "fn fast_isinstance" --type=rust -B2 -A5Repository: RustPython/RustPython
Length of output: 594
🏁 Script executed:
#!/bin/bash
# Look for PyType's .is() method definition
rg -n "fn is\(" --type=rust crates/vm/src/types -A5 | head -40Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Search for .is() method definition
rg -n "fn is\(" --type=rust crates/vm/src/object -B2 -A8Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Also search for .is() on Py type or PyType
rg -n "pub.*fn is\(" --type=rust -B2 -A8 | head -60Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Search more broadly for is method
rg "fn is" --type=rust crates/vm/src | grep -E "(fn is\(|fn is<)" | head -20Repository: RustPython/RustPython
Length of output: 131
🏁 Script executed:
#!/bin/bash
# Check the PyType implementation
fd -e rs "type" crates/vm/src/object | xargs grep -l "impl.*PyType" | head -5Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Look for .is method on Py type
rg -n "\.is\(" --type=rust crates/vm/src/obj | head -20Repository: RustPython/RustPython
Length of output: 122
🏁 Script executed:
#!/bin/bash
# Search in pytype.rs or similar
fd -e rs "pytype" crates/vm/srcRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Get the full implementation of .is() method
rg -n "fn is<T>" --type=rust -B2 -A8 crates/vm/src/object/ext.rsRepository: RustPython/RustPython
Length of output: 313
🏁 Script executed:
#!/bin/bash
# Check what fast_issubclass does
rg -n "fn fast_issubclass" --type=rust -B2 -A10Repository: RustPython/RustPython
Length of output: 1021
🏁 Script executed:
#!/bin/bash
# Check if BlockingIOError has any subclasses defined in the codebase
rg -n "BlockingIOError" --type=rust | grep -i "subclass\|extends\|parent\|base"Repository: RustPython/RustPython
Length of output: 1562
🏁 Script executed:
#!/bin/bash
# Check if there are any Python exception subclass examples in the codebase
rg -n "class.*\(.*Error\)" --type=rust | head -10Repository: RustPython/RustPython
Length of output: 47
Use fast_isinstance for subclass support.
The current code uses .is() which performs identity comparison only and won't match subclasses of BlockingIOError. This means subclasses won't receive the special 3rd-argument handling for characters_written.
Replace .is(vm.ctx.exceptions.blocking_io_error.as_ref()) with fast_isinstance(vm.ctx.exceptions.blocking_io_error) to properly support subclasses, consistent with how crates/stdlib/src/ssl/compat.rs handles the same exception.
🤖 Prompt for AI Agents
In crates/vm/src/exceptions.rs around lines 1714 to 1718, the code uses identity
comparison zelf.class().is(vm.ctx.exceptions.blocking_io_error.as_ref()) which
does not match subclasses; replace this check with a call to
fast_isinstance(vm.ctx.exceptions.blocking_io_error) on the object to allow
subclass matches (mirror the pattern used in crates/stdlib/src/ssl/compat.rs).
Ensure you reference the correct fast_isinstance function in scope (import or
fully-qualify if necessary) and keep the subsequent characters_written handling
unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/derive-impl/src/pystructseq.rs (1)
318-338: Apply cfg guards in try_from_elements field initialization.The
try_from_elementsimplementation collects field identifiers without their cfg attributes and uses them in struct initialization. If a field is conditionally compiled (e.g.,#[cfg(unix)]), the struct definition will change based on the target platform, but the field initialization intry_from_elementswill always reference all fields, causing compilation errors.🔎 Proposed fix
Modify the code to include cfg guards when generating field initializations:
let try_from_elements_trait_override = if try_from_object { - let visible_field_idents: Vec<_> = visible_fields.iter().map(|f| &f.ident).collect(); - let skipped_field_idents: Vec<_> = skipped_fields.iter().map(|f| &f.ident).collect(); + let visible_field_inits: Vec<_> = visible_fields + .iter() + .map(|f| { + let ident = &f.ident; + let cfg_attrs = &f.cfg_attrs; + if cfg_attrs.is_empty() { + quote! { #ident: iter.next().unwrap().clone().try_into_value(vm)?, } + } else { + quote! { + #(#cfg_attrs)* + #ident: iter.next().unwrap().clone().try_into_value(vm)?, + } + } + }) + .collect(); + let skipped_field_inits: Vec<_> = skipped_fields + .iter() + .map(|f| { + let ident = &f.ident; + let cfg_attrs = &f.cfg_attrs; + if cfg_attrs.is_empty() { + quote! { + #ident: match iter.next() { + Some(v) => v.clone().try_into_value(vm)?, + None => vm.ctx.none(), + }, + } + } else { + quote! { + #(#cfg_attrs)* + #ident: match iter.next() { + Some(v) => v.clone().try_into_value(vm)?, + None => vm.ctx.none(), + }, + } + } + }) + .collect(); quote! { fn try_from_elements( elements: Vec<::rustpython_vm::PyObjectRef>, vm: &::rustpython_vm::VirtualMachine, ) -> ::rustpython_vm::PyResult<Self> { let mut iter = elements.into_iter(); Ok(Self { - #(#visible_field_idents: iter.next().unwrap().clone().try_into_value(vm)?,)* - #(#skipped_field_idents: match iter.next() { - Some(v) => v.clone().try_into_value(vm)?, - None => vm.ctx.none(), - },)* + #(#visible_field_inits)* + #(#skipped_field_inits)* }) } }
♻️ Duplicate comments (1)
crates/vm/src/exceptions.rs (1)
1714-1745: Duplicate issue: Usefast_isinstancefor BlockingIOError check.Lines 1715-1717 use
.is()which only matches exact type identity, not subclasses. This issue was already flagged in previous review comments and should usefast_isinstance(vm.ctx.exceptions.blocking_io_error)instead.The rest of the logic correctly handles the 3rd argument as either
characters_written(when it's a number) orfilename(otherwise), with appropriate state management.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (6)
Lib/_pyio.pyis excluded by!Lib/**Lib/test/test_file_eintr.pyis excluded by!Lib/**Lib/test/test_fileio.pyis excluded by!Lib/**Lib/test/test_io.pyis excluded by!Lib/**Lib/test/test_subprocess.pyis excluded by!Lib/**Lib/test/test_tarfile.pyis excluded by!Lib/**
📒 Files selected for processing (5)
crates/common/src/crt_fd.rscrates/derive-impl/src/pystructseq.rscrates/vm/src/exceptions.rscrates/vm/src/stdlib/io.rscrates/vm/src/stdlib/os.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/common/src/crt_fd.rscrates/vm/src/exceptions.rscrates/derive-impl/src/pystructseq.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/io.rs
🧠 Learnings (4)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/exceptions.rscrates/vm/src/stdlib/io.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/derive-impl/src/pystructseq.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/os.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/os.rs
🧬 Code graph analysis (2)
crates/common/src/crt_fd.rs (2)
crates/vm/src/stdlib/os.rs (1)
ftruncate(1566-1568)crates/common/src/os.rs (1)
errno_to_winerror(100-124)
crates/vm/src/stdlib/io.rs (1)
crates/vm/src/exceptions.rs (6)
cstring_error(1172-1174)v(1994-1996)repr_str(738-742)zelf(1712-1712)new(547-555)new(1088-1090)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Check the WASM package and demo
🔇 Additional comments (35)
crates/common/src/crt_fd.rs (1)
336-353: LGTM! Platform-specific error handling is correct.The refactored ftruncate implementation properly handles the difference between Windows
_chsize_s(which returns errno directly) and POSIXftruncate(which returns -1 and sets errno). The conversion from errno to Windows error code viaerrno_to_winerroris appropriate.crates/vm/src/stdlib/os.rs (3)
896-904: LGTM! Unix-specific stat fields properly declared.The new
st_blksizeandst_blocksfields are correctly declared with appropriate platform guards and attributes, consistent with CPython's stat result structure.
957-962: LGTM! Correct handling of 32-bit platform compatibility.The use of
i64::from()with the clippy allow is appropriate for ensuring compatibility across 32-bit and 64-bit platforms where the underlying stat field types may differ.
981-984: LGTM! Field initialization is correct.The initialization of
st_blksizeandst_blocksis properly guarded and consistent with the field declarations.crates/vm/src/exceptions.rs (4)
1595-1597: LGTM! Field declaration is appropriate.The
writtenfield usingAtomicCell<isize>with -1 as a sentinel value for "not set" follows established patterns and provides thread-safe access for BlockingIOError'scharacters_writtenattribute.
1672-1672: LGTM! Field initialization is correct.Initializing
writtento -1 inpy_newcorrectly establishes the "not set" state.
1967-1975: LGTM! Getter correctly implements Python attribute semantics.The
characters_writtengetter appropriately raisesAttributeErrorwhen the field is unset (-1) and returns the value otherwise, matching standard Python attribute behavior.
1977-2006: LGTM! Setter correctly handles assignment and deletion.The
set_characters_writtensetter properly implements Python's descriptor protocol:
- Deletion raises
AttributeErrorwhen already unset (correct behavior)- Assignment converts via
try_indexandtry_to_primitivewith appropriate error handling- Uses
ValueErrorfor conversion failurescrates/vm/src/stdlib/io.rs (22)
216-244: LGTM: Well-designed EINTR trap pattern.The
trap_eintrfunction correctly implements PEP 475 semantics by:
- Detecting OSError with errno == EINTR
- Calling
vm.check_signals()to handle pending signals- Returning
Ok(None)to indicate retry neededThe WASM no-op version is appropriate since EINTR is not relevant in that environment.
693-712: LGTM: Proper bounds validation added.The bounds check for
readintoreturn values prevents potential issues from invalid implementations. The validation ensures the returned size is within[0, size]and provides a clear error message on violation.
723-730: LGTM: Correct EINTR retry pattern.The EINTR-aware retry loop properly implements PEP 475 for the
readallmethod, ensuring interrupted reads are automatically retried after signal handling.
944-957: LGTM: Proper BlockingIOError construction.The BlockingIOError is correctly constructed with:
- EAGAIN errno
- Descriptive message
- characters_written attribute set to 0
This matches CPython's behavior for non-blocking write failures.
1149-1159: LGTM: Correct partial write tracking.The BlockingIOError construction properly tracks
characters_writtenaswritten + buffer_len, allowing callers to know exactly how much data was buffered before the operation would block.
1224-1234: Verify loop condition change.The
while remaining > 0 && !self.buffer.is_empty()condition at line 1224 appears to add a buffer emptiness check. Ensure this aligns with the intended logic - specifically that the loop should exit when the buffer is empty, even ifremaining > 0.Based on the surrounding context, this appears correct, but confirming the behavior matches CPython would be helpful.
1295-1331: LGTM: Excellent EINTR handling with proper cleanup.The EINTR-aware retry loops are correctly implemented in both branches of
raw_read. Particularly good is the comment and code ensuring the buffer is always restored, even when an error occurs.
1336-1349: Excellent error handling with cause chain.The error handling for invalid
readintoreturn values is well-designed:
- Attempts to convert to
isize, capturing conversion errors- Constructs an informative
OSErrorabout the invalid return- Chains the original
TypeErroras the causeThis provides clear error context for debugging.
1391-1398: LGTM: Consistent EINTR handling.The EINTR retry loop in
read_allis consistent with the pattern used throughout this file.
1669-1672: LGTM: Proper writable check for truncate.The check ensures
truncateis only called on writable files, preventing invalid operations with a clear error message.
2506-2529: LGTM: Essential null character validation.The embedded null character checks prevent issues when these strings are used with C APIs:
- Line 2508: Checks encoding string for
\0- Line 2527: Checks errors string using
as_wtf8().as_bytes().contains(&0)to handle surrogates correctlyThis is a critical security/correctness measure.
2635-2643: LGTM: Representable trait addition.Adding
Representableto thewithclause enables the customrepr_strimplementation forTextIOWrapper.
3520-3558: Excellent repr implementation with robust error handling.The
repr_strimplementation forTextIOWrapperis well-designed:
- Handles reentrant calls gracefully (line 3524-3527)
- Checks for uninitialized state (line 3528-3530)
- Safely retrieves optional attributes using
get_attribute_opt- Includes relevant information: type name, name, mode, encoding
The use of let-chains for attribute extraction is clean and idiomatic.
4637-4638: LGTM: _blksize field addition.The
blksizefield is properly added asAtomicCell<i64>with a sensible default of 8192 bytes (DEFAULT_BUFFER_SIZE). This aligns with storingst_blksizefrom file stat information.Also applies to: 4660-4660
4673-4681: Good defensive warning for bool-as-FD.The warning when a bool is passed as a file descriptor helps catch a common Python gotcha where
boolis a subtype ofint. Since usingTrue/Falseas file descriptors (1/0) is almost certainly unintentional, this warning is valuable.
4732-4785: Excellent resource management for user-provided FDs.The
fd_is_owntracking properly prevents closing user-provided file descriptors when initialization fails. This is critical for correct resource management:
- Line 4732: Tracks ownership based on whether FD was passed by user
- Lines 4753-4755, 4767-4770, 4779-4785: On error paths, sets
zelf.fdto -1 if FD is not owned, preventing cleanup- Lines 4759-4763: Stores
st_blksizefrom stat for Unix systemsThis prevents closing FDs that the caller is still responsible for.
4795-4813: LGTM: Clean FileIO repr implementation.The repr implementation for
FileIOis well-structured and informative:
- Handles closed file state
- Includes type name, name/fd, mode, and closefd flag
- Consistent with the TextIOWrapper repr style
4844-4847: LGTM: Simple _blksize getter.The getter implementation is straightforward and correctly loads the value from the
AtomicCell.
4865-4878: LGTM: Improved readable/writable with error checking.Changing the return type from
booltoPyResult<bool>allows these methods to properly report errors when the file is closed, rather than silently returning false. This improves error reporting clarity.
4921-4951: LGTM: Comprehensive EINTR handling in read.Both branches of the
readmethod properly implement PEP 475:
- Lines 4922-4931: Sized read with EINTR retry
- Lines 4937-4946: Read-to-end with EINTR retry
Both use the correct pattern: check for
EINTR, callcheck_signals(), and retry.
4966-4977: LGTM: Consistent EINTR handling in readinto.The
readintomethod follows the same EINTR retry pattern asread, ensuring interrupted system calls are automatically retried per PEP 475.
12-19: The hardcoded EAGAIN value of 11 does not apply to WASI targets. The cfg conditionany(not(target_arch = "wasm32"), target_os = "wasi")explicitly includes WASI targets, which will uselibc::EAGAIN(value 6 per WASI specification). The hardcoded 11 only applies to pure WASM targets without WASI. For pure WASM environments, a placeholder errno value is acceptable since errno semantics don't apply.Likely an incorrect or invalid review comment.
crates/derive-impl/src/pystructseq.rs (5)
22-63: LGTM!The addition of
cfg_attrstoParsedFieldand the refactoring of accessor methods to return&ParsedFieldinstead of&Identis a clean design that enables downstream code to access conditional compilation attributes.
84-91: LGTM!The cfg attribute collection logic correctly identifies
#[cfg(...)]attributes and preserves them for downstream use while removing only#[pystruct_sequence(...)]attributes.Also applies to: 144-148
207-219: LGTM!Field index constants are correctly wrapped with their corresponding cfg attributes, ensuring constants are only defined when their fields are active.
255-290: No action required. The generatedinto_tuplemethod correctly uses cfg attributes within the quote! macro system. These are not literalvec![]with cfg attributes in source code—they are TokenStream expansions generated by the proc-macro. When the quote! macro expands#(#visible_tuple_items)*and#(#skipped_tuple_items)*, it properly produces valid Rust code with cfg guards in the vector literal. This pattern is verified to work across multiple uses in the codebase (StatResultData, TimesResultData, TerminalSizeData, StructTimeData) and compiles successfully.Likely an incorrect or invalid review comment.
221-253: The review comment's concern about cfg attributes on array elements is incorrect. In Rust, block expressions can have outer attributes, so the syntax#[cfg(...)] { expr }is valid for use in array literals. The generated code pattern is legitimate Rust code. This is confirmed by real-world usage in the codebase:StatResultDataincrates/vm/src/stdlib/os.rsis decorated with#[pystruct_sequence_data]and contains cfg-guarded fields (st_blksize,st_blockswith#[cfg(not(windows))]). The RustPython project compiles successfully, proving this pattern works as intended.
Summary by CodeRabbit
New Features
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.