Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Dec 12, 2025

with better collecting, async support and win32

I'd like to rename the script to mark_test_failures or something like that to shows better what it is

Summary by CodeRabbit

  • Chores
    • Improved test automation script with enhanced reliability, better handling of test identification, and cross-platform compatibility improvements.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 12, 2025

Walkthrough

The test-fixing script undergoes substantial refactoring to replace in-place line-wise modifications with a locate-modify-apply sequence. Results parsing is updated to extract failing tests from targeted patterns, new helper functions locate test definitions and apply modifications in reverse order, and file I/O transitions to Path-based methods with Windows executable support.

Changes

Cohort / File(s) Summary
Test result parsing and modification flow
scripts/fix_test.py
Results parsing now collects only FAIL/ERROR results and extracts test names via targeted pattern matching; replaces previous general test line parsing with new find_test_lineno() helper to locate test function definitions by class and method name; introduces apply_modifications() to apply multiple modifications in reverse line order; removes old modify_test/modify_test_v2 functions; updates file I/O to use Path.read_text() and write_text(); adds Windows support for RustPython executable naming (.exe suffix); adds input path validation; improves logging to report modification targets and counts.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–25 minutes

  • find_test_lineno() logic: Verify AST-based test function location handles both regular and async test methods correctly, and edge cases where tests cannot be located.
  • apply_modifications() logic: Confirm reverse-order application prevents line number offset issues when inserting decorators and comments.
  • Results parsing: Ensure FAIL/ERROR extraction correctly parses test output and extracts class/method names from target patterns.
  • Deduplication and file I/O: Verify modifications list deduplicates by (class, method) and that Path-based I/O handles encoding/edge cases appropriately.

Suggested reviewers

  • reactive-firewall

Poem

🐰 Hops through test logs with gleeful delight,
Locating lost tests and making them right,
No more line shifts—we reverse with care,
Each modification finds its true lair!
With Path and with parsing, the script now shines,
A refactored warren of fix-perfect lines! 🌙

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title 'Fix fix_test.py' is vague and generic. While it references the file being modified, it does not meaningfully convey the specific improvements made (better collecting, async support, Windows compatibility, or the refactored test-handling approach). Revise the title to be more descriptive of the key improvements, such as 'Refactor fix_test.py with async support and Windows compatibility' or 'Improve test failure collection and apply modifications robustly'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone added the skip:ci Skip running the ci label Dec 12, 2025
@youknowone youknowone marked this pull request as ready for review December 12, 2025 10:46
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
scripts/fix_test.py (3)

105-124: find_test_lineno() should iterate a.body directly and avoid over-restrictive structure assumptions.
Current ast.iter_fields() + key == "body" works but is indirect; also consider catching SyntaxError from ast.parse (script should warn and skip instead of crashing on a bad file).

 def find_test_lineno(file: str, test: list[str]) -> tuple[int, int] | None:
@@
-    a = ast.parse(file)
-    for key, node in ast.iter_fields(a):
-        if key == "body":
-            for n in node:
+    try:
+        a = ast.parse(file)
+    except SyntaxError:
+        return None
+
+    for n in a.body:
                 match n:
                     case ast.ClassDef():
                         if len(test) == 2 and test[0] == n.name:
                             for fn in n.body:
                                 match fn:
                                     case ast.FunctionDef() | ast.AsyncFunctionDef():
                                         if fn.name == test[-1]:
                                             return (fn.lineno, fn.col_offset)
                     case ast.FunctionDef() | ast.AsyncFunctionDef():
                         if n.name == test[0] and len(test) == 1:
                             return (n.lineno, n.col_offset)
     return None

140-153: Add a timeout (and surface stderr) to avoid hanging forever on known-hanging tests.
Docs say “skip the tests that hang” but the script itself can still hang in subprocess.run(...).

-    result = subprocess.run(
+    result = subprocess.run(
         [rustpython_location, "-m", "test", "-v", test_name],
         capture_output=True,
         text=True,
+        timeout=60 * 20,
     )

Optionally catch subprocess.TimeoutExpired and print a clear message with next steps.


156-188: Batch “locate → apply in reverse” flow is a good direction; just ensure the “platform/force” flags aren’t dead.
The new Path-based I/O + reverse-apply avoids offset bugs—nice. But --force and --platform are currently unused in this file (at least in the shown code); either wire them in (e.g., allow re-marking / platform-gated TODO text) or remove them to avoid misleading CLI. Based on repo guidance, ensure the inserted markers match the project’s required patterns for Lib/test/** edits (unittest.expectedFailure / unittest.skip("TODO: RustPython ...") + # TODO: RUSTPYTHON ...). Based on learnings, apply only those modifications to tests.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a5b721 and 7bd59fa.

📒 Files selected for processing (1)
  • scripts/fix_test.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • scripts/fix_test.py
🧠 Learnings (12)
📓 Common learnings
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/test/**/*.py : Only acceptable modifications to test files are: (1) Removing `unittest.expectedFailure` decorators and upper TODO comments when tests actually pass, (2) Adding `unittest.expectedFailure` decorators when tests cannot be fixed
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/test/**/*.py : Only acceptable modifications to test files are: (1) Removing `unittest.expectedFailure` decorators and upper TODO comments when tests actually pass, (2) Adding `unittest.expectedFailure` decorators when tests cannot be fixed

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/test/**/*.py : When modifying test files in `Lib/test/`, use `unittest.skip("TODO: RustPython <reason>")` or `unittest.expectedFailure` with `# TODO: RUSTPYTHON <reason>` comment to mark unsupported features

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/test/**/*.py : NEVER comment out or delete test code lines except for removing `unittest.expectedFailure` decorators and upper TODO comments

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/test/**/*.py : When a test cannot pass due to missing language features, keep it as expectedFailure and document the reason instead of modifying test code

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Use `cargo run -- script.py` instead of the standard `python` command to run and test Python code with RustPython

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Run `cargo test --workspace --exclude rustpython_wasm` to execute Rust unit tests

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-09-28T22:22:55.921Z
Learnt from: reactive-firewall
Repo: RustPython/RustPython PR: 6176
File: .github/workflows/Check_Tests.yml:133-141
Timestamp: 2025-09-28T22:22:55.921Z
Learning: In the RustPython project's CI-5974-Test-RustPython-Integration action, the override-rustpython-path input is marked as required but has runtime fallback logic that defaults to RUSTPYTHONPATH environment variable or "Lib" if neither is provided, making explicit specification unnecessary in most cases.

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/**/*.py : Minimize modifications to CPython standard library files in the `Lib/` directory; modifications should be minimal and only to work around RustPython limitations

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/**/*.py : Add a `# TODO: RUSTPYTHON` comment when modifications are made to files in the `Lib/` directory

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.py : In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only

Applied to files:

  • scripts/fix_test.py
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Use the `python` command explicitly when comparing behavior with CPython or running test suites against CPython

Applied to files:

  • scripts/fix_test.py
🪛 Flake8 (7.3.0)
scripts/fix_test.py

[error] 19-19: 'itertools' imported but unused

(F401)


[error] 20-20: 'platform' imported but unused

(F401)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Check Rust code with clippy
🔇 Additional comments (1)
scripts/fix_test.py (1)

55-95: Python code modifications not appropriate per project guidelines.

While the parsing logic concern has merit (the code relies on specific output markers like "Run tests sequentially" that could be fragile), the coding guidelines explicitly state: "In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only." If the underlying test runner or output format is problematic, address the root cause in Rust rather than modifying the Python parsing code.

Comment on lines 19 to +21
import itertools
import platform
import sys
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove unused imports (flake8 F401) to keep the script clean.
itertools (Line 19) and platform (Line 20) are unused per static analysis; sys (Line 21) is used.

 import argparse
 import ast
-import itertools
-import platform
 import sys
 from pathlib import Path
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import itertools
import platform
import sys
import argparse
import ast
import sys
from pathlib import Path
🧰 Tools
🪛 Flake8 (7.3.0)

[error] 19-19: 'itertools' imported but unused

(F401)


[error] 20-20: 'platform' imported but unused

(F401)

🤖 Prompt for AI Agents
In scripts/fix_test.py around lines 19 to 21, the imports include unused modules
causing flake8 F401; remove the unused imports itertools and platform and keep
only the used import sys (i.e., replace the three-import block with a single
import for sys), ensuring any references to those removed modules are also
removed or updated if present elsewhere in the file.

Comment on lines 98 to 103
def path_to_test(path) -> list[str]:
return path.split(".")[2:]
# path format: test.module_name[.submodule].ClassName.test_method
# We need [ClassName, test_method] - always the last 2 elements
parts = path.split(".")
return parts[-2:] # Get class name and method name

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the structure of the repository and locate the file
find . -name "fix_test.py" -type f

Repository: RustPython/RustPython

Length of output: 86


🏁 Script executed:

# Get the file size to see if we can read it entirely
wc -l scripts/fix_test.py

Repository: RustPython/RustPython

Length of output: 88


🏁 Script executed:

# Read the file around the mentioned lines 98-103 and 169-182
cat -n scripts/fix_test.py | head -200

Repository: RustPython/RustPython

Length of output: 8340


🏁 Script executed:

# Search for all calls to path_to_test in the file
rg "path_to_test" scripts/fix_test.py -n

Repository: RustPython/RustPython

Length of output: 157


🏁 Script executed:

# Let's also verify the unittest output format assumption by checking if there are any tests or documentation
find . -name "*.py" -type f -exec grep -l "Run tests sequentially" {} \; 2>/dev/null | head -5

Repository: RustPython/RustPython

Length of output: 86


Bug confirmed: path_to_test() misidentifies test class and method names.

The function discards test.name (the actual method name) and returns the wrong class name. Given unittest's output format test_method (package.module.ClassName) ... RESULT, the parsing at lines 78-88 correctly extracts test.name = "test_method" and test.path = "package.module.ClassName". However, path_to_test(test.path) splits the path and returns parts[-2:] (e.g., ["module", "ClassName"] instead of ["ClassName", "test_method"]), causing find_test_lineno() to search for the wrong class name and use the class name as the method name. This prevents test discovery and causes marked tests to be skipped.

The suggested fix—passing both test.path and test.name to derive [class_name, method_name]—correctly addresses the root cause.

🤖 Prompt for AI Agents
In scripts/fix_test.py around lines 98 to 103, path_to_test currently only takes
the dotted path and returns parts[-2:], which misidentifies class and method;
update path_to_test to accept both the dotted path and the test.name (method
name) and return [ClassName, method_name] (extract the last element of the path
as ClassName and use the provided test.name for method_name), then update all
callers to pass both values so find_test_lineno searches for the correct class
and method.

Comment on lines +127 to 137
def apply_modifications(file: str, modifications: list[tuple[int, int]]) -> str:
"""Apply all modifications in reverse order to avoid line number offset issues."""
lines = file.splitlines()
fixture = "@unittest.expectedFailure"
# Sort by line number in descending order
modifications.sort(key=lambda x: x[0], reverse=True)
for lineno, col_offset in modifications:
indent = " " * col_offset
lines.insert(lineno - 1, indent + fixture)
lines.insert(lineno - 1, indent + "# TODO: RUSTPYTHON")
return "\n".join(lines)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

apply_modifications() should avoid double-inserting when a test is already marked.
Re-running the script will currently stack # TODO: RUSTPYTHON and @unittest.expectedFailure repeatedly. Consider checking the preceding lines before inserting.

 def apply_modifications(file: str, modifications: list[tuple[int, int]]) -> str:
@@
     for lineno, col_offset in modifications:
         indent = " " * col_offset
-        lines.insert(lineno - 1, indent + fixture)
-        lines.insert(lineno - 1, indent + "# TODO: RUSTPYTHON")
+        insert_at = lineno - 1
+        prev = lines[max(0, insert_at - 3):insert_at]
+        if any(l.strip() == fixture for l in prev):
+            continue
+        lines.insert(insert_at, indent + fixture)
+        lines.insert(insert_at, indent + "# TODO: RUSTPYTHON")
     return "\n".join(lines)
🤖 Prompt for AI Agents
In scripts/fix_test.py around lines 127 to 137, apply_modifications currently
inserts the fixture and TODO unconditionally which causes duplicate lines on
repeated runs; modify the loop to check the existing surrounding lines before
inserting (for example examine lines[lineno-2] and lines[lineno-3] safely) and
only insert the "@unittest.expectedFailure" and the "# TODO: RUSTPYTHON" if they
are not already present (use trimmed comparisons and guard against index
errors), preserving indentation and original ordering so rerunning the script is
idempotent.

for fn in n.body:
match fn:
case ast.FunctionDef():
case ast.FunctionDef() | ast.AsyncFunctionDef():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider to make this script to generate the json payload lib_updater is expecting, and (and use its python API to patch the files). both are doing the same thing (ish)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skip:ci Skip running the ci

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants