Various refactoring and changes to increase coverage#429
Various refactoring and changes to increase coverage#429aaronliu0130 wants to merge 6 commits intodevelopfrom
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (18)
✅ Files skipped from review due to trivial changes (9)
🚧 Files skipped from review as they are similar to previous changes (8)
📝 WalkthroughWalkthroughRemoved debug/sysconfig usages and simplified literal/width handling in cpplint; reworked CLI test setup to use pytest tmp_path_factory fixtures; added unit tests; raised pytest coverage threshold and enabled subprocess coverage; updated many sample configs, expected-output artifacts, and sample source files. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
I'm going to make a PR that changes the formatting to get rid of that toml conflict tomorrow. |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (4)
samples/silly-sample/.cpplint (1)
3-4: Consider removing or activating the no-opset noparentpath.Line 3 explicitly documents that Line 4 currently does nothing in this sample hierarchy. Optional cleanup: either add a parent config to make this setting testable, or remove the inert directive/TODO to keep fixture intent crisp.
If helpful, I can draft a minimal parent-config sample layout so
set noparentis exercised explicitly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@samples/silly-sample/.cpplint` around lines 3 - 4, The TODO notes that "set noparent" is currently a no-op; either remove the inert directive and its TODO to keep the sample focused, or add a minimal parent config so "set noparent" is exercised. To fix, either delete the "set noparent" line and the preceding TODO comment, or create a parent-config file and update the sample so the child .cpplint inherits from that parent, demonstrating that "set noparent" prevents inheritance; reference the literal "set noparent" directive and the TODO comment when applying the change.samples/silly-sample/cfg.def (1)
1-1:cfg.defis now exercising the CLI path for header extensions.Line 1 sets
--headers=won the command line, and cpplint applies that during argument parsing before it loads.cpplint. If this fixture is meant to validate config-file behavior, move the header-extension setting into the config file and keep the command line focused on--config.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@samples/silly-sample/cfg.def` at line 1, The test fixture is passing --headers=w on the command line (cfg.def) which forces CLI parsing before .cpplint is loaded; remove the --headers=w token from cfg.def and instead add the equivalent headers = w setting into the .cpplint config file so the header-extension behavior is exercised via config-file parsing; leave the command line in cfg.def focused on --config=.cpplint (and any other non-header CLI flags) so the test validates config-file behavior.samples/silly-sample/simple.def (1)
1-1: Consider keepingsimple.defon the zero-flag path.Line 1 now hard-codes
--output=emacs --counting=detailed, so this fixture no longer doubles as the baseline no-arg smoke test. I'd split the detailed/emacs variant into its own fixture and leavesimple.defonsrc/*.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@samples/silly-sample/simple.def` at line 1, Move the hard-coded flags out of samples/silly-sample/simple.def so simple.def remains the zero-flag baseline: replace the single-line entry that currently contains "--output=emacs --counting=detailed src/*" with just "src/*", and create a separate fixture (e.g., samples/silly-sample/simple.detailed.def or simple.emacs.def) that contains the full "--output=emacs --counting=detailed src/*" line; update any test harness references to point to the new detailed fixture if they were relying on the old path.cpplint_clitest.py (1)
229-229: Use a meaningful class docstring (or remove the empty one).
"""""")adds noise without context.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpplint_clitest.py` at line 229, Replace the empty triple-quote docstring currently present in the class at the location with either a concise, meaningful docstring that summarizes the class purpose, responsibilities, and any important usage notes (or remove the empty string entirely if no class-level documentation is needed); locate the empty """ """ immediately under the class definition in cpplint_clitest.py and update it to a short descriptive sentence or delete it to eliminate noise.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpplint_clitest.py`:
- Around line 231-236: The fixture _tmp_root currently creates a single
class-scoped temp directory (request.cls._root) which causes shared state across
tests (CPPLINT.cfg and test.cpp are written/reused) — change _tmp_root to a
function-scoped fixture so each test gets its own isolated temp dir, create the
dir per-test (e.g. assign to request.node._root or request.instance._root for
class methods) and still remove it after yield; update any tests that referenced
request.cls._root to use the per-test attribute so CPPLINT.cfg and test.cpp are
created in a test-local directory.
In `@cpplint.py`:
- Around line 6204-6206: The current check uses re.match(r"\d+", tok) and treats
any token starting with digits as an integer literal, causing false negatives;
replace that prefix match with a full-match using re.fullmatch and a regex that
fully matches C/C++ integer-literal forms (decimal, hex 0[xX]..., binary
0[bB]..., octal 0... and optional suffixes like u, U, l, L, ul, ull, etc.) so
that the token variable tok is only considered a numeric literal when it
entirely conforms to an integer-literal pattern; update the check in the
function/block referencing tok to use that full-match regex.
In `@samples/silly-sample/junit.def`:
- Around line 6-127: The testsuite header in the JUnit XML (<testsuite ...
tests="123" failures="123">) does not match the actual payload (2 <testcase> and
2 <failure> entries); update the testsuite counts to reflect the real payload
(tests="2" failures="2") or change the generator that creates the file so it
emits one <testcase> per lint finding instead of collapsing them into two
testcases; locate the testsuite element in junit.def and either adjust the
attributes tests and failures to 2 or modify the emitter logic that produces the
<testcase> nodes so the counts and nodes remain consistent.
---
Nitpick comments:
In `@cpplint_clitest.py`:
- Line 229: Replace the empty triple-quote docstring currently present in the
class at the location with either a concise, meaningful docstring that
summarizes the class purpose, responsibilities, and any important usage notes
(or remove the empty string entirely if no class-level documentation is needed);
locate the empty """ """ immediately under the class definition in
cpplint_clitest.py and update it to a short descriptive sentence or delete it to
eliminate noise.
In `@samples/silly-sample/.cpplint`:
- Around line 3-4: The TODO notes that "set noparent" is currently a no-op;
either remove the inert directive and its TODO to keep the sample focused, or
add a minimal parent config so "set noparent" is exercised. To fix, either
delete the "set noparent" line and the preceding TODO comment, or create a
parent-config file and update the sample so the child .cpplint inherits from
that parent, demonstrating that "set noparent" prevents inheritance; reference
the literal "set noparent" directive and the TODO comment when applying the
change.
In `@samples/silly-sample/cfg.def`:
- Line 1: The test fixture is passing --headers=w on the command line (cfg.def)
which forces CLI parsing before .cpplint is loaded; remove the --headers=w token
from cfg.def and instead add the equivalent headers = w setting into the
.cpplint config file so the header-extension behavior is exercised via
config-file parsing; leave the command line in cfg.def focused on
--config=.cpplint (and any other non-header CLI flags) so the test validates
config-file behavior.
In `@samples/silly-sample/simple.def`:
- Line 1: Move the hard-coded flags out of samples/silly-sample/simple.def so
simple.def remains the zero-flag baseline: replace the single-line entry that
currently contains "--output=emacs --counting=detailed src/*" with just "src/*",
and create a separate fixture (e.g., samples/silly-sample/simple.detailed.def or
simple.emacs.def) that contains the full "--output=emacs --counting=detailed
src/*" line; update any test harness references to point to the new detailed
fixture if they were relying on the old path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e71562f2-9258-4ed1-a059-4245727784e9
📒 Files selected for processing (18)
cpplint.pycpplint_clitest.pycpplint_unittest.pypyproject.tomlsamples/silly-sample/.cpplintsamples/silly-sample/CPPLINT.cfgsamples/silly-sample/cfg.defsamples/silly-sample/eclipse.defsamples/silly-sample/filters.defsamples/silly-sample/flags.defsamples/silly-sample/gsed.defsamples/silly-sample/includeorder_cfirst.defsamples/silly-sample/junit.defsamples/silly-sample/sed.defsamples/silly-sample/simple.defsamples/silly-sample/src/sillycode.cppsamples/silly-sample/src/sillycode.wsamples/silly-sample/vs7.def
| # Matches both decimals and other integer literals | ||
| # since the latter have to be prefixed by 0-something | ||
| continue |
There was a problem hiding this comment.
Tighten integer-literal matching at Line 6204 to avoid false negatives.
re.match(r"\d+", tok) only checks the prefix, so any token starting with digits is treated as constant. This can incorrectly bypass VLA warnings for malformed tokens.
Suggested fix
- if re.match(r"\d+", tok):
- # Matches both decimals and other integer literals
- # since the latter have to be prefixed by 0-something
+ if re.fullmatch(
+ r"(?:"
+ r"0[xX][0-9a-fA-F]+|"
+ r"0[bB][01]+|"
+ r"0[0-7]*|"
+ r"[1-9][0-9]*"
+ r")(?:[uU](?:ll?|LL?)?|(?:ll?|LL?)[uU]?)?",
+ tok,
+ ):
+ # Match full integer literals (decimal/octal/hex/binary with common suffixes)
continue🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpplint.py` around lines 6204 - 6206, The current check uses re.match(r"\d+",
tok) and treats any token starting with digits as an integer literal, causing
false negatives; replace that prefix match with a full-match using re.fullmatch
and a regex that fully matches C/C++ integer-literal forms (decimal, hex
0[xX]..., binary 0[bB]..., octal 0... and optional suffixes like u, U, l, L, ul,
ull, etc.) so that the token variable tok is only considered a numeric literal
when it entirely conforms to an integer-literal pattern; update the check in the
function/block referencing tok to use that full-match regex.
There was a problem hiding this comment.
In what case would a VLA meet our current loose matching?
We assume that the code compiles.
There was a problem hiding this comment.
@aaronliu0130
Short answer: whenever the size expression contains a digit-prefixed token glued to another operator/identifier that our splitter doesn’t break apart.
Details:
- In CheckLanguage() we split the bracket expression with:
- re.split(r"\s|+|-|*|/|<<|>>]", expr)
- That does not split on |, &, ^, %, ?:, &&, ||, or digit separators (’). If there’s no space around those operators, the whole thing stays as one token.
- Our current check uses re.match(r"\d+", tok). That accepts any token that merely starts with digits, even if the rest of the token makes it a non-constant expression.
Concrete, compiling cases that can slip through and suppress the VLA warning:
- int n; int a[1|n]; // token is "1|n" (no spaces) → re.match(r"\d+") hits "1" → treated as constant → false negative
- int n; int a[8&n]; // token "8&n" → same issue
- int n; int a[1^n]; // token "1^n" → same issue
- int n; int a[1?n:2]; // token "1?n:2" (no spaces) → same issue
- int n; int a[0x10|n]; // token "0x10|n" → same issue
- int n; int a[10'000|n];// C++14 separators; token "10'000|n" → same issue
These are all valid syntax; under gnu++ modes many compilers accept VLAs as an extension, so the code “compiles,” but we still want to flag it.
Why the suggested change helps:
- Switching from re.match(r"\d+") to a re.fullmatch(...) for integer literals ensures tokens like "1|n" or "8&n" no longer pass as “constants,” so is_const becomes False and we emit the VLA warning as intended.
- It also correctly accepts hex/bin/octal forms and common integer suffixes (u/U, l/L, ll/LL), which our check should treat as constant.
Minimal, safe tweak at Line 6204:
- Replace the prefix match with a full match for integer literals:
# Line 6204 (approx)
if re.fullmatch(
r"(?:"
r"0[xX][0-9a-fA-F]+|"
r"0[bB][01]+|"
r"0[0-7]*|"
r"[1-9][0-9]*"
r")(?:[uU](?:ll?|LL?)?|(?:ll?|LL?)[uU]?)?",
tok,
):
continueOptional hardening (future):
- Broaden the splitter to also split on r"[|&^%?:()]" (and maybe &&, ||). That reduces reliance on regex subtlety and makes the VLA heuristic more robust to operator adjacency.
If you’d like, I can prep a tiny unit test matrix for the examples above to prevent regressions.
There was a problem hiding this comment.
Unless there's evidence otherwise, I feel like such magic numbers/expressions are too rare to evaluate that much regex or operator splitting for.
There was a problem hiding this comment.
@aaronliu0130
Totally hear you on avoiding heavy work in a hot loop. Quick context on why this came up:
- With the current prefix check at Line 6204 (re.match(r"\d+", tok)), tokens that start with a digit but also include another operator (that our splitter doesn’t split on) get treated as “constant,” so VLA warnings can be skipped. Examples that can compile (e.g., under gnu++ modes) but we’d still want to flag:
- int n; int a[1|n];
- int n; int a[8&n];
- int n; int a[1?n:2];
- int n; int a[0x10|n];
To keep cost near-zero while fixing those cases, two lightweight options:
Option A (gated, precompiled fullmatch; very low overhead)
- Precompile once at module load and only call it when the token starts with a digit.
- No extra splitting; keeps the fast path for non-digit tokens.
Python outline:
# near other compiled patterns
_INT_LITERAL_RE = re.compile(
r"(?:0[xX][0-9a-fA-F]+|0[bB][01]+|0[0-7]*|[1-9][0-9]*)"
r"(?:[uU](?:ll?|LL?)?|(?:ll?|LL?)[uU]?)?$"
)
# in the VLA check loop (Line 6204 vicinity)
if tok and tok[0].isdigit() and _INT_LITERAL_RE.fullmatch(tok):
continueOption B (ultra-cheap, no regex at all)
- Treat digit-starting tokens as “constant” only if they consist solely of characters from a small whitelist of integer-literal chars; otherwise, fall through and we’ll flag the VLA:
_ALLOWED = set("0123456789abcdefABCDEFxXbBuUlL'") # apostrophes already stripped earlier in elision
if tok and tok[0].isdigit() and all(c in _ALLOWED for c in tok):
continueThis is O(len(tok)) and avoids the regex engine entirely. It’s a bit less exact than Option A but still blocks the “1|n”/“8&n” style false negatives with minimal code.
If you’d prefer to keep the current behavior, I’m fine to drop this for now. Alternatively, I can:
- wire up a tiny unit test matrix demonstrating the false-negative cases above, or
- open a follow-up issue to evaluate Option A/B later.
What’s your preference: keep-as-is, go with Option A (gated, precompiled), or Option B (ultra-cheap whitelist)?
✏️ Learnings added
Learnt from: aaronliu0130
Repo: cpplint/cpplint PR: 429
File: cpplint.py:6204-6206
Timestamp: 2026-04-14T02:11:23.080Z
Learning: Preference note: In cpplint.py CheckLanguage() VLA detection, maintainers prefer to avoid heavier regex or operator splitting in hot loops unless there is concrete evidence of misclassification; acceptable compromises include a gated precompiled fullmatch for integer literals or a very cheap character-whitelist check.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
Bumps coverage from 90% to 95%. In unittest from 90% to 92%, and in clitest from 0% to 82%. See https://pytest-cov.readthedocs.io/en/latest/subprocess-support.html
Was used for debug prints, which lowered test coverage. Modern debugging tools are powerful enough to not need debug prints.
(linewidth) removed workaround for python<=3.2 (vla) removed redundant hexadecimal check add todos questioning carveout for vertical whitespace after function header, two competing mechanisms for determining whether we're in a namespace, and absence of stdin clitest (latter of which would require some refactoring itself to reuse run_and_check() on an existing def but with stdin) since this work to increase coverage made me review a good chunk of the code (because of the wide distribution of lines missed), i also did some random refactoring along the way: converting to an fstring and surrounding a complicated conditional-precedence–based logic with parentheses
increase cpplint.py's test coverage to 96% add quite a few problems to sillycode; update related defs add a def for each missing output format make simple have counting=detailed test extensions and root CLI flags in flags.def move headers from cfg to command-line in cfg.def as default .cfg already tests headers in config test some default values for flags add unittests for corner cases we already have in cpplint.py
Also migrate pytest settings to toml
806156f to
b813d8d
Compare
please do NOT squash. there's a lot of changes, so i think this is best merged with rebase and merge, i.e. preserving the separation between commits
we can now bump the coverage requirement from 89% to 95%, now that the test coverage is 96%.
Summary by CodeRabbit
Refactor
Tests
Chores