@tusharsoni52
Contributor

Summary

This PR introduces a new extension point, processEqualities(), inside DiffRowGenerator to address the HTML-escaping issue described in #219.

Previously, equal (unchanged) segments were passed through HTML normalization only once.
Because < and > were normalized to &lt; and &gt;, equal segments that began with & were grouped incorrectly by the inline diff: &lt; and &gt; differ only in one character, so the diff could split the entity itself. This produced invalid HTML when inlineDiffByWord(true) was used.
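
To make the failure mode concrete, here is a minimal sketch of how a fine-grained diff can split an entity. It is not the library's algorithm: the class `EntitySplitDemo`, its helper methods, and the `<del>` marker are hypothetical and only illustrate why `&lt;` vs `&gt;` yields broken markup once the shared `&` and `t;` are treated as unchanged.

```java
public class EntitySplitDemo {
    // Length of the longest common prefix of a and b.
    public static int commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    // Length of the longest common suffix that does not overlap the first `prefix` chars.
    public static int commonSuffix(String a, String b, int prefix) {
        int n = Math.min(a.length(), b.length()) - prefix;
        int i = 0;
        while (i < n && a.charAt(a.length() - 1 - i) == b.charAt(b.length() - 1 - i)) {
            i++;
        }
        return i;
    }

    // Wraps only the changed middle of oldText in a (hypothetical) <del> marker.
    public static String markChange(String oldText, String newText) {
        int p = commonPrefix(oldText, newText);
        int s = commonSuffix(oldText, newText, p);
        String head = oldText.substring(0, p);
        String mid = oldText.substring(p, oldText.length() - s);
        String tail = oldText.substring(oldText.length() - s);
        return head + "<del>" + mid + "</del>" + tail;
    }

    public static void main(String[] args) {
        // The entity is cut in the middle, so a browser no longer sees "&lt;".
        System.out.println(markChange("&lt;", "&gt;")); // prints &<del>l</del>t;
    }
}
```

The marker ends up inside the entity, which is the kind of output the PR describes as invalid HTML.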

What This PR Changes

✔ Adds a new protected hook:

protected String processEqualities(String text)

✔ All equality chunks now pass through this hook before becoming DiffRow(Tag.EQUAL, ...).
✔ Default behavior is no modification (fully backward compatible).
✔ Users can override this hook to customize HTML or text transformations after diffing, solving the core issue.
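
The hook pattern described above can be sketched in isolation. The code below does not use java-diff-utils: `EqualityHookSketch`, `Row`, `buildRows`, and `escapeHtml` are hypothetical stand-ins, and the line-by-line comparison is a deliberate simplification of real diffing. It only shows the idea that equal lines pass through a caller-supplied function, and that an identity function reproduces the old behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class EqualityHookSketch {
    // Hypothetical stand-in for the tag on a diff row.
    public enum Tag { EQUAL, CHANGE }

    // Hypothetical stand-in for a diff row: a tag plus the rendered text.
    public static final class Row {
        public final Tag tag;
        public final String text;

        public Row(Tag tag, String text) {
            this.tag = tag;
            this.text = text;
        }
    }

    // Compares lines pairwise (assumes both lists have the same size).
    // Equal lines are routed through the processEqualities hook;
    // changed lines are rendered without it.
    public static List<Row> buildRows(List<String> oldLines, List<String> newLines,
                                      Function<String, String> processEqualities) {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < oldLines.size(); i++) {
            String oldLine = oldLines.get(i);
            String newLine = newLines.get(i);
            if (oldLine.equals(newLine)) {
                rows.add(new Row(Tag.EQUAL, processEqualities.apply(oldLine)));
            } else {
                rows.add(new Row(Tag.CHANGE, oldLine + " -> " + newLine));
            }
        }
        return rows;
    }

    // Minimal HTML escaping used as an example hook; & is escaped first.
    public static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        List<Row> rows = buildRows(
                List.of("a < b", "x"),
                List.of("a < b", "y"),
                EqualityHookSketch::escapeHtml);
        for (Row row : rows) {
            System.out.println(row.tag + ": " + row.text);
        }
    }
}
```

Passing `Function.identity()` as the hook leaves equal lines untouched, which mirrors the "no modification" default stated in this PR.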

Why this Fix Works

The problem in #219 occurs because the diff logic compares normalized text, but the final output does not normalize equal segments the same way as diffs.

By exposing processEqualities(), we give users a consistent post-diff transformation path.
This ensures that equal text is handled symmetrically to diffed text and prevents HTML-breaking cases such as &lt; / &gt; being partially diffed.

Backward Compatibility

  • Existing behavior is preserved.
  • No breaking API changes.
  • The feature is opt-in — users override it only if needed.

Tests Included

✔ Added DiffRowGeneratorEqualitiesTest
✔ Verifies that:

  • Equalities now route through processEqualities()
  • Inline diffs still work correctly
  • The reproduction scenario from issue #219 (Line normalization before diff) no longer produces broken HTML

Fixes

Fixes #219

…rocessed for unchanged lines (Fixes java-diff-utils#219)

- Introduces a new protected method `processEqualities(String)` in DiffRowGenerator
- Builder exposes `.processEqualities(Function<String,String>)`
- Equal (unchanged) lines now invoke processEqualities()
- Inline diffs remain unchanged (as expected in Option 3)
- Added new test suite DiffRowGeneratorEqualitiesTest
- Updated documentation and Javadoc for new extension point
- Fixes java-diff-utils#219 (HTML escaping issue when inline diff by word)
@wumpz merged commit 696edc4 into java-diff-utils:master Dec 10, 2025
4 checks passed
@wumpz
Collaborator

wumpz commented Dec 10, 2025

thx for your pr
