@edwinyyyu commented on Dec 12, 2025

Purpose of the change

When the search limit is very low (e.g. 1-3) and/or episodes are very small (e.g. on the order of 0-50 characters), the returned results are likely to be poor, because the episode budget is filled by weighted index proximity rather than by relevance.

Description

The current implementation fills the return-limit budget by taking the episodes closest to the nuclear episode first. When a good episode happens to fall within the context of a bad nuclear episode, the reranker may give that context a high score. The budget is then spent on the bad nuclear episode and its nearest neighbors, so the good episode is not included.

This issue is easy to hit, especially in contrived scenarios where consecutive episodes are likely to be unrelated.
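
The failure mode described above can be reproduced with a small sketch. This is not the project's code: `Episode`, `fill_by_proximity`, and the sample data are hypothetical, and plain index distance stands in for the actual weighting, which this description does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    index: int  # position in the episode stream (hypothetical field)
    text: str


def fill_by_proximity(episodes, nuclear_index, limit):
    """Keep up to `limit` episodes, nearest to the nuclear episode first.

    Assumption: unweighted index distance, ties broken by lower index.
    """
    by_distance = sorted(
        episodes, key=lambda e: (abs(e.index - nuclear_index), e.index)
    )
    # Return the kept episodes in their original order.
    return sorted(by_distance[:limit], key=lambda e: e.index)


# One context whose nuclear episode ("ok") is irrelevant to a query such as
# "What is my favorite color?", while a relevant episode sits two steps away.
context = [
    Episode(0, "I like hiking."),
    Episode(1, "ok"),  # nuclear episode
    Episode(2, "sure"),
    Episode(3, "My favorite color is blue."),  # the episode we want
]

result = fill_by_proximity(context, nuclear_index=1, limit=1)
```

With `limit=1`, the only episode kept is the nuclear one, even if the reranker scored this context highly because of the episode at index 3.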

Discovered via the failing integration tests in #771.
The fix is to rank individual episodes when the search limit is low.
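
A minimal sketch of ranking episodes individually, under stated assumptions: `rank_individual_episodes` and `word_overlap` are hypothetical names, and the word-overlap score is only a toy stand-in for a real reranker. It shows the general idea of scoring each episode on its own rather than scoring whole contexts, not the change as implemented in this pull request.

```python
from typing import Callable, Sequence


def rank_individual_episodes(
    query: str,
    contexts: Sequence[Sequence[str]],
    limit: int,
    score: Callable[[str, str], float],
) -> list[str]:
    """Flatten the contexts, score each episode alone, keep the top `limit`."""
    # Deduplicate while preserving first-seen order, since contexts can overlap.
    unique = list(dict.fromkeys(ep for ctx in contexts for ep in ctx))
    # sorted() is stable, so equally scored episodes keep their original order.
    ranked = sorted(unique, key=lambda ep: score(query, ep), reverse=True)
    return ranked[:limit]


def word_overlap(query: str, episode: str) -> float:
    """Toy relevance score: number of shared lowercase words."""

    def words(s: str) -> set[str]:
        return {w.strip(".,?!").lower() for w in s.split()}

    return float(len(words(query) & words(episode)))


contexts = [["I like hiking.", "ok", "sure", "My favorite color is blue."]]
top = rank_individual_episodes(
    "What is my favorite color?", contexts, limit=1, score=word_overlap
)
```

Here the single returned episode is the relevant one, regardless of how far it sits from the nuclear episode of its context.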

Type of change


  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (does not change functionality, e.g., code style improvements, linting)
  • Documentation update
  • Project Maintenance (updates to build scripts, CI, etc., that do not affect the main project)
  • Security (improves security without changing functionality)

How Has This Been Tested?

  • Unit Test
  • Integration Test
  • End-to-end Test
  • Test Script (please provide)
  • Manual verification (list step-by-step instructions)

Checklist

  • I have signed the commit(s) within this pull request
  • My code follows the style guidelines of this project (See STYLE_GUIDE.md)
  • I have performed a self-review of my own code
  • I have commented my code
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added unit tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • Confirmed all checks passed
  • Contributor has signed the commit(s)
  • Reviewed the code
  • Run, Tested, and Verified the change(s) work as expected

Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
@edwinyyyu changed the title from "Replace weighted index proximity with another ranking" to "Improve low-limit search results" on Dec 12, 2025
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
@edwinyyyu
Contributor Author

#792 includes these changes.

@edwinyyyu marked this pull request as draft on December 18, 2025, 19:03