feat(plugins): add spellcheck plugin for query correction #5378
Introduces a new spellcheck plugin that suggests corrections for user queries. The plugin can utilize either the `pyspellchecker` library or Google Autocomplete for suggestions, configurable via settings. Updates to `requirements.txt` include the addition of `pyspellchecker`. Unit tests for the new plugin functionality have also been added.
I've only had a quick look at it so far, but I can already say that it's great work. Thank you very much 👍 Within the review of PR #3837, one important question has not yet been answered: we're not sure about the memory footprint of using pyspellchecker.
- Fix Black formatting issues (dictionary comprehensions, function signatures)
- Move imports to module level to resolve pylint C0415 warnings
- Replace Protocol ellipsis with NotImplementedError for better type safety
- Fix test mocking paths after import reorganization
- Add module docstring to test file
Consider adding EDIT: I see PR #3761 (comment) and issue #3760. Hard disagree on the final decision. But okay.
Something like this...

fail_fast: true
default_install_hook_types: [pre-commit, pre-push]
repos:
  # General hooks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v6.0.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-json
      - id: check-merge-conflict
      - id: check-added-large-files
        args: ['--maxkb=5000']
      - id: check-case-conflict
      - id: mixed-line-ending
        args: ['--fix=lf']
  - repo: local
    hooks:
      - id: format-python
        name: format Python code
        entry: ./manage format.python
        language: system
        types: [python]
        pass_filenames: false
        always_run: true
      - id: test-pylint
        name: run pylint
        entry: ./manage test.pylint
        language: system
        types: [python]
        pass_filenames: false
        always_run: true
  - repo: local
    hooks:
      - id: nose2
        name: run nose2 tests
        entry: uv run nose2 -F tests.unit
        # tests don't work with mp plugin
        # entry: uv run --frozen nose2 -qq -F --plugin=nose2.plugins.mp --processes=$(nproc) tests.unit
        language: python
        types: [python]
        pass_filenames: false
        always_run: true
        stages: [pre-push]  # takes a long time, only run on pre-push
- Extract correction logic into _get_correction helper method
- Reduce return statements and improve code organization
- Improve spellcheck plugin query correction logic
Amazing work! I've tested it with very good results.
It adds at most (based on my "trust me bro" tests) 20 MB of additional memory usage with pyspellchecker enabled. The issue here is that this library is CPU intensive: input should be capped at 80 characters per search, because long corrections block the synchronous WSGI workers and make the experience with concurrent requests very poor.
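A minimal sketch of the length cap suggested above, assuming the correction step is any callable that maps a query string to a corrected string. The names `MAX_SPELLCHECK_CHARS`, `should_spellcheck` and `correct_query` are hypothetical and not taken from the PR.

```python
from typing import Callable, Optional

# Cap proposed in this thread; the constant name is an assumption.
MAX_SPELLCHECK_CHARS = 80


def should_spellcheck(query: str, max_chars: int = MAX_SPELLCHECK_CHARS) -> bool:
    """Return True only for non-empty queries at or below the length cap."""
    stripped = query.strip()
    return 0 < len(stripped) <= max_chars


def correct_query(query: str, corrector: Callable[[str], str]) -> Optional[str]:
    """Run `corrector` unless the query is empty or over the cap.

    Returns the corrected query, or None when nothing should be suggested.
    """
    if not should_spellcheck(query):
        return None  # skip the CPU-heavy step entirely
    corrected = corrector(query)
    return corrected if corrected != query else None
```

Rejecting long queries before calling the corrector keeps the worst-case CPU time per request bounded, which is the concern for synchronous workers.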
Memory and/or CPU intensive tasks -- resource intensive tasks -- should not be performed by the SearXNG server. SearXNG is an aggregator and such tasks should be provided by other services to integrate them into SearXNG. For pyspellchecker, it would be possible (not part of SearXNG) to implement a (local) service that is integrated into SearXNG via HTTP (like other engines, answerer or plugin).
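One way such a separate local service could look, sketched with the Python standard library only. Everything here is an assumption for illustration: the endpoint shape (`GET /?q=...`), the JSON fields, the names `suggest`, `SpellcheckHandler` and `make_server`, and the tiny `VOCABULARY` list, which stands in for a real dictionary backend such as pyspellchecker.

```python
import json
from difflib import get_close_matches
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Tiny stand-in vocabulary; a real service would load a proper dictionary.
VOCABULARY = ["search", "engine", "privacy", "python", "weather"]


def suggest(query: str) -> str:
    """Replace each unknown word with its closest vocabulary entry, if any."""
    corrected = []
    for word in query.lower().split():
        if word in VOCABULARY:
            corrected.append(word)
            continue
        matches = get_close_matches(word, VOCABULARY, n=1, cutoff=0.8)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


class SpellcheckHandler(BaseHTTPRequestHandler):
    """Answers GET /?q=<query> with a JSON object holding the suggestion."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query).get("q", [""])[0]
        body = json.dumps({"query": query, "suggestion": suggest(query)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost only; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), SpellcheckHandler)
```

Calling `make_server(8099).serve_forever()` would start the service; the SearXNG side would then only issue a short HTTP request, leaving the CPU-heavy work in a separate process.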
Working on a solution

This PR closes: #3837
What does this PR do?
Implements a modern spellcheck plugin that provides "Try searching for:" suggestions for misspelled search queries. The plugin supports two providers:
- `pyspellchecker`
- Google Autocomplete
The plugin follows modern SearXNG plugin architecture with SOLID principles, protocol-based design, and comprehensive unit testing.
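A minimal sketch of what a protocol-based provider design can look like, not the PR's actual code. The names `SpellProvider`, `StaticProvider` and `get_correction` are hypothetical; `StaticProvider` is a toy stand-in for a real backend, and the `NotImplementedError` body mirrors the commit note in this thread about replacing the Protocol ellipsis.

```python
from typing import Optional, Protocol


class SpellProvider(Protocol):
    """Structural interface: any object with a matching `correct` method fits."""

    def correct(self, query: str, lang: str) -> Optional[str]:
        raise NotImplementedError


class StaticProvider:
    """Toy provider backed by a fixed lookup table of known misspellings."""

    def __init__(self, table: dict):
        self._table = table

    def correct(self, query: str, lang: str) -> Optional[str]:
        # `lang` is accepted to satisfy the interface; this toy ignores it.
        fixed = [self._table.get(word.lower(), word) for word in query.split()]
        result = " ".join(fixed)
        return result if result != query else None


def get_correction(provider: SpellProvider, query: str, lang: str = "en") -> Optional[str]:
    """Return a suggestion from any provider, or None if there is nothing to fix."""
    query = query.strip()
    if not query:
        return None
    return provider.correct(query, lang)
```

Because `Protocol` uses structural typing, `StaticProvider` does not inherit from `SpellProvider`; a second provider only needs the same `correct` signature to be swappable via configuration.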
Why is this change important?
How to test this PR locally?
Enable the plugin in :
Install dependencies:
Test with misspelled queries:
`google` provider; it supports languages `pyspellchecker` doesn't support:
Run unit tests:
Related issues
Screenshots