[feat] plugins: Add "define <term>" Plugin to Show Definitions in Searches #5366
    @dataclass(eq=True, frozen=False)
    class Definition:
        """
Unfortunately, the concept of result types has not yet been fully implemented and is therefore somewhat difficult to recognize. In the model of result types, all answers are of the type BaseAnswer.
In terms of data structure, this class here is very similar to the type Translations ..
searxng/searx/result_types/answer.py
Lines 98 to 99 in edfa71c
    class Translations(BaseAnswer, kw_only=True):
        """Answer type with a list of translations.
which has a list of examples ..
searxng/searx/result_types/answer.py
Lines 124 to 125 in edfa71c
    translations: "list[Translations.Item]"
    """List of translations."""
Hi @return42, thanks for taking a peek!
I wanted to clarify as it may not have been clear through my implementation & PR description.
The Definition class here isn’t an answer type, it's only an internal data model used to normalize and store entries fetched from a provider (currently Wiktionary). The intended purpose is API abstraction to let new providers easily plug in later by returning the same structure.
The BaseAnswer emitted by the plugin is a standard Translations card, and the Definition is converted here in DefineHandler with the custom template:
searxng/searx/plugins/define.py
Lines 272 to 335 in 77967b2
    def _make_definitions_answer(self, definitions: list[Translations.Item], url: str | None) -> Translations:
        if not self.provider.name:
            raise NotImplementedError("no real provider was specified")
        return Translations(
            translations=definitions,
            url=url,
            engine=self.provider.name,
            template="answer/define.html",
        )

    def _build_translations_answer(self, definitions: list[Definition]) -> Translations | None:
        """
        Turn a list[Definition] into a single Translations answer
        - One item per part-of-speech
        - Each item carries multiple definitions & a few examples
        """
        if not definitions:
            return None
        by_pos: dict[str, list[Definition]] = {}
        for d in definitions:
            by_pos.setdefault(d.normalized_pos, []).append(d)
        url = next((d.source_url for d in definitions if d.source_url), None)
        items: list[Translations.Item] = []
        # Dedupe identical senses
        for pos, entries in by_pos.items():
            seen: set[tuple[str, str]] = set()
            definitions_text: list[str] = []
            examples: list[str] = []
            for d in entries:
                if d.dedupe_key in seen:
                    continue
                seen.add(d.dedupe_key)
                if len(definitions_text) < self.max_definitions:
                    definitions_text.append(d.definition)
                for ex in d.examples:
                    if len(examples) >= self.max_examples:
                        break
                    if ex not in examples:
                        examples.append(ex)
            # Title of the card uses the first definition's word
            word = entries[0].word if entries else ""
            pos = entries[0].normalized_pos if entries else "unknown"
            items.append(
                Translations.Item(
                    text=word,  # now just "liquid"
                    transliteration=f"({pos})" if pos != "unknown" else "",
                    definitions=definitions_text,
                    examples=examples,
                    synonyms=[],
                )
            )
        return self._make_definitions_answer(definitions=items, url=url)
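For readers who have not opened the diff, the provider abstraction described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the attribute names `word`, `definition`, `examples`, `source_url`, `normalized_pos` and `dedupe_key` appear in the excerpt, but the field defaults, the way `dedupe_key` is computed, the `fetch()` method name and the `StaticProvider` class are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol


@dataclass(eq=True, frozen=False)
class Definition:
    """Provider-independent record for one sense of a word (assumed layout)."""

    word: str
    definition: str
    pos: str = ""
    examples: list = field(default_factory=list)
    source_url: Optional[str] = None

    @property
    def normalized_pos(self) -> str:
        # Assumption: part of speech is compared case-insensitively.
        return self.pos.strip().lower() or "unknown"

    @property
    def dedupe_key(self) -> tuple:
        # Assumption: two senses are "identical" when part of speech and
        # whitespace/case-normalized definition text match.
        return (self.normalized_pos, " ".join(self.definition.lower().split()))


class DefinitionProvider(Protocol):
    """What a provider has to offer; `fetch` is a hypothetical method name."""

    name: str

    def fetch(self, term: str) -> list: ...


class StaticProvider:
    """Toy provider backed by a dict; stands in for a real backend."""

    name = "static"

    def __init__(self, entries: dict):
        self._entries = entries

    def fetch(self, term: str) -> list:
        return list(self._entries.get(term.lower(), []))


def unique_senses(provider: DefinitionProvider, term: str) -> list:
    """Return the provider's definitions with duplicate senses removed."""
    seen = set()
    unique = []
    for d in provider.fetch(term):
        if d.dedupe_key in seen:
            continue
        seen.add(d.dedupe_key)
        unique.append(d)
    return unique
```

Because every provider returns the same `Definition` records, the conversion into a `Translations` answer shown in the excerpt would not need to know which backend produced them.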
Bnyro left a comment:
Personally I feel like this is so similar to normal engine implementations, that this should be a normal engine instead of a plugin.
Apart from that, we have already implemented something very similar: by using !define tree, you can get definitions from all the definition search engines we already have.
So in my opinion it would make sense to take only the Wiktionary API implementation from this PR and use it to create a Wiktionary engine for the define category that returns definitions, similar to https://github.com/searxng/searxng/blob/master/searx/engines/wordnik.py.
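A rough sketch of what such an engine module might look like, assuming the usual `request()` / `response()` pair of SearXNG engine modules. The endpoint path, the `"en"` language key, the JSON field names (`partOfSpeech`, `definitions`, `definition`) and the plain `url` / `title` / `content` result dicts are assumptions for illustration; they would need to be checked against the Wiktionary REST API and wordnik.py before use.

```python
import json
from urllib.parse import quote

# Engine metadata; "define" is the category named in the comment above.
categories = ["define"]
paging = False

# Assumed endpoint of the Wiktionary REST API.
base_url = "https://en.wiktionary.org/api/rest_v1/page/definition/{term}"


def request(query, params):
    """Fill in the outgoing request URL for the search term."""
    params["url"] = base_url.format(term=quote(query))
    return params


def response(resp):
    """Turn the JSON payload into one result per non-empty definition."""
    results = []
    data = json.loads(resp.text)
    # Assumption: entries are grouped by language code, "en" for English.
    for entry in data.get("en", []):
        pos = entry.get("partOfSpeech", "")
        for sense in entry.get("definitions", []):
            text = sense.get("definition", "")
            if not text:
                continue
            # Note: the text may still carry inline HTML from the API;
            # this sketch passes it through unchanged.
            results.append({"url": str(resp.url), "title": pos, "content": text})
    return results
```

With an engine like this in the define category, a query such as !define tree would include it alongside the existing definition engines.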
|
@Bnyro hello, thanks for the follow-up! I didn't know that. After learning about what you suggested, I would prefer to integrate with an existing feature rather than create a plugin. However, I do believe there is more for your project to benefit from in my design than simply a provider. The existing implementation has room to improve from a usability & UX standpoint:
|
|
I agree, your layout definitely has some aesthetic advantages over the current one. Perhaps it'd make sense to adjust the existing translations template with your suggestions, so that we don't have to maintain two separate templates? This way translations could also benefit from the better UI design :)
Absolutely! I also completely agree that converging would be much easier to maintain, and better from both a backend and a UX design perspective. I think it would be great to improve more than just the definitions. If I'm being honest, I learned about SearXNG a few days ago and set up a server for personal use, but I only started looking through the codebase yesterday. I don't have enough experience with it to be comfortable modifying a core feature without understanding the implications. That is specifically why I chose the plugin approach, as it doesn't break anything 😊 With that being said, I'd be happy to help however I can within my "green" understanding of the project, and you’re more than welcome to use any part of my code in whatever way fits best. I just wanted to share my implementation for personal use with the hope that it could be beneficial in some way.
|
To change the style of the existing translations answer template, you can simply:
To convert your wiktionary code into an engine, you can just create a new file here: https://github.com/searxng/searxng/tree/master/searx/engines and read through the developer documentation: https://docs.searxng.org/dev/index.html. Perhaps this should be fairly simple given the fact that you already implemented this plugin. If you have specific questions, please don't hesitate to ask. :) |
|
Great @Bnyro, thanks for the suggestions! I've been digging further into the source and noticed there are already a few dictionary-style providers. I honestly hadn't realized that earlier, very interesting! Just to confirm, are these currently the only engine modules using the
Additionally, from an architectural standpoint, I'd like to understand your stance on generated HTML. Since you mentioned improving the translation/definition framework as a whole, I'd prefer that any refactor toward an engine also include the opportunity to bring the existing modules in line with a consistent structure, for example by adopting some of the normalization and layout ideas from my plugin.
To expand a bit on the HTML standardization, my current implementation preserves formatting like bold, italics, and internal links (rewritten as local SearXNG searches). Since most of the dictionary engines already simplify structured HTML, this could be implemented fairly cleanly. If I restrict the rendered HTML in the template to only a handful of tags, does that approach align with your expectations around sanitization?
Finally, I'd like to clarify configuration possibilities. Ideally, it would be great to expose a few small options:
I know scope creep is a common concern and I want to be respectful of that, but at the same time, there are some features like the HTML parsing which would require modifying how the template is rendered, and also, how data is passed into it. So I just want to be clear about where the line is for you on that. I mainly want to make sure these goals fit within your long-term direction before I start moving pieces around. Let me know your thoughts! |
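To make the sanitization question concrete, here is a minimal sketch of a tag allowlist with link rewriting, built on the standard library's `html.parser`. It is not the PR's implementation: the allowlist, the `sanitize_inline()` name and the exact rewrite of `/wiki/...` links to `/search?q=define+term` are assumptions based on the description above.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import quote_plus, unquote

ALLOWED_TAGS = {"b", "i", "em", "strong", "a"}  # assumed allowlist
DROP_CONTENT = {"script", "style"}  # tags whose text is discarded as well


class InlineSanitizer(HTMLParser):
    """Keep a few inline tags, drop all attributes, rewrite /wiki/ links."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self._links = []  # one bool per open <a>: was the tag emitted?

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            is_wiki = href.startswith("/wiki/")
            self._links.append(is_wiki)
            if is_wiki:
                term = unquote(href[len("/wiki/"):]).replace("_", " ")
                query = quote_plus("define " + term)
                self.out.append(f'<a href="/search?q={query}">')
        elif tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # attributes are never copied

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == "a":
            # Close the link only if its opening tag was emitted.
            if self._links and self._links.pop():
                self.out.append("</a>")
        elif tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.out.append(escape(data))  # text is always re-escaped


def sanitize_inline(html_text: str) -> str:
    parser = InlineSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

The sketch does not repair unbalanced markup, and any allowlist approach still raises the question of which tags belong in the set.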
I have to repeat, the concept behind are the Result Types .. searxng/searx/result_types/answer.py Lines 120 to 122 in 2cdbbb2
.. in this concept, a result type like Translations(
...
template="answer/define.html"
) On Client side, we have the CSS (aka LESS) https://github.com/searxng/searxng/tree/master/client/simple/src/less See #5366 (comment) For example, the searxng/searx/result_types/answer.py Lines 156 to 161 in 2cdbbb2
HTML template is in and the CSS layout of the client is in
Don't do it, the engines, plugins and answerers .. the code that produce the If you want a UI different from
see my comment above: #5366 (comment) I hope my explanations have been able to show “where” we want to go... As already mentioned above, this concept of result-types is currently still under development and therefore somewhat difficult to understand .. If I had more time, I would rebuild your PR... However, that would take longer, as I currently have many other unfinished tasks to deal with. Perhaps you would like to try to restructure the code in this PR yourself so that it fits better into our concept of result types? On the other hand, if you have some time and patience with me, we can also do this together. |
|
@return42 as Bnyro and I were discussing before, we were considering a restructure of However, your "don’t do it" framing and the way my questions were brushed aside comes across as fairly dismissive, and if your intent is to collaborate as you suggested, I don't feel like my perspective is being heard. My goal was to clarify architectural intent and find a collaborative path forward, and it seems as though you're firmly against the idea of restructuring the I respect that, and that's why I asked about where your scope lies. If that is in fact your stance, I don't have an interest in continuing to work on this. Though, you're more than welcome to use my code. |
My English isn't the best .. in essence, I offered you my support.
If you prefer to restructure existing code .. then yes, please consider a restructure of |
Okay, great. With this perspective, could you both take another look through my previous comment? I want to ensure that we're on the same page, and that I understand the scope of these changes before proceeding with a design. |
Sure, please do so 👍
I think you can do that in a different follow-up PR. |
I don't have a final opinion about the HTML markup. In order to separate data from its representation, we have recently made considerable efforts to remove all HTML tags from the result items of the engines and have begun to build up the result types.
The requirement for an inline markup for paragraph-like types has been on my mind for quite some time: We could (as suggested above) use a subset of HTML tags for this purpose, but then we would have to agree on which tags are included in this set, how they can be combined, and much more. In my opinion, this approach is wrong because it would mean we would have to start developing our own markup. Another disadvantage would be the presence of a few HTML tags in the strings of the JSON output.
What we need are basic inline markups such as those found here in the reST markup. For this, we can implement a class
Something like (untested) ...

    import functools

    import msgspec
    from markdown_it import MarkdownIt


    # dict=True: functools.cached_property needs an instance __dict__
    class CommonMark(msgspec.Struct, dict=True):
        """String type with some `basic inline markup`_ (CommonMark_)

        .. _basic inline markup: https://docs.searxng.org/dev/reST.html#basic-inline-markup
        .. _CommonMark: https://commonmark.org/
        """

        p: str

        def __post_init__(self):
            # normalize whitespaces to one space
            self.p = " ".join(self.p.split())

        def __str__(self):
            return self.p

        @functools.cached_property
        def html(self):
            """Generates a string with HTML markup from CommonMark (uses markdown-it-py_).

            .. _markdown-it-py: https://github.com/executablebooks/markdown-it-py
            """
            return (
                MarkdownIt("commonmark", {"typographer": True}).enable(["replacements", "smartquotes"]).render(self.p)
            )

In line 137 we can replace the searxng/searx/result_types/answer.py Lines 131 to 138 in ea4a55f
And in the template we replace
|


What does this PR do?
This PR adds a new plugin: define, which provides inline dictionary definitions to search results, powered by Wiktionary. I use this feature a lot with search engines, and added this feature to my own instance for that reason. I also thought it would be beneficial to share with the community!
When a user searches for a query beginning with
the plugin fetches definitions via the official Wiktionary REST API and renders them as an answer card using a modified Translations type.
The plugin:
/wiki/... links to equivalent local searches (e.g. /search?q=define+term)
Implementation notes:
searx.network.get) for requests
DefinitionProvider
answer/define.html
Why is this change important?
This plugin gives users quick, inline dictionary definitions without opening another tab or engine result. This is similar to what major search engines provide.
It enhances the SearXNG user experience by:
This makes "define" queries faster, privacy-friendly, and self-contained in the UI.
How to test this PR locally?
settings.yml):
Rebuild or reload your dev environment:
Open SearXNG in your browser and verify that "Define" is enabled in Special Queries > Plugins:
From the home page, try queries such as:
define effervescence
define network
define chicken tender
You should see a compact definition card rendered at the top of the results:
To run automated tests:
make test
Expected:
pyright: no errors
pylint: 10.00/10
pytest: all tests pass
Author's checklist
pylint score: 10.00/10
Future ideas:
I'm open to any feedback, and I also can't wait to see how else this project evolves. I've been really happy with this tool in my network!