Find/use 'next'/'prev' links not identified within HTML tags #81
+ w3m parses HTML tags to try to intelligently guess the 'next' and
'previous' pages for a URL so it can proceed there when the beginning
or end of a page is reached. However, much website software doesn't
embed that information within HTML tags, and relies solely on the
plain text between the opening and closing HTML 'A' tags. An example
of this is the software for the emacs mailing list archive! This
commit adds logic to find and use that information.
+ This commit does introduce new behavior in the patched functions
that changes the meaning of the prefix-arg!
+ The most convenient way to give the user flexibility to change
his/her mind about which text labels to use for navigation was to
use the prefix-arg for two of the scroll commands. However, those
functions were already using the prefix-arg for an option to
scroll n lines instead of a screen-full. After giving the matter
thought, it seemed to me that someone wanting finely-tuned
scrolling could/should/was probably using two other functions
anyway (w3m-scroll-up and w3m-scroll-down) which default to one
line and have the optional prefix-arg for n lines.
+ Minimizing the effect of the behavior change
+ The commit adds a defcustom for the default number of lines to scroll
when not performing fine-tuned scrolling (i.e., when not using
functions w3m-scroll-up and w3m-scroll-down). When that variable
is NON-NIL, functions w3m-scroll-up-or-next-url and
w3m-scroll-down-or-previous-url use that number instead of a
screen-full.
+ The commit adds a function w3m-set-scroll-interval to conveniently
change the default scroll amount of the defcustom, but only for
the current session.
+ The result is that scrolling is more convenient, because someone
who wants to scroll n lines instead of a screen-full will
probably want to do that repeatedly. Without the patch, that user
would need to use the prefix-arg and numeric entry for each scroll.
With this commit, the user only needs to set the value once, and can
do so as a command, without having to manually evaluate a variable.
The user does need to remember or guess that in order to "set the w3m
scroll interval" you perform M-x w3m-set-scroll-interval.
+ Benefit of logic at point-of-use. The logic is not performed during
page parsing (all pages), only when the feature is needed (very
rarely).
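The session-wide scroll interval described above could be sketched roughly as follows. All names prefixed `my-w3m-` are hypothetical stand-ins, not the identifiers or the code in the patch itself:

```elisp
;; Hypothetical sketch; names prefixed `my-w3m-' are assumptions,
;; not the actual definitions added by this commit.
(defcustom my-w3m-scroll-interval nil
  "Number of lines to scroll by default, or nil for a screen-full."
  :type '(choice (const :tag "Screen-full" nil) integer)
  :group 'convenience)

(defun my-w3m-set-scroll-interval (lines)
  "Set `my-w3m-scroll-interval' to LINES for the current session only.
A non-positive LINES restores scrolling by a screen-full."
  (interactive "nLines per scroll (0 for a screen-full): ")
  ;; `setq' changes the value in memory without saving the customization.
  (setq my-w3m-scroll-interval (and (> lines 0) lines)))

(defun my-w3m-scroll-up ()
  "Scroll up by `my-w3m-scroll-interval' lines, or a screen-full if nil."
  (interactive)
  ;; `scroll-up' with a nil argument scrolls by nearly a full screen.
  (scroll-up my-w3m-scroll-interval))
```

With something like this, `M-x my-w3m-set-scroll-interval RET 5 RET` would make every later paging command move five lines, with no per-command prefix-arg needed.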
|
An improvement would be to replace the hard-coded regexes with a defcustom list of CONS. That way it could support other languages and other pairs of keywords. |
+ Use a defcustom to support multiple navigation labels in multiple languages.
+ BUGFIX: A label such as 'next in thread' or 'previous in thread' leads to an endpoint page that isn't the end of the entire series of pages. In that case, prompt the user for a new choice instead of forcing the user to run the command a second time with a prefix arg.
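As an illustration of the "defcustom list of CONS" idea, here is a minimal sketch. The variable name, the function name, the sample regexps, and the `(TEXT . URL)` representation of links are all assumptions made for the example, not the structure used in the commit:

```elisp
(require 'cl-lib)

;; Hypothetical defcustom: each element pairs a regexp for the "next"
;; label with a regexp for the matching "previous" label.
(defcustom my-w3m-navigation-labels
  '(("\\`next\\'" . "\\`prev\\(?:ious\\)?\\'")
    ("\\`next in thread\\'" . "\\`previous in thread\\'")
    ("\\`siguiente\\'" . "\\`anterior\\'"))
  "List of (NEXT-REGEXP . PREVIOUS-REGEXP) pairs matched against link text."
  :type '(repeat (cons regexp regexp))
  :group 'convenience)

(defun my-w3m-find-labelled-url (links pair direction)
  "Return the URL of the first of LINKS whose text matches PAIR.
LINKS is a list of (TEXT . URL) conses; PAIR is an element of
`my-w3m-navigation-labels'; DIRECTION is `next' or `previous'."
  (let ((regexp (if (eq direction 'next) (car pair) (cdr pair)))
        (case-fold-search t))           ; labels vary in capitalization
    (cdr (cl-find-if (lambda (link) (string-match-p regexp (car link)))
                     links))))
```

Keeping the pairs in a user option means that supporting another language or another pair of keywords only requires customizing the list, not editing the search code.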
+ Tough typing with BIDI
|
;; I'm not sure how to make a reply sent in the GitHub page so to CC'd Thank you for the great feature (if works). But have you tried Though (while ... (add-to-list (quote VAR) VAL t) ...) with: (while ... (cl-pushnew VAL VAR) ...) The unused Regards, |
+ Scrolling to the top of a page was occasionally cycling back to the bottom.
+ The English regex '.*back' was improperly catching strings like "background" and "back-and-forth".
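To illustrate why '.*back' over-matches, here is one possible way to tighten such a pattern: anchor it so the label may contain only "back" plus non-alphanumeric decoration. This is a sketch with hypothetical names, not necessarily the regex the commit settled on:

```elisp
;; Hypothetical regexp: the whole label must be "back", optionally
;; surrounded by non-alphanumeric decoration such as "<< " or " >".
(defconst my-w3m-back-label-regexp
  "\\`[^[:alnum:]]*back[^[:alnum:]]*\\'"
  "Regexp matching a link label that consists only of the word back.")

(defun my-w3m-back-label-p (text)
  "Return t if link TEXT is a bare back label, nil otherwise."
  (let ((case-fold-search t))
    (and (string-match-p my-w3m-back-label-regexp text) t)))
```

Under this pattern "Back" and "<< back" qualify, while "background" and "back-and-forth" do not, because letters follow the word before the end of the string.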
|
This message is really a question about w3m-forms, but let me start with
some documentation of what I'm up to in continuing to work on the
scroll-or-next/previous page.
I decided to try the feature with the duckduckgo and google search
engines. Each is a special case that needs a different coding strategy.
Duckduckgo uses HTML forms instead of 'Next' or 'Prev' links, and Google
uses a unique '<' for its previous page link, but has many '>' links,
only one of which is for its next page.
Since those two websites are so commonly used, I'm willing to make the
investment to code something just for them, but I suspect that the
techniques they use are shared by many other websites, so the code will
find more general application.
In order to write the code for the Duckduckgo case, I need to deal with
file w3m-forms.el, which isn't extensively documented, so it would make
things much easier just to ask a question (maybe only one?): I see that
there is a function `w3m-form-submit' that takes a `w3m-form' vector
object as a required argument, and I see the list of forms in variable
`w3-forms', but I don't see how I can identify a form at point. When I
evaluate text properties, I always get NIL.
The code for Google turns out to be much simpler: The URL for the
"correct" forward '>' link is the only one with a query component
'&start=' that has a value greater than zero.
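The Google heuristic just described might look something like the following sketch. The function names are hypothetical, and the input is assumed to be a list of absolute URL strings already collected from the page's '>' links. It relies only on the standard `url-parse` and `url-util` libraries:

```elisp
(require 'cl-lib)
(require 'url-parse)
(require 'url-util)

(defun my-w3m-url-start-value (url)
  "Return the numeric value of the `start' query parameter in URL, or 0."
  (let* ((query (cdr (url-path-and-query (url-generic-parse-url url))))
         ;; `url-parse-query-string' returns entries like ("start" "10").
         (value (and query
                     (cadr (assoc "start" (url-parse-query-string query))))))
    (if value (string-to-number value) 0)))

(defun my-w3m-google-next-url (urls)
  "Return the first of URLS whose `start' parameter is greater than zero."
  (cl-find-if (lambda (url) (> (my-w3m-url-start-value url) 0)) urls))
```

Because the test looks only at the URL and never at the link text, a check of this kind does not depend on the language of the page.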
|
+ Introduces a defcustom data structure w3m-page-navigation-sites to collect website urls and functions to be used to navigate their pages forward/backward.
+ Includes the pair of functions for the google-search style of page design.
+ simplify the algorithm to only look inside the url
+ now the algorithm becomes language-agnostic
…s-w3m into bb_ambiguous_navigation
+ the duckduckgo handling has a bug in that POINT of new pages is not set to POINT-MIN.
In theory, this commit should have been a separate PR from "ambiguous navigation", but the code is too interwoven, and the commit is so small. The issue is that the prior code allows scrolling to continue even when (point-max) is already visible. This makes it different from the behavior of other emacs buffers. In practice, I found this behavior especially annoying because it messed up the behavior of a third-party package 'ya-scroll' that I've started using in emacs-nox. Even without that, the behavior makes no sense.
One anomaly and inconsistency remains. The prior code calculates the scrolling extent based upon (skip-chars-backward "\t\n\r ") from POINT-MAX instead of from POINT-MAX alone. In practice, I see that pages do routinely present several blank lines at their end, and I'm not committed either way on how to handle the case.
For the line function, I used POINT-MAX, and for the paging function I retained the existing idiom, so paging down presents the 'sensible' end, and a persistent user can see the extra blank lines by manually scrolling line by line.
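The two notions of "end of page" discussed above can be sketched as below. The function names are hypothetical, and the visibility test assumes the current buffer is the one displayed in the selected window:

```elisp
(defun my-w3m-sensible-end ()
  "Return the position just after the last non-blank character in the buffer."
  (save-excursion
    (goto-char (point-max))
    ;; Same idiom as the prior code: step back over trailing whitespace.
    (skip-chars-backward "\t\n\r ")
    (point)))

(defun my-w3m-end-visible-p (&optional strict)
  "Return non-nil if the end of the buffer is already visible.
With STRICT non-nil, test `point-max' itself (the line-scrolling case);
otherwise ignore trailing blank lines (the paging case)."
  (pos-visible-in-window-p
   (if strict (point-max) (my-w3m-sensible-end))))
```

A paging command could refuse to scroll further once `(my-w3m-end-visible-p)` holds, while a line-scrolling command would use `(my-w3m-end-visible-p t)` so that the trailing blank lines remain reachable.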