[htdig3-dev] Extra word-characters attribute: extra_word_characters


Hans-Peter Nilsson (hp@bitrange.com)
Thu, 11 Mar 1999 21:25:11 -0500 (EST)


I plan to add a new attribute: extra_word_characters.
It is roughly the opposite of valid_punctuation: it marks a (possibly)
non-alphanumeric character as a valid word character.

This way (and no other I know of), I can make "_" characters part of
words, and searchable as such.

A (hopefully) positive side effect: people who have trouble getting their
systems to honor their locale (i.e. the system is broken in that it treats
everything as the "C" locale) can list here the characters that the
locale would normally handle.

Examples:
 extra_word_characters: _
 extra_word_characters: "åäöÅÄÖ"

(If you didn't get the last one, don't worry.)
Specifying characters that the locale already classifies as alphabetic
(isalpha) would be a no-op.
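
To make the intended semantics concrete, here is a rough sketch of how the
word-character test could look. This is not actual ht://Dig code: the names
is_word_char and split_words are hypothetical, and the sketch assumes
single-byte characters (e.g. ISO-8859-1) and that the attribute value has
already been read into a string.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not from ht://Dig): a byte counts as a word
// character if the current locale says it is alphanumeric, or if it is
// listed in `extra` (the value of extra_word_characters).
bool is_word_char(unsigned char c, const std::string& extra) {
    return std::isalnum(c) != 0 ||
           extra.find(static_cast<char>(c)) != std::string::npos;
}

// Hypothetical tokenizer: split `text` into maximal runs of word characters.
std::vector<std::string> split_words(const std::string& text,
                                     const std::string& extra) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (is_word_char(c, extra)) {
            current.push_back(static_cast<char>(c));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}
```

With an empty `extra`, "foo_bar" splits at the underscore; with `extra` set
to "_" it stays one word. Listing a character the locale already treats as
alphanumeric changes nothing, since the isalnum test succeeds first.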

Comments welcome.

brgds, H-P

