I just updated to version 6. One issue I am finding with the stemming feature concerns highlighting. If I search for a word like "domain", I get pages containing both "domain" and "domains", which is expected. However, if I click a result that takes me to a page containing "domains", nothing is highlighted on the page. Has anyone else mentioned this? The only way I have found to get highlighting on the page is to disable stemming, which defeats the purpose.
Thanks.
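To illustrate the behaviour being asked for, here is a minimal sketch of stem-aware highlighting. It is not Zoom's code: `naive_stem`, `highlight`, the suffix list and the `<mark>` tags are all hypothetical, and the suffix stripping is a toy stand-in for a real stemming algorithm. The idea is simply that the highlighter compares stems rather than exact words, so a search for "domain" also marks "domains".

```python
import re

# Toy suffix list (an assumption for illustration, not a real stemmer).
SUFFIXES = ("ing", "es", "er", "s")


def naive_stem(word):
    """Strip one common English suffix, keeping at least 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def highlight(text, query, open_tag="<mark>", close_tag="</mark>"):
    """Wrap every word whose stem equals the stem of the query."""
    target = naive_stem(query)

    def wrap(match):
        word = match.group(0)
        if naive_stem(word) == target:
            return f"{open_tag}{word}{close_tag}"
        return word

    return re.sub(r"[A-Za-z]+", wrap, text)


print(highlight("Our domains and one domain.", "domain"))
```

An exact-match highlighter would mark only "domain" in that sentence, which would explain a page that mentions only "domains" showing no highlighting at all.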
Stemming now added to V6!
-
There is no stemming functionality for Chinese, and linguistically I don't see how it would work either. There are no plural or singular forms of words, nor are there present and past tenses, in Chinese and most Asian languages that we are aware of.
-
Does that mean that the stemming function does not work for Chinese either?
-
Russian is supported with a few minor exceptions. See:
http://www.wrensoft.com/zoom/support/languages.html
For Russian stemming you need to use the CGI option.
-
They are listed in the languages window in the Zoom configuration. (You need to select the CGI script option first, however.)
-
I would like to know which 16 languages stemming works for. I couldn't find this in the article. I think it's a great feature.
-
The stemming algorithm is very language-dependent. It doesn't make sense for most Asian languages, which lack linguistic concepts such as plurals or verb tenses.
-
Stemming and single-case languages
I notice that stemming is disabled when "support for single-case languages (ie asian)" is enabled.
Is this intentional? I can't use both?
Thanks
-
Stemming now added to V6!
We can now confirm that V6 will feature STEMMING.
This is a much-requested feature. When it is enabled, search results will match similar words or words that are derivatives of each other (e.g. plurals). For example, searching for the word "fish" will return pages containing the variants "fish", "fishes", "fishing", etc.
Adding this feature required some significant changes to the index file format and the way we index and search words, but we are glad to see that the end results seem to be worth the effort.
Stemming will not be available for JavaScript. The PHP and ASP scripts will only support English stemming, while the CGI version features improved stemming and also stemming support for 16 languages.
The feature will be enabled by default in V6, but you may want to turn it off if, for example, it is absolutely critical that your website differentiates between "booking", "booker", "book", etc.
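The matching behaviour described above, and the effect of turning it off, can be sketched roughly as follows. This is only an illustration under stated assumptions: `naive_stem`, `build_index`, `search`, the `stemming` flag and the sample pages are hypothetical names and data, and the suffix stripping is a toy stand-in, not the algorithm Zoom uses or its index file format.

```python
import re

# Toy suffix list (an assumption for illustration, not a real stemmer).
SUFFIXES = ("ing", "es", "er", "s")


def naive_stem(word):
    """Strip one common English suffix, keeping at least 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(pages, stemming=True):
    """Map each (optionally stemmed) word to the set of pages containing it."""
    index = {}
    for name, text in pages.items():
        for token in re.findall(r"[a-z]+", text.lower()):
            key = naive_stem(token) if stemming else token
            index.setdefault(key, set()).add(name)
    return index


def search(index, query, stemming=True):
    """Look up the query using the same normalisation as the index."""
    key = naive_stem(query) if stemming else query.lower()
    return sorted(index.get(key, set()))


# Hypothetical sample pages.
PAGES = {
    "river.html": "Fishing on the river",
    "market.html": "Fresh fishes for sale",
    "hotel.html": "Booking a room",
    "library.html": "Borrow a book",
}

stemmed = build_index(PAGES, stemming=True)
exact = build_index(PAGES, stemming=False)

print(search(stemmed, "fish"))                 # matches "fishing" and "fishes"
print(search(stemmed, "book"))                 # matches "booking" and "book"
print(search(exact, "book", stemming=False))   # matches "book" only
```

With stemming off, "book" and "booking" stay distinct, which is the case where disabling the feature makes sense. Note that the words are stemmed at indexing time as well as at search time, which is consistent with the feature requiring index changes.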
More information on V6 here.