Stemming now added to V6!

  • David
    You can find system requirements here

  • David
    We only have two paid-up customers in Poland, so it doesn't make sense to spend any time on Polish stemming.
    We could look at it as a paid consulting job if it was important for a particular customer.

  • Bruno
    No Polish? Does anyone know when Polish will be added?

  • kaufenpreis
    Ray, thank you for your detailed answer.

  • Ray
    We briefly investigated this. The problem with Polish is that you cannot have a stemming method that is entirely algorithm-based -- unlike all of the above-mentioned (and already implemented) languages.

    Instead, Polish would require a dictionary or lookup table, so extra data would be needed for the stemming to be effective. This would alter the requirements of the search script (e.g. number of files, space requirements, etc.) and isn't something we can add easily without bigger changes.
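
    To make the difference concrete, here is a minimal sketch of what a lookup-table approach looks like. It is only an illustration, not anything the search script contains: `LEMMA_TABLE` and `dictionary_stem` are hypothetical names, and the handful of Polish word forms stands in for the very large table a real implementation would need to ship as extra data.

```python
# Hypothetical lookup table mapping inflected Polish forms to a base form.
# Assumption: a real table would be far larger and stored in separate data files.
LEMMA_TABLE = {
    "psa": "pies",   # forms of "pies" (dog); the stem itself changes
    "psy": "pies",
    "psem": "pies",
    "ręce": "ręka",  # forms of "ręka" (hand)
    "rąk": "ręka",
}


def dictionary_stem(word):
    """Return the base form from the table, or the word unchanged if it is not listed."""
    w = word.lower()
    return LEMMA_TABLE.get(w, w)
```

    A form such as "psa" shares almost nothing with "pies" that a suffix rule could strip, which is why the table is needed. Any word missing from the table simply comes back unchanged, so the quality of the stemming depends directly on how much data is shipped.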

  • kaufenpreis
    I would like to see Polish stemming too. Please add it.

  • paul89
    No Polish? Does anyone know when Polish will be added?

  • Ray
    Stemming works for Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Porter, Portuguese, Romanian, Russian, Spanish, Swedish, Turkish.

  • noface0711
    I would like to know which 16 languages stemming works for. I couldn't find the list in the article. I think it's a great feature.

  • Ray
    There are no plurals and no past or present tense forms in Chinese, and these are the main issues that stemming addresses.
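
    As a rough illustration of what that means, here is a minimal suffix-stripping sketch in Python. It is a toy, not the stemmer the search script uses; `toy_stem`, `matches` and the suffix list are made up for the example.

```python
# Toy suffix-stripping stemmer: an illustration only, not the product's algorithm.
SUFFIXES = ("ing", "ed", "es", "s")  # assumed suffix list, checked in this order


def toy_stem(word):
    """Strip one common English suffix, keeping at least three letters of stem."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def matches(query, indexed_word):
    """Treat two words as a match when their stems are equal."""
    return toy_stem(query) == toy_stem(indexed_word)
```

    With this toy, `matches("walk", "walked")` and `matches("cat", "cats")` are both true, because the tense and plural endings are stripped before comparing. A Chinese word such as 搜索 has no such ending to strip and comes back unchanged, so there is nothing for the stemmer to do.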

  • David
    The whole concept of Chinese language stemming doesn't really make sense (as far as I know).

  • noface0711
    Does that mean that the stemming function does not work for Chinese as well?

  • David
    V6 is now old. In case you didn't notice, this post was started 2 years ago.

    You can add your own language translations from the languages configuration window.

  • vinahost
    Dear admin,
    I would love to use V6, but I need some support with it.
    Could you please tell me whether V6 supports the Vietnamese language?

  • Ray
    Yes, it is mentioned under "Known limitations" here:

    There is no practical or efficient way to get the "jump to highlighting" script (a small piece of JavaScript that runs on each of your content pages) to perform stemming, or to pass it a list of all the matches found on that page. To do so we would have to blow out the complexity of the highlighting script: it would need its own stemming algorithm and would have to perform a stemming comparison against every word on the page. That is really not wise, considering the script is integrated into every page on so many different websites, many of which already run a lot of other JavaScript that it could conflict with, not to mention the added execution time.

    As noted, we simply cannot highlight every occurrence that the actual indexing and matching algorithm considers a match (that algorithm is much more complicated and has far more resources available, and it also handles synonyms, diacritics, etc.) without a cost elsewhere: script complexity, more integration problems, a larger download per page and therefore slower access times, and so on.

    So the "jump to highlighting" script highlights some of the occurrences; it is not an accurate representation of everything that has been matched.
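
    The gap between the two can be sketched roughly as follows. This is written in Python for brevity (the real highlighting script is JavaScript), and it assumes the highlighter compares whole words literally; `toy_stem`, `index_matches` and `highlighted` are hypothetical names, and the toy stemmer is not the one the search script uses.

```python
import re

SUFFIXES = ("ing", "ed", "es", "s")  # assumed toy suffix list


def toy_stem(word):
    """Strip one common English suffix, keeping at least three letters of stem."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def index_matches(query, page_text):
    """Words a stemming search would count as matches for the query."""
    words = re.findall(r"\w+", page_text.lower())
    return [w for w in words if toy_stem(w) == toy_stem(query)]


def highlighted(query, page_text):
    """Words a literal, non-stemming highlighter would mark (assumed behaviour)."""
    words = re.findall(r"\w+", page_text.lower())
    return [w for w in words if w == query.lower()]


page = "Searching is easy: search once, and the searches are logged."
print(index_matches("search", page))  # ['searching', 'search', 'searches']
print(highlighted("search", page))    # ['search']
```

    The search reports three matching words on the page, while the literal highlighter marks only one of them. Closing that gap would mean running something like `toy_stem` over every word of every page in the visitor's browser, which is the cost described above.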
