Help with Acronyms and Stemming Problems


  • Help with Acronyms and Stemming Problems

    We are in the process of upgrading our search engine to Zoom version 6. The introduction of stemming in the new release is, on the whole, an excellent feature; however, it is causing some issues when searching for acronyms.

    As with the previous version, we have included words like 'it' and 'at' in the skip list.
    Previously, an acronym such as 'ATS' or 'ATE' was found by the Zoom search. With the introduction of stemming, it is no longer found (presumably because the acronym is stemmed to a word in the skip list).

    Having tested this on both versions 5 and 6, I now recognize that searches for acronyms such as IT (as in 'IT projects') return nothing in either version of Zoom.

    We tried removing 'at' and 'it' from the skip list, but this increased the search file sizes and made the results for 'ATS' or 'ATE' unusable, as they were buried among a large number of results for 'at'.

    We also tried searching for "IT projects", and again, due to stemming, the results included phrases such as "all its projects" as well as "IT project".

    Is there any way round this? I couldn't find any configurable options for case sensitivity.

    We are using Zoom in 'Offline mode' on an ASP platform, using IE 6 to test.

    Apologies if this is already discussed in another post. I did a quick search and was unable to find anything similar.

    Thanks

    Andy

  • #2
    Yes, this is one issue with stemming that was always going to be challenging. We are considering adding the ability to specify a list of words to exclude from stemming, but we think it might be tricky for most users (as stemming is not an obvious concept). Having this would allow you to simply add acronyms to a list that tells Zoom not to stem those words. It's on our list of things to consider for V6.1.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine
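
    For anyone following the thread, here is a rough Python sketch of the interaction described above and of how a "do not stem" list could help. It is not Zoom's code or its stemming algorithm: `toy_stem`, `index_term`, `SKIP_WORDS` and `NO_STEM` are hypothetical names, and the suffix stripper is a deliberately crude stand-in for a real stemmer. It assumes the exclusion list is matched case-sensitively, which is what would keep 'IT' apart from 'its'.

```python
# Hypothetical skip list, as described in the first post.
SKIP_WORDS = {"it", "at"}
# Hypothetical case-sensitive list of words that must not be stemmed.
NO_STEM = {"ATS", "ATE", "IT"}


def toy_stem(word: str) -> str:
    """Crude suffix stripper; a stand-in for a real stemmer."""
    w = word.lower()
    for suffix in ("es", "s", "e"):
        # Only strip if at least two characters remain.
        if w.endswith(suffix) and len(w) > len(suffix) + 1:
            return w[: -len(suffix)]
    return w


def index_term(word: str, protect_acronyms: bool):
    """Return the term to index, or None if the word is skipped."""
    if protect_acronyms and word in NO_STEM:
        return word  # keep the exact, case-sensitive form
    term = toy_stem(word)
    return None if term in SKIP_WORDS else term


for w in ("ATS", "ATE", "IT", "its", "projects"):
    print(w, index_term(w, protect_acronyms=False), index_term(w, protect_acronyms=True))
```

    Without the exclusion list, 'ATS' and 'ATE' both reduce to 'at' and are dropped by the skip list. With it, they are indexed as exact terms, while 'its' is still stemmed to 'it' and skipped, so it no longer collides with 'IT'.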
