500000 words limit


  • 500000 words limit

    First question:
    Are all the (500000) words stored in the file "zoom_dictionary.zdat"?

    Second question:
    I have "words" like
    281624 68698270
    AV21100QP50 68698284
    and
    AXXDVDCDR 68152578
    and
    Zongangeavomes 75238564
    fmupSuiguinil 75238588

    What is the number after the words and what does "-1" mean?

    How to limit unique words to only real words? Is it possible to not index numbers? Can I extract the redundant words and put them in the Skip options?

    Any other suggestions for limiting the number of unique words?
    Last edited by Lars; May-26-2008, 09:49 PM. Reason: adding text

  • #2
    The zoom_dictionary.zdat file contains the words found by the indexer, plus other details to support various features. The large numbers after the words are record pointers which allow fast searching for the word in other files. This is really an internal file and you shouldn't need to understand the entire content of the file.

    How to limit unique words to only real words?
    What is a real word? And what language are you referring to?

    If you are trying to remove hundreds of thousands of words from the dictionary, skipping them one at a time isn't an efficient way to do it. It is better to understand where the words are coming from and remove that content source.
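One way to get a rough picture of which kinds of "words" dominate before hunting for their source is to bucket them by shape. The following is only a sketch: it assumes you have lines that look like the ones quoted in the question (a word followed by a number), and the function names `first_field`, `classify_token` and `summarize` are made up for this example. The real layout of zoom_dictionary.zdat is internal and may differ.

```python
import re
from collections import Counter


def first_field(line: str) -> str:
    """Return the first whitespace-separated field of a line.

    Assumption: each line looks like 'WORD NUMBER', as in the
    examples quoted in the question.
    """
    parts = line.split()
    return parts[0] if parts else ""


def classify_token(token: str) -> str:
    """Put a token into a rough bucket based on its characters."""
    if token.isdigit():
        return "numeric"
    if re.search(r"\d", token) and re.search(r"[A-Za-z]", token):
        return "alphanumeric code"
    if token.isalpha():
        return "alphabetic"
    return "other"


def summarize(lines):
    """Count how many tokens fall into each bucket."""
    tokens = (first_field(line) for line in lines)
    return Counter(classify_token(t) for t in tokens if t)


# Sample lines copied from the question above.
sample_lines = [
    "281624 68698270",
    "AV21100QP50 68698284",
    "AXXDVDCDR 68152578",
    "Zongangeavomes 75238564",
    "fmupSuiguinil 75238588",
]

counts = summarize(sample_lines)
for bucket, n in counts.most_common():
    print(bucket, n)
```

If one bucket (say, numeric or part-number-like tokens) accounts for most of the unique words, that points to a particular kind of content, such as product listings or file names, which is easier to exclude as a whole than word by word.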

    If you want to know where a word like "AV21100QP50" was found, do a search for that word using Zoom. It might be a document name, e.g. AV21100QP50.PDF

    If you post the URL for your search function we might be able to comment more.
