Some Docs not indexing part of the name


  • Some Docs not indexing part of the name

    The problem is difficult to explain, but I'll give it my best effort.

    I have a folder that holds .doc files. The format for all documents in this folder is: IBM Server Refresh Status_1020.doc where the "1020" changes from document to document (it represents the month and the day).

    Problem #1: If I search for "1020", it does not find the document.
    Problem #2: If I search for "IBM Server Refresh Status_1020.doc", it finds the document, but it is about 8th in the list. The entire document name is highlighted, and the documents that appear before it have less highlighted than this document.

    Example when searching for "IBM Server Refresh Status_1020.doc" (bold is what is highlighted)

    1. 06 Chg Mgmt Key Indicator Apr.xls
    ... membership for compliance with IBM management and ITAN project and software ... FTT_NORMAL Installation of ITAM Server on the following servers: BLDSUN64 ... DMX3000 for their host refresh. 131221 38808.25*CATEGORY 3 ...
    Terms matched: 3 - Score: 5946 - 9 May 2006 - URL: http://usoperations/KIR/Apr06/06 Chg Mgmt Key Indicator Apr.xls

    5. IBM Server Refresh Status_1103.doc
    ... IBM Server Refresh Activities: 11 / 03 / 06 ... Executive Summary: Behind Schedule / Behind Plan This report reflects ... IBM Server Refresh Activities: 11 / 03 / 06 ...
    Terms matched: 3 - Score: 4336 - 3 Nov 2006 - URL: http://usoperations/Project_Mgmt_Reports/Key Focus Projects/Project Status Week Ending 110306/IBM Server RefreshStatus_1103.doc

    8. IBM Server Refresh Status_1020.doc
    ... IBM Server Refresh Activities: 10 / 20 / 06 ... Executive Summary: Behind Schedule / Behind Plan This report reflects ... IBM Server Refresh Activities: 10 / 20 / 06 ...
    Terms matched: 3 - Score: 3824 - 19 Oct 2006 - URL: http://usoperations/Project_Mgmt_Reports/Key Focus Projects/Project Status Week Ending 102006/IBM Server Refresh Status_1020.doc

    As you can see, the scoring seems off. I would expect #8 to show up first.
    Also, a search on "1020" should return this document, but it doesn't.

    I've tried unchecking all the boxes that join words (i.e. dots, dashes, underscores, etc.) but this only seems to make matters worse.

    HELP!!!

  • #2
    Having a good file name match doesn't automatically mean the document is most relevant to the search terms. The scoring depends on many factors, not just the file name.

    However, you can increase the weight given to file names from the Zoom configuration window.

    You could also do an exact phrase match. i.e. search for,
    "IBM Server Refresh Status_1020.doc"
    instead of
    IBM Server Refresh Status_1020.doc
    This should solve the problem.

    You also didn't say whether you were doing a Boolean OR search or an AND search. In this case the AND search (all words) will be more effective.
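
To illustrate the difference between the two modes, here is a minimal Python sketch. It is not how Zoom is implemented; the `matches` function and the two word lists are hypothetical, made up only to show why an OR search can return documents that contain just some of the words while an AND search requires all of them.

```python
def matches(query_words, doc_words, mode="and"):
    """Return True if the document satisfies the query in the given mode."""
    doc = {w.lower() for w in doc_words}
    hits = [w.lower() in doc for w in query_words]
    # AND: every query word must be present. OR: one word is enough.
    return all(hits) if mode == "and" else any(hits)

# Hypothetical indexed words for two documents (not real index data).
spreadsheet = ["ibm", "management", "server", "refresh"]
status_doc = ["ibm", "server", "refresh", "status_1020", "doc"]

query = ["IBM", "Server", "Refresh", "Status_1020"]

print(matches(query, spreadsheet, mode="or"))   # spreadsheet matches some words
print(matches(query, spreadsheet, mode="and"))  # but not all of them
print(matches(query, status_doc, mode="and"))   # status document has every word
```

With an OR search both documents qualify, so the ranking decides the order. With an AND search only the document containing every word is returned.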

    • #3
      Makes sense. However, when I do an "OR" search, the results are as shown in the example above. If I perform an AND search for "Sev3 Auto Ticketing Status_1110.doc" (with or without the quotes), I get 'no results'.

      • #4
        The text, Status_1110.doc, could be interpreted 4 ways, depending on the options you select in the Zoom configuration window.

        1) It could be a single word,
        Status_1110.doc

        2) It could be 2 words
        Status_1110
        doc

        3) It could be 2 words like this
        Status
        1110.doc

        4) Or it could be 3 words.
        Status
        1110
        doc

        So you need to decide how dots (.) and underscores (_) are best treated for your site.
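
The four interpretations above can be reproduced with a small Python sketch. This is only an illustration under assumptions, not Zoom's actual word-splitting code: the `tokenize` function and its `join_chars` parameter are hypothetical stand-ins for the word-join options in the configuration window.

```python
import re

def tokenize(text, join_chars=""):
    """Split text into words; characters in join_chars stay inside a word."""
    # Letters, digits, and the configured join characters count as word characters.
    word_chars = "A-Za-z0-9" + re.escape(join_chars)
    return re.findall(f"[{word_chars}]+", text)

name = "Status_1110.doc"

# One line per combination of join characters: both, underscore only,
# dot only, and neither.
for join_chars in ["._", "_", ".", ""]:
    print(repr(join_chars), tokenize(name, join_chars))

# A search for "1110" can only match when "1110" is a word of its own.
print("1110" in tokenize(name, "._"))
print("1110" in tokenize(name, ""))
```

In this sketch, a search for the number alone only has a chance of matching when neither the dot nor the underscore joins words, which is why the choice of join characters matters for file names like these.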

        You also need to ensure that you have enabled the indexing of file names, which is off by default. (In fact, I think this is the most likely explanation for the zero results.)
