
Show PHP files first and PDF files second?

  • Show PHP files first and PDF files second?

    Hello, I am trying to figure out how to show PHP files in the results BEFORE PDF files. My results are simply flooded with PDF files and are not very accurate at all. I've tried mucking around with the +1 and -1 attributes, to no avail.

    I have also attached my index summary. Is there any way to get it to index faster with the same accuracy?

    18:12:25 - INDEX SUMMARY
    18:12:26 - Files indexed: 628
    18:12:27 - Files skipped: 2442
    18:12:27 - Files filtered: 0
    18:12:27 - Files downloaded: 638
    18:12:27 - Unique words found: 72374
    18:12:27 - Total words found: 3108405
    18:12:29 - Avg. unique words per page: 115.25
    18:12:30 - Avg. words per page: 4949
    18:12:33 - Start index time: 13:37:04 (2008/05/01)
    18:12:35 - Elapsed index time: 04:35:21
    18:12:36 - Errors: 18
    18:12:38 - URLs visited by spider: 646
    18:12:41 - URLs in spider queue: 0
    18:12:41 - Total bytes scanned/downloaded: 385631231
    18:12:43 - File extensions:
    18:12:45 - .htm indexed: 0
    18:12:46 - .html indexed: 0
    18:12:46 - .php indexed: 134
    18:12:47 - .pdf indexed: 491
    18:12:47 - .xls indexed: 0
    18:12:47 - .ppt indexed: 0
    18:12:47 - .doc indexed: 2
    18:12:47 - Cleaning up memory used for index data... please wait.
    18:13:08 - Finished cleaning up memory.
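
For reference, the figures in the summary above can be tallied with a short script. This is only a sketch: the function names are made up for illustration, and `SUMMARY` is just the relevant lines copied from the log with the timestamps removed. It shows how heavily PDFs dominate the index and what the overall download/scan rate was; it does not say anything about why indexing took as long as it did.

```python
import re

# Relevant lines copied from the index summary (timestamps stripped).
SUMMARY = """\
Files indexed: 628
Elapsed index time: 04:35:21
Total bytes scanned/downloaded: 385631231
.htm indexed: 0
.html indexed: 0
.php indexed: 134
.pdf indexed: 491
.xls indexed: 0
.ppt indexed: 0
.doc indexed: 2
"""


def extension_counts(summary):
    """Return {extension: count} parsed from '.ext indexed: N' lines."""
    pairs = re.findall(r"^(\.\w+) indexed: (\d+)$", summary, re.M)
    return {ext: int(count) for ext, count in pairs}


def extension_shares(counts):
    """Return each non-empty extension's fraction of the per-extension total."""
    total = sum(counts.values())
    return {ext: n / total for ext, n in counts.items() if n}


def elapsed_seconds(summary):
    """Convert the 'Elapsed index time: HH:MM:SS' line to seconds."""
    match = re.search(r"Elapsed index time: (\d+):(\d+):(\d+)", summary)
    hours, minutes, seconds = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds


def bytes_per_second(summary):
    """Average scan/download rate over the whole indexing run."""
    match = re.search(r"Total bytes scanned/downloaded: (\d+)", summary)
    return int(match.group(1)) / elapsed_seconds(summary)


if __name__ == "__main__":
    shares = extension_shares(extension_counts(SUMMARY))
    for ext, share in sorted(shares.items(), key=lambda item: -item[1]):
        print(f"{ext}: {share:.1%}")
    print(f"average rate: {bytes_per_second(SUMMARY) / 1024:.1f} KiB/s")
```

On these numbers, PDFs make up roughly 78% of the files counted by extension, which fits the "flooded with PDF files" symptom.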

  • #2
    You should take a look at the "Content density" weighting option. This can be found on the "Weightings" tab of the Configuration window, and it is designed to prevent the "swamping" effect of having large PDF documents always appearing at the top of your results.

    Here's the description from the FAQ page on page scores and rankings in Zoom:

    You can also specify an adjustment method for Content density. This is an automatic weighting adjustment that is made by the Indexer, based on the word density of the page. With "Standard adjustment", the weighting of words found in a large file (such as a 50+ page PDF document) will be lowered so as to prevent such files from swamping the results and always being considered the most relevant. This will effectively give preference to small and medium sized documents. "Strong adjustment" provides an even greater level of scaling, and "No adjustment" would disable this feature so that all files are treated equally.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine
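
To make the content density idea from the FAQ excerpt above more concrete, here is a toy sketch of how such an adjustment could change the ordering of results. Zoom's actual formula is not given in this thread, so the scaling below (square root for "standard", linear for "strong") is purely an assumption for illustration, as are the function names, the two example pages, and their word and match counts. Only the average of 4949 words per page comes from the index summary.

```python
from math import sqrt

AVG_WORDS = 4949  # "Avg. words per page" from the index summary in the first post


def density_factor(word_count, mode="standard", avg_words=AVG_WORDS):
    """Hypothetical down-weighting for pages longer than the average page.

    This is NOT Zoom's real formula, only an illustration of the idea:
    the longer the file, the less each matched word counts.
    """
    if mode == "none" or word_count <= avg_words:
        return 1.0  # small and medium pages are left untouched
    ratio = avg_words / word_count  # below 1.0 for large files
    if mode == "standard":
        return sqrt(ratio)
    if mode == "strong":
        return ratio  # scales down harder than "standard"
    raise ValueError(f"unknown mode: {mode}")


def rank(pages, mode):
    """Sort pages by (match count x density factor), best first."""
    def score(page):
        return page["matches"] * density_factor(page["words"], mode)
    return sorted(pages, key=score, reverse=True)


# Made-up example pages: a long PDF with many matches, a short PHP page with fewer.
pages = [
    {"name": "manual.pdf", "words": 50000, "matches": 12},
    {"name": "product.php", "words": 800, "matches": 8},
]

if __name__ == "__main__":
    for mode in ("none", "standard", "strong"):
        print(mode, [page["name"] for page in rank(pages, mode)])
```

With no adjustment the long PDF wins on raw match count alone; with either adjustment in this sketch, the short PHP page moves ahead of it, which is the effect the weighting option is meant to produce.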
