Indexing PDFs and Word Docs - Advice

  • Ray
    replied
    You can send your contact files to zoom [at] wrensoft (dot) com.

    Please reference this thread in your email.

    Is the problem occurring for both your CD search and your web site search? Are you using spider mode or offline mode for your web site search?


  • jnull
    replied
    Ray, I would be happy to send you my config files. Where should I send them? I have two versions of this search: one for a CD, which sits in a separate folder on my local drive and gets burnt to the CD, and one for my web site, which is on another drive and is uploaded to our web server.

    Originally posted by Ray
    I searched for the word "test" and I see that there are duplicate URLs. This is unusual; it shouldn't normally happen.

    Can you tell us:
    a) Whether you have modified "search.php", "settings.php", or any of the .zdat index files generated by Zoom.
    b) Whether you are mixing files from different indexing sessions. Note that "search.php" and "settings.php" are essential parts of an index, and files from one session cannot be mixed with those from another.
    c) Whether you are using Offline mode or Spider mode. Please also send us a copy of your .zcfg configuration file with your indexing configuration so we can take a closer look.


  • kpa
    replied
    Complete stab in the dark here, but I'm assuming you have several hundred files at least. I had an issue drawn to my attention the other day on our Intranet: there appeared to be several instances of the same file. Visually the file references looked identical, which was odd because the Intranet is configured to treat a file with the same name as a new version of the existing file. On very close examination, some file names had an additional space or two in them. With proportional fonts, it was almost impossible to pick up with the naked eye.
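
    If you want to check for that without eyeballing every name, here is a rough Python sketch. It has nothing to do with Zoom itself; it just walks a folder and groups files whose names become identical once runs of whitespace are collapsed and case is ignored. The function names and the folder path in the usage line are made up for illustration, and treating case as insignificant is my own assumption.

```python
import os
import re
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Collapse whitespace runs to a single space, trim the ends, and lowercase."""
    return re.sub(r"\s+", " ", name).strip().lower()


def find_near_duplicate_names(root: str) -> dict:
    """Return {normalized name: [paths]} for names that collide after normalization."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            groups[normalize_name(filename)].append(os.path.join(dirpath, filename))
    # Keep only the names shared by more than one file.
    return {name: sorted(paths) for name, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    # Hypothetical folder; point this at the directory you are indexing.
    for name, paths in find_near_duplicate_names("docs").items():
        print(name)
        for path in paths:
            print("   ", repr(path))  # repr() makes stray spaces visible
```

    Printing with repr() puts quotes around each path, so an extra space shows up clearly. Note that this only catches look-alike names; it will not spot identical files saved under genuinely different names.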


  • Ray
    replied
    I searched for the word "test" and I see that there are duplicate URLs. This is unusual; it shouldn't normally happen.

    Can you tell us:
    a) Whether you have modified "search.php", "settings.php", or any of the .zdat index files generated by Zoom.
    b) Whether you are mixing files from different indexing sessions. Note that "search.php" and "settings.php" are essential parts of an index, and files from one session cannot be mixed with those from another.
    c) Whether you are using Offline mode or Spider mode. Please also send us a copy of your .zcfg configuration file with your indexing configuration so we can take a closer look.


  • jnull
    started a topic Indexing PDFs and Word Docs - Advice

    Hi - I am using 7.0 build 1010.

    At www.chenetwork.org/dvd
    user: CHEDVD
    pass: Access2015

    We have nearly 7 GB of PDFs and some Word docs, which I've indexed. Our problem is that searches return multiple, repetitive results for the same file. I'm looking for ways to reduce this, and for suggestions on best practices when indexing such a large body of PDFs and docs.

    Thank you!