HELP! Indexer skipping all files

  • HELP! Indexer skipping all files

    Hello all,

    I'm using Zoom to index a large number of legal documents. Up until yesterday, it worked beautifully. However, yesterday when I tried to reindex the folder containing a certain set of documents to include new documents, the indexer started skipping every file. When I tried to search documents through the web site search, nothing came up at all. I'm not sure what happened, but somehow the files are misplaced, no longer searchable, or there is some other explanation that my tech-challenged brain can't even conceive of. I've tried a number of things -- changing configuration settings, etc. -- but I'm finally stumped. Does anyone have any suggestions? Please help!!

    Thanks,

    KKeleher

  • #2
    If all files are skipped (and so not indexed), the index is empty, so it is expected that searches return no results.

    So you need to look at why the files are being skipped.

    Are you using Spider mode or Offline mode?

    It would help if you posted part of the index log (you can save it to a text file from the file menu in the Zoom indexer). It should contain the reason for skipping files.

    Also see these FAQs:
    Q. Why are some of my pages being skipped by the indexer?

    Q. Why are links in my Javascript menus being skipped?

    Q. I am indexing with spider mode but it is not finding all the pages on my web site



    • #3
      Thank you very much for your help. I actually reviewed those FAQs before posting, but didn't find the information very useful in my situation. I've already disabled every "skip" function I can find in the configurator, and none of the files are password protected or hidden. I'm using Offline mode. I've posted part of my index log below. The reason given seems to be that the files are blocked by the extensions list. Is this something I can fix in the configurator?

      Thanks again for the help!

      14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0074899 - 0075022\OCR\DELL_0074986.TIF (Blocked by extensions list)
      14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0074899 - 0075022\OCR\DELL_0074987.txt (Blocked by extensions list)
      14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0074899 - 0075022\OCR\DELL_0074987.TIF (Blocked by extensions list)
      14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0074899 - 0075022\OCR\DELL_0074988.txt (Blocked by extensions list)
      14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0074899 - 0075022\OCR\DELL_0074988.TIF (Blocked by extensions list)
      14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0075068 - 0075072\OCR\DELL_0075070.txt (Blocked by extensions list)
      14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0075068 - 0075072\OCR\DELL_0075070.TIF (Blocked by extensions list)
      14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0075068 - 0075072\OCR\DELL_0075071.txt (Blocked by extensions list)
      14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0075068 - 0075072\OCR\DELL_0075071.TIF (Blocked by extensions list)
      ...
      14:16:13 - [FILEIO] Writing index data for CGI/Win32 search... (Please wait)
      14:16:13 - [FILEIO] Created pagedata data file (zoom_pagedata.zdat)
      14:16:13 - [FILEIO] Created pagetext data file (zoom_pagetext.zdat)
      14:16:13 - [FILEIO] Created pageinfo data file (zoom_pageinfo.zdat)
      14:16:14 - [FILEIO] Created dictionary data file (zoom_dictionary.zdat)
      14:16:14 - [FILEIO] Created wordmap data file (zoom_wordmap.zdat)
      14:16:14 - [FILEIO] Created script settings file (settings.zdat)
      14:16:14 - Indexing completed at Thu Aug 14 14:16:14 2008
      14:16:14 - INDEX SUMMARY
      14:16:14 - Files indexed: 3
      14:16:14 - Files skipped: 34866
      14:16:14 - Files filtered: 0
      14:16:14 - Files downloaded: 0
      14:16:14 - Unique words found: 526
      14:16:14 - Total words found: 3
      14:16:14 - Avg. unique words per page: 175.33
      14:16:14 - Avg. words per page: 1
      14:16:14 - Start index time: 14:15:44 (2008/08/14)
      14:16:14 - Elapsed index time: 00:00:30
      14:16:14 - Errors: 0
      14:16:14 - Total bytes scanned/downloaded: 12
      14:16:14 - File extensions:
      14:16:14 - .php indexed: 0
      14:16:14 - .asp indexed: 0
      14:16:14 - .cgi indexed: 3
      14:16:14 - .aspx indexed: 0
      14:16:14 - .pl indexed: 0
      14:16:14 - .php3 indexed: 0
      14:16:14 - No extensions indexed: 0
      14:16:14 - Cleaning up memory used for index data... please wait.
      14:16:14 - Finished cleaning up memory.



      • #4
        Take a look at the Scan Options tab in the Configuration. Make sure you have all the extensions listed that you want ZoomSearch to index. If the file extension isn't on the list, that file will be skipped.
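
        With tens of thousands of skipped files, it can help to summarize a saved index log and see which extensions are being blocked before editing the list. Below is a minimal Python sketch, not part of Zoom itself, that assumes the log lines look like the "[SKIPPED] Skipping <path> (<reason>)" entries posted above. The function name tally_skipped and the sample list are made up for illustration.

```python
import re
from collections import Counter
from pathlib import PureWindowsPath

# Assumed line format, taken from the log excerpt in this thread:
# 14:16:13 - [SKIPPED] Skipping E:\...\DELL_0074986.TIF (Blocked by extensions list)
SKIP_LINE = re.compile(r"\[SKIPPED\] Skipping (?P<path>.+) \((?P<reason>[^()]*)\)\s*$")


def tally_skipped(log_lines):
    """Count skipped files per (lowercased extension, skip reason)."""
    counts = Counter()
    for line in log_lines:
        match = SKIP_LINE.search(line)
        if match is None:
            continue  # not a skip entry (e.g. [FILEIO] or summary lines)
        # PureWindowsPath parses backslash paths on any operating system.
        ext = PureWindowsPath(match.group("path")).suffix.lower()
        counts[(ext, match.group("reason"))] += 1
    return counts


# A few lines copied from the posted log, used here as sample input.
sample_log = [
    r"14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0074899 - 0075022\OCR\DELL_0074986.TIF (Blocked by extensions list)",
    r"14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0074899 - 0075022\OCR\DELL_0074987.txt (Blocked by extensions list)",
    r"14:16:13 - [SKIPPED] Skipping E:\OCR Projects\Abstrax\Dell Docs\DELL 0074899 - 0075022\OCR\DELL_0074987.TIF (Blocked by extensions list)",
    "14:16:13 - [FILEIO] Created pagedata data file (zoom_pagedata.zdat)",
]

for (ext, reason), count in sorted(tally_skipped(sample_log).items()):
    print(f"{ext or '(no extension)'}: {count} skipped ({reason})")
```

        To run it on a real log, pass an open file instead of the sample list, for example tally_skipped(open("indexlog.txt")), where indexlog.txt is a placeholder for whatever name the log was saved under. In the excerpt posted above, both .txt and .TIF files show up as blocked, so both would need to be on the scan extensions list.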

        Andrew



        • #5
          Andrew's spot on.

          My guess is that you added the ".TIF" extension at one point and then forgot to save your configuration changes, so when you came back later to re-index, the ".TIF" extension was no longer on the list. Make sure you add it to your scan extensions, and make sure you save your configuration after the change.
          --Ray
          Wrensoft Web Software
          Sydney, Australia
          Zoom Search Engine
