Indexing stops at same point

  • Indexing stops at same point

    I'm using Zoom Search Engine Professional 4.0.1061 and I'm trying to index a user's folder full of PDF, TIF, DOC, XLS, PPT, and other files. The user has 11,120 files in 704 folders. When I start indexing, everything works fine until it stops at the same point every time (which, incidentally, is about 3-5 minutes into the indexing).

    I'm running Zoom Search Engine on a P4 3.0 GHz Windows 2000 Server machine with 1 GB of RAM and approximately 500+ GB of storage. The settings I'm using in Zoom are as follows:

    Max. pages to scan 20000
    Max. unique words 50000
    Max. file size scanned 6024
    Max. description length 150

    Multiple threads 4

    If I index an individual folder containing about 400 PDFs, it works fine. Any input on why this is happening would be much appreciated.

  • #2
    Also, FYI: xlhtml.exe stays hung in memory after the indexing stops and I exit the program.



    • #3
      That may be your clue. In fact, Wrensoft have just posted a new version of the Excel plugin on their website that includes fixes for issues that seem very similar to yours.
      Mark Gallagher



      • #4
        I have updated the Excel plugin and it works. Thank you. Now I need to figure out how to get it to index all the files and subfolders. It seems to index only around 1,700 files when in fact there are 11,120 files in 704 folders. Does anyone have any suggestions? Should I index one huge folder, or should I index individual folders? Can one index cover the subfolders as well? How should I approach this?



        • #5
          Are you using spider mode? If so, take a look at:
          http://www.wrensoft.com/zoom/support...spider_finding

          Offline mode indexes everything within the start directory, but it requires files to be on a local drive rather than on a web server. Offline mode also cannot index dynamically generated pages (e.g. .php, .asp, etc.).

          Generally, you should turn on Verbose mode if you want to find out why certain files are being skipped. It may simply be due to your configuration such as max file size limit, etc.

          You should also note that "TIF" is not a supported file format, assuming this is the TIFF image format. Make sure to exclude this from your extensions list in Zoom.
          --Ray
          Wrensoft Web Software
          Sydney, Australia
          Zoom Search Engine
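Before re-running the indexer, it can help to take an independent inventory of the folder tree and compare it with what ends up in the index. The following is a minimal sketch, not part of Zoom: the function name `survey_tree` and its parameters are hypothetical, and `max_bytes` is whatever limit you choose to test against (the thread does not state the unit of the "Max. file size scanned" setting, so convert it yourself).

```python
import os
from collections import Counter


def survey_tree(root, max_bytes, unsupported=(".tif", ".tiff")):
    """Walk `root` and summarise the files an indexer would encounter.

    Returns (count per extension, paths larger than max_bytes,
    paths whose extension is in `unsupported`).
    All names here are illustrative; nothing is taken from Zoom itself.
    """
    by_ext = Counter()
    oversized = []
    unsupported_files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Normalise case so ".TIF" and ".tif" are counted together.
            ext = os.path.splitext(name)[1].lower()
            by_ext[ext or "(no extension)"] += 1
            if ext in unsupported:
                unsupported_files.append(path)
            if os.path.getsize(path) > max_bytes:
                oversized.append(path)
    return by_ext, oversized, unsupported_files
```

Calling `survey_tree(r"D:\users\someone", max_bytes=6024 * 1024)` (path and unit both assumed) gives per-extension totals to set against the indexer's summary, plus lists of files that a size limit or an unsupported format might explain.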



          • #6
            I think I've got it nailed: certain folders whose names start with underscores, such as __Scans, were not being indexed. I renamed the folders to Scans and they are now being indexed. However, regardless of how many files are indexed, the summary at the end seems to be incorrect. Is this normal?
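To find every folder that might be affected by the underscore behaviour described above, a short script can list them up front. This is only a sketch based on the observation in this post; it does not reproduce Zoom's actual skip rules, and the function name `underscore_dirs` is hypothetical.

```python
import os


def underscore_dirs(root):
    """Return all directories under `root` whose name starts with '_'."""
    hits = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            if name.startswith("_"):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

The returned paths are candidates for renaming, or for checking against the indexer's skip settings, before the next indexing run.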



            • #7
              Can you give us some indication of how or when it is incorrect? For example, a screenshot to show us how the Status tab is inconsistent with the Summary at the end?

              One thing to note is that if you have "Scan files with no extensions" enabled, these files will not be counted in your extensions list (e.g. ".html files scanned"), so the totals may not add up if you look at it that way. In the upcoming version, there will be an extra counter for "files with no extensions" to make this more obvious.

              Also note that even if a file is stored on disk as "index.html", if the spider finds it via a URL such as "http://mysite.com/", it is still counted as a file with no extension.
              --Ray
              Wrensoft Web Software
              Sydney, Australia
              Zoom Search Engine
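The counting effect described in this post can be illustrated with a small sketch. It is an assumption about how a per-extension tally behaves in general, not Zoom's implementation: the extension is taken from the URL path, so a URL ending in "/" has none even when the server answers with index.html. The names `url_extension` and `tally_by_extension` are hypothetical.

```python
import posixpath
from collections import Counter
from urllib.parse import urlsplit


def url_extension(url):
    """Return the lowercase extension of the URL's path, or '' if none."""
    path = urlsplit(url).path  # drops scheme, host, query and fragment
    return posixpath.splitext(path)[1].lower()


def tally_by_extension(urls):
    """Count URLs per extension, with a separate bucket for no extension."""
    counts = Counter()
    for url in urls:
        counts[url_extension(url) or "(no extension)"] += 1
    return counts
```

With this tally, "http://mysite.com/" and "http://mysite.com/index.html" fall into different buckets even if they serve the same file, which is why per-extension counts can sum to less than the total number of pages scanned.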



              • #8
                "One thing to note is that if you have 'Scan files with no extensions' enabled, these files would not be counted in your extensions list (eg. '.html files scanned') and thus it may not add up if you look at it that way. In the upcoming version, there will be an extra counter for 'files with no extensions' to make this more obvious."
                You hit the nail right on the head. Everything seems to be working great now! I can't believe the power of this little program. It sure beats spending $2K US on a Google Mini. My boss loves it. You guys are great!

