I am running an Enterprise evaluation of Zoom and it does a great job.
My problem is that we need to do full content indexing of a very large number of PDF text files. The directory structure I am currently testing against has about 460,000 files averaging about 60K each. It doesn't look as though we can take advantage of the 64-bit incremental indexing in Spider Mode for this purpose, so we have to use 64-bit Offline Mode instead. This requires a full index sweep that takes several hours. I would love to have an incremental update every 15 minutes or so, but I cannot see how that can happen. We have another file structure that contains about 1.5 million files that I haven't tried yet.
Do you have any suggestions for our situation? I have to index the file contents. The meta information doesn't have enough data to help.
With regards,
Tom Stone