I am running an Enterprise evaluation of Zoom and it does a great job.
My problem is that we need to do full content indexing of a very large number of PDF text files. The directory structure I am currently testing against has about 460,000 files averaging about 60K each. It doesn't look as though we can take advantage of the 64-bit incremental indexing in Spider Mode for this purpose, so we have to use 64-bit Offline Mode instead. This requires a full index sweep that takes several hours. I would love to have an incremental update every 15 minutes or so, but I cannot see how that can happen. We have another file structure that contains about 1.5 million files that I haven't tried yet.
Do you have any suggestions for our situation? I have to index the file contents. The meta information doesn't have enough data to help.
With regards,
Tom Stone