I was able to get rid of the problem by enabling the KeepAlive feature in Apache on our webserver. This allows persistent connections, so in Zoom's case there's no TCP three-way handshake overhead from opening and closing a connection for every request while it's indexing the site. The downside is that an idle kept-alive connection ties up a server process or thread that another waiting client could otherwise use. This can be minimized by setting the KeepAliveTimeout value small (in our case I set it to 2 seconds instead of the default 15). Another benefit of enabling KeepAlive is that it shaved 7 minutes off our total indexing time.
So if you don't already suggest it to users, I'd say it's a good setting to know about, especially if the website is large and the traffic volume is manageable.
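For anyone wanting to try this, the change amounts to something like the following in the Apache configuration (typically httpd.conf, though the exact file and location depend on the install). Only the 2-second timeout comes from our setup; treat the rest as a minimal sketch and adjust it for your own server and traffic.

```apache
# Allow multiple requests over one TCP connection (persistent connections).
KeepAlive On

# Seconds to wait for the next request on an idle connection before closing it.
# A small value limits how long an idle client can hold a server slot.
KeepAliveTimeout 2
```

Apache needs to be restarted or reloaded for the change to take effect.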
Zoom not picking up all files/desc files - correlation with number of threads
Originally posted by cchan: The higher the number of threads, the more Invalid URLs and missing .desc files there are. Is this something webserver related, like the webserver can't keep up with all the opening and closing of connections/requests?
It seems like there's a positive correlation between the number of threads Zoom uses and the number of "Invalid URLs" and missing .desc files in the search results: the higher the number of threads, the more Invalid URLs and missing .desc files there are. Is this something webserver related, like the webserver not being able to keep up with all the opening and closing of connections/requests? We're trying to index a large number of PDF files and their respective .desc files.
The missing PDF/.desc files are also different every time, so it's pretty random. All the files exist and can be brought up normally in a web browser. Only when Zoom is indexing will it somehow not be able to download the file(s).