I'm looking for ways to offload search indexing so that my main computer doesn't have to be running for hours overnight. Would a proxy server be good for that, and if so, how should I set it up? There don't seem to be any details about proxy servers in the documentation. Thanks!
Why use a proxy server?
-
Are you doing spider indexing or offline indexing?
Is the content you are indexing on the same machine as the indexer, or are the content server and the indexing machine separate machines? If they are separate, which one is the "main" computer you refer to?
If you are using two machines, then sticking a third proxy machine between them isn't going to stop any of the three computers from being used. You'll just have three machines doing stuff instead of two.
-
David, sorry for the delay in responding. I didn't receive an email notification of your response.
I'm not using a proxy server at all. I'm just wondering what the purpose of it would be. We would be doing spider indexing, and I'm wondering whether it would lead to faster indexing if we used a proxy. Thanks!
-
Do you know what a proxy is? It is just a relay, so adding more steps to the process of getting content from the web isn't going to make things faster.
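As a rough back-of-the-envelope illustration of that point, here is a small Python sketch. The page count and timings are made-up assumptions, not measurements, and the model ignores caching and multiple threads; it only shows that an extra hop adds time to every request.

```python
def crawl_seconds(pages, rtt_ms, transfer_ms, extra_hop_ms=0.0):
    """Estimate a sequential crawl: each page costs one round trip plus transfer time."""
    per_page_ms = rtt_ms + extra_hop_ms + transfer_ms
    return pages * per_page_ms / 1000.0

# Hypothetical figures: 10,000 pages, 40 ms round trip, 60 ms transfer per page.
direct = crawl_seconds(pages=10_000, rtt_ms=40, transfer_ms=60)
# Same crawl with an assumed 10 ms added per request by a relay in the middle.
via_proxy = crawl_seconds(pages=10_000, rtt_ms=40, transfer_ms=60, extra_hop_ms=10)

print(direct, via_proxy)  # 1000.0 1100.0
```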
Things that make indexing web content faster are:
- A faster internet connection.
- Lower latency (e.g. moving the indexing machine close to the web server, or even onto the same machine).
- Caching of web content (could be local caching, a CDN, or caching on the web server itself, depending on what is being indexed).
- Faster web server hardware and faster hardware in the machine doing the indexing (e.g. more RAM, fast SSDs).
- Doing partial, incremental indexing.
- Doing indexing while the server isn't under high load.
- Selecting optimal settings in Zoom (RAM drives for temp folders and more than one thread).
- Only indexing what you really need to (proactively exclude sites, folders and files you don't need in the index).
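To make the incremental-indexing and exclusion ideas concrete, here is a minimal Python sketch. It is not how Zoom implements either feature; the function name, the URL-to-date dictionaries and the exclude pattern are all hypothetical, and it assumes you already have a last-modified value for each page (for example from a sitemap).

```python
from fnmatch import fnmatchcase

def pages_to_fetch(current, previous, exclude_patterns):
    """Return URLs that are not excluded and are new or changed since the last run.

    current and previous map URL -> last-modified string (hypothetical data).
    """
    todo = []
    for url, modified in current.items():
        if any(fnmatchcase(url, pattern) for pattern in exclude_patterns):
            continue  # skip content that is not wanted in the index
        if previous.get(url) == modified:
            continue  # unchanged since the last indexing run
        todo.append(url)
    return todo

# Made-up example data: state from the last run and state seen now.
previous = {
    "https://example.com/a.html": "2024-01-01",
    "https://example.com/b.html": "2024-01-01",
}
current = {
    "https://example.com/a.html": "2024-01-01",            # unchanged
    "https://example.com/b.html": "2024-02-01",            # changed
    "https://example.com/c.html": "2024-02-01",            # new
    "https://example.com/archive/old.html": "2024-02-01",  # excluded
}

result = pages_to_fetch(current, previous, ["*/archive/*"])
print(result)
```

Here only b.html and c.html would be fetched again, so the overnight run touches two pages instead of four.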