I am having a problem trying to index my site. I ran Zoom Indexer yesterday and created a rather large index of the site (the wordmap and pagetext files are 105 MB each).
I noticed that it was finding a lot of pages multiple times and listing them separately, because the URL is different even though the page is the same. I wasn't sure whether there was a way for Zoom to know that the pages were the same, so I decided to add the special Zoom no-follow tag around many of the links, so that when someone uses the search they won't get a page of identical results with different URLs. The other change I made since yesterday's run was to increase the list of recommended links that I imported from ~56,000 to ~120,000.
The problem today, when I tried to re-index the site (hoping that it would remove a lot of the duplicate pages and reduce the file sizes), is that it gets to one of two pages and stops. CPU usage is at 100% and it doesn't try to go to any more pages.
In the status window it says:
DL Thread #2, got URL (http://www.bnbfinder.com/?action=myinns&l=1) off queue
Downloading file http://www.bnbfinder.com/?action=myinns&l=1
Unlike the pages before it, it never says that it gets the ready buffer. Sometimes it gets past this file, but then it stops at a state page:
http://www.bnbfinder.com/Alabama-Bed-and-Breakfast
I haven't been able to run it successfully today. Of the five times I tried, it stopped at one of those two pages every time.
I am writing this while trying to run the indexer, and after about 5 minutes of sitting on the same page it just jumped to another. So it does seem to be working, but it is running extremely slowly. It took a total of 2 hours to index the entire site yesterday, and it should be faster today since I have considerably reduced the number of links it sees. However, it has taken 10 minutes so far and has only indexed 24 pages (I have about 25,000 pages).
Have I found or exceeded some sort of limit, perhaps with the recommended links?