What to do with "artificial" broken links?

Hi, we recently bought your search engine and it's quite easy to get to grips with all the options it has. However, I'm having a problem with our indexing. For reasons too long to explain, the server where our web page is hosted has to reset itself (only the software, not the hardware) every 10 minutes. Indexing takes around 20 minutes, so for a few seconds during the process the web page is not available. The result is that at the end of the indexing there are a few web pages that cannot be found by the Zoom engine. I looked at whether the incremental options could be used to re-check those "not really broken links" that appear after one pass, but that doesn't work, since incremental indexing only operates on pages that have already been successfully indexed. The option of making a second pass won't work either, because the files generated by the first pass would be overwritten. Is there any simple way to have the broken links checked again in a second round and the new results added to those already generated? I hope I have explained myself clearly, because I know it sounds a bit of a mess.
-
I must say, first of all, that it is a very odd situation to have a server which needs to reset itself every 10 minutes. One of the key distinguishing features of a server is that it is required to be up and available reliably, for as long as possible. So I have to question whether it is realistic to expect anything to work well in this environment.
Having said that, the first thing to consider is whether you can index the files offline. If the pages/documents are static, you can make a local copy, or index via a local network, using Offline Mode.
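If you go the local-copy route and the pages really are static, something as simple as the following could pull a copy down for Offline Mode to index. This is only a minimal sketch: the site root, page list and output folder are illustrative placeholders, and a dedicated mirroring tool may be a better fit for a larger site.

import os
import urllib.parse
import urllib.request

BASE_URL = "http://www.example.com/"        # placeholder site root
PAGES = ["index.html", "docs/page1.html"]   # placeholder list of static pages
OUTPUT_DIR = r"C:\site_mirror"              # local folder for Offline Mode to index

for page in PAGES:
    url = urllib.parse.urljoin(BASE_URL, page)
    local_path = os.path.join(OUTPUT_DIR, page.replace("/", os.sep))
    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    try:
        urllib.request.urlretrieve(url, local_path)  # save the page locally
        print("saved", url)
    except OSError as err:                           # e.g. the server is mid-reset
        print("failed", url, err)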
Second, with the incremental approach, you could break up the indexing job by folders, so that the main configuration only indexes, say, half the files. Then schedule an incremental "add pages to existing index" job to add the remaining half to the existing index after the server reset. Repeat both tasks each time you need to re-index.
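If you end up driving this from a script or the Windows Task Scheduler rather than from within the indexer, the sequence would look roughly like the sketch below. The executable name, switches and configuration file names are illustrative placeholders only, assuming the indexer can be launched from the command line with a saved configuration; check the Zoom documentation for the actual command-line syntax.

import subprocess
import time

# Placeholder commands: substitute the real indexer command line and your own
# configuration files for the two halves of the job.
FIRST_PASS = ["ZoomIndexer.exe", "first_half.zcfg"]        # main config: index half the folders
SECOND_PASS = ["ZoomIndexer.exe", "add_second_half.zcfg"]  # incremental "add pages to existing index"
RESET_INTERVAL = 10 * 60  # the server resets itself every 10 minutes

subprocess.run(FIRST_PASS, check=True)   # build the main index
time.sleep(RESET_INTERVAL)               # let the next server reset pass before continuing
subprocess.run(SECOND_PASS, check=True)  # add the remaining half to the existing index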
-
Thanks for answering, Ray. I know it sounds very odd, but we have to deal with this situation at my center. The problem is that the web page was first built under IIS with an Access database, and that setup cannot handle many visitors without workload problems. In the last year we have started to get recognized, which means a lot of visits, more than the current database can support. I know the proper solution would be to migrate to a more robust system like SQL Server or similar, but unfortunately that is beyond my authority. So the only way to avoid the database locking up is to reset the server software every 10 minutes. Given this state of affairs, your second suggestion is probably the easier to implement, since it is something I can do without involving anyone from the IT department. Thanks for the solution; I will try it and come back with the results.