Certain search queries causing 500 Internal Server Error


  • David
    replied
    Yes, it might have run out of RAM in 32-bit mode, but it should never crash. It should finish up gracefully.
    We'll try it from here and let you know.

  • bhtech
    replied
    Note: It looks like our Version 6 was run in 64-bit mode, and I ran Version 7 in 32-bit mode. Could this be the reason?

  • bhtech
    replied
    Hi,

    So I upgraded to Version 7 this morning, loaded in my configuration file and kicked off the indexing.

    It seemed to be running normally (I've never seen it run before, as it runs off a scheduled task).

    It got to 3 hours and 51 minutes and then came up with an error:

    Critical error: Terminated indexing due to core engine not responding.

    In the almost four hours it was running, it indexed 24,291 pages - with 16,290 URLs in the spider queue when it ended.

    I will send you a zipped copy of the log.

    Please let me know if I can give you any other information about the error.

    Thanks

  • bhtech
    replied
    Fair enough.

    I will do the upgrade to V7 (trial) and start an index of the site tomorrow morning.

    Unfortunately we are running a newsletter script on the server today, so it would be too server intensive to run both.

    I will post back with how I go.

    Thanks again!

  • David
    replied
    Rather than spending any more time debugging V6, I am going to send you a trial V7 key to see if the problem happens in V7.
    I'll also get a copy of your configuration file, so we can see if we can reproduce the corrupted index here.

  • bhtech
    replied
    Also, our Optimization settings were set to Fastest Search for that index - that may be limiting the number of pages indexed. I have bumped that down one increment to see if it makes any difference for the next index.

    I also found that our sitemap lists a URL - but when I search for that page in search.cgi (using words that I know are on the page and in the title), it says there are no results.
    Is the sitemap a list of pages indexed - or just links found?

    Just trying to narrow down anything that might be helpful.

    Thanks

  • bhtech
    replied
    Hi,

    I believe it was a full re-index - there hasn't been any setting changed to make it incremental.

    We index the site in spider mode.

    The zip file I linked you to contains all the files that I upload - they all had the same date on them - so as far as I know they are all from the same index.

    I'm not sure about the zoom_wordmap.zdat file, I know we have around 25 words in our Word Skip List (small words) - could that explain that?

    In terms of the number of pages indexed, I'm not sure why that number would be so different, I may have miscalculated the original amount. Unfortunately the logs were never set up for the zoom search - however I have set them up now to log to file.

    There is nothing in my 'Status' tab though. Is that because the indexer is not running? It always shows 0 for everything.

    Thanks

  • David
    replied
    The index data does in fact look corrupt. We can get a similar crash on our server. In your index files there are internal file pointers that point past the end of the files, which should never happen. It might be the result of mixed files on the server, or a bug that caused the index to be corrupt when it was made.
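
The kind of check described above can be sketched in general terms. The Zoom index file format is not described in this thread, so the snippet below does not parse .zdat files. It assumes a list of byte offsets has already been extracted by some other means (the values shown are hypothetical) and only illustrates the bounds test itself.

```python
def offsets_past_end(offsets, file_size):
    """Return the offsets that do not point inside a file of file_size bytes."""
    return [off for off in offsets if off < 0 or off >= file_size]


# Hypothetical example: a 100-byte data file and five stored pointers.
# Valid positions are 0..99, so the last two pointers are out of range.
demo_size = 100
demo_offsets = [0, 42, 99, 100, 5000]

bad = offsets_past_end(demo_offsets, demo_size)
print(bad)  # [100, 5000]
```

A reader that follows such a pointer without a bounds check reads garbage or fails outright, which is consistent with only some queries crashing: only the queries whose data lives behind a bad pointer are affected.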

    Were you using incremental indexing when you built this recent index, or was it a full reindex?

    Are you using offline mode or spider mode when indexing your site?

    When you uploaded the files, are you sure you uploaded all of them, and not just some? If not, some files might be from an older, smaller indexing session and some from a larger one. This would nicely explain the behaviour. In particular, the zoom_wordmap.zdat file looks too small. We would have expected it to be around 193MB, rather than 144MB.
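
One rough way to look for a mixed set of files is to list the size of each index file and see how far apart their modification times are. The sketch below assumes the index files sit in one directory and match `zoom_*`; the function names and the six-hour threshold are illustrative choices, not anything Zoom provides.

```python
import glob
import os


def mtime_spread_seconds(paths):
    """Seconds between the oldest and newest modification time among paths."""
    times = [os.path.getmtime(p) for p in paths]
    return max(times) - min(times) if times else 0.0


def same_session(index_dir, pattern="zoom_*", max_spread=6 * 3600):
    """Print each file's size and report whether all files were written
    within max_spread seconds of each other (an assumed threshold)."""
    paths = sorted(glob.glob(os.path.join(index_dir, pattern)))
    for path in paths:
        print(f"{os.path.getsize(path):>14,d}  {os.path.basename(path)}")
    return mtime_spread_seconds(paths) <= max_spread
```

This is most useful on the machine that built the index. An upload often resets every timestamp to the upload time, so files on the web server can share a date even when they come from different sessions.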

    The index you sent us has ~161K pages. But you mentioned about 170K pages were indexed. Do you have the actual log from the indexing session by any chance?

  • bhtech
    replied
    Awesome!

    I have gone through the logs and found the queries that are causing the server errors.

    They are: organise, children, print and encourage. (including the plurals, etc. of those words)

    These look to be the only words causing the issue - I'm not sure if that will help with finding any sort of corruption.

    Thanks again for your excellent support.

  • David
    replied
    Got the file and having a look at it now.

  • bhtech
    replied
    Is it best to email a link to you?

    Can any harm come from posting the link here?

    UPDATE: I have sent you a PM with the link.
    Last edited by bhtech; Jul-14-2014, 02:24 AM. Reason: Update

  • David
    replied
    The searches for 'me' and 'me?' are different because ? (like the * character) acts as a wildcard.

    Again, if you want us to take a look, zip up the files.
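
To illustrate why a wildcard query takes a different path from a plain one, here is a small sketch using Python's standard `fnmatch` module. It assumes glob-style semantics, where ? matches exactly one character and * matches any run of characters; the thread only states that both act as wildcards, and the word list is made up.

```python
from fnmatch import fnmatchcase

# Hypothetical vocabulary standing in for the indexed words.
words = ["me", "men", "met", "meal", "media"]


def expand(pattern, vocabulary):
    """Return the vocabulary words that match a glob-style pattern."""
    return [w for w in vocabulary if fnmatchcase(w, pattern)]


print(expand("me", words))   # ['me']
print(expand("me?", words))  # ['men', 'met']
print(expand("me*", words))  # ['me', 'men', 'met', 'meal', 'media']
```

Under these assumptions 'me?' never looks up the entry for 'me' at all. It expands to other words, so it can hit a damaged part of the index that the plain query never touches.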

  • bhtech
    replied
    Hi,

    After having the site re-indexed yesterday, I uploaded the files to our server.

    I can now search 'dream' and 'me?' and there is no error - but I am now seeing the error if I search 'organiser' or 'children' (there may be more, these were the first I found in the logs).

    This would have to indicate a corruption in the .zdat files, wouldn't it? Something that is particular to those results that are only called when searched.

    If that is the problem, where would I start looking to fix this issue?

    Thanks

  • bhtech
    replied
    Hi,

    I found another strange query that caused the same error. I can search 'me' fine but if I search 'me?' I get the error. I found that one very strange.

    I have uploaded the search.cgi and all zoom_xxx files since having the issue and that didn't seem to make any difference - and I did make sure everything was uploaded in binary mode.

    Although I do understand that it could be a hardware/settings issue, I can't understand why only certain queries would bring on hardware/settings issues. For the 'me' query, the search page returns 114,752 results, which I imagine would be at the higher end of search results returned.

    I'm leaning more towards your mention of different search words using different sections of the index. This would make the most sense to me - there could be certain parts of the index corrupted.

    I will attempt to tar/gzip the files and get you a link. Do you just want the zoom_xxx files? (Ours are .zdat.)

    Thank you very much for your prompt replies and thorough support, I really do appreciate it.

    Thanks

  • David
    replied
    No, a crash is not normal, regardless of the search word. But different search words will use different sections of the index and maybe different code. For example, some search words will trigger stemming and spelling suggestions, while others won't.

    • Could be a bug in V6.
    • Could be index corruption.
      • Using ASCII mode when doing an FTP upload causes a lot of grief.
      • Having a mix of index files from multiple indexing sessions leads to corruption.

    • Could be hardware failure (e.g. bad RAM in the machine).
    • Could be corruption of the files on your server (e.g. hard disk corruption).
    • Could be settings on your server (e.g. limits on CPU and RAM usage that force the process to be killed prematurely).
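
As an illustration of the ASCII-mode point in the list above, the sketch below simulates a text-mode transfer that rewrites line endings and compares checksums before and after. The translation rule (every bare LF becomes CRLF) is an assumption about one common behaviour; real clients and servers differ, and the sample bytes are made up.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def simulate_ascii_transfer(data: bytes) -> bytes:
    """Mimic a text-mode transfer that turns every line ending into CRLF."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


# Binary data that happens to contain the byte 0x0A twice.
original = bytes([0x00, 0x0A, 0x10, 0x0A, 0xFF])
mangled = simulate_ascii_transfer(original)

print(len(original), len(mangled))                  # 5 7
print(sha256_hex(original) == sha256_hex(mangled))  # False
```

Every byte after an inserted 0x0D shifts position, so any offsets stored inside the file no longer line up. Comparing a checksum of the local file with one computed on the server is a quick way to rule this out.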


    A 1GB download is no problem for us, if you can tar/gzip the files on your server. Loading up your files on our server can eliminate some of the possibilities.
