  • 406 Errors

    I am unable to get a complete index of our website. Even after reducing threads to 1 and increasing throttling to 15 seconds, I typically get two 406 errors (No acceptable response available) on each pass. If I log in to the website using the login used by ZoomIndexer, I have no problem accessing any of the files / URLs shown with errors. Furthermore, the errors are attributed to different PDF files in each indexing attempt. How can I get a complete index?
    I looked for a retry setting in the User Guide and in the help, but did not find any such feature.
    I have tried both build 1004 and build 1008 with the same results.
    P.S. Errors like this have appeared intermittently in the past 4 months, but I have been able to resolve them by using single-threading and throttling. Unfortunately, I am now at the end of the line for this tactic.

  • #2
    One page request per 15 seconds is a very low level of load. A modern web server should handle something around 200 requests per second.
    Also, a 406 error is not the typical error you get from a web server when it is overloaded. 406 is almost never used in real life.
    So further reducing the load is not the solution.

    Instead you need to investigate the root cause. Start by having a look at the web server's error log. There might be additional detail in the log.

    What type of hosting package are you using? Who is it with and what type of package is it (e.g. VPS, dedicated hosting, cheap shared hosting)?


    • #3
      Thanks for responding. I haven't been back here in a while as I wasn't notified of your response. The platform we are using is Wild Apricot, and I have no access to logs.
      I see that meanwhile several new Zoom Search versions have been released. I will try again with build 1011.
      P.S. I found out how to get e-mail notifications by editing my profile settings.


      • #4
        So very likely Wild Apricot (what kind of name is that?) are deliberately blocking you from downloading all the pages on the site.
        It could be a form of lock-in to stop you moving the content elsewhere, or a form of load control to squeeze the maximum number of sites onto a physical machine. (If that were the case, any popular site wouldn't work on their service.)


        • #5
          Once again I failed to get an e-mail notification. I have (re-)subscribed to this thread (and all of my other threads) and hope to get e-mail notifications in the future.
          I am still having the sporadic 406 errors with build 1011.
          Wild Apricot is indeed a peculiar name; I have no idea how they came by it. However, it is the most reliable and feature-rich club membership platform we have found for a halfway affordable price. Given how much I have slowed down the capture, I doubt that load control is responsible. Furthermore, Wild Apricot hosts much larger sites, which are used by much larger populations than ours. Nonetheless, I will open a ticket with them to investigate. Meanwhile, I have asked Ray whether there is some way to capture the server response in order to identify the mismatch between the requested type and what the server wants to return.


          • #6
            Wireshark is the best tool for capturing network traffic.
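
            If a full packet capture is more than you need, a short script can also show what the server sends back for a given Accept header. The following is only a rough sketch using Python's standard library. The function names (probe, describe, probe_repeatedly) are made up for illustration, it does not log in or send cookies, and it does not know which Accept header the indexer actually sends, so the value you pass in is an assumption you would need to check against a capture.

```python
import time
import urllib.error
import urllib.request


def probe(url, accept="*/*", timeout=30):
    """Fetch url once; return (status, response headers as a dict)."""
    request = urllib.request.Request(url, headers={"Accept": accept})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, dict(response.headers.items())
    except urllib.error.HTTPError as err:
        # Error statuses such as 406 are raised as exceptions, but the
        # exception still carries the status code and response headers.
        return err.code, dict(err.headers.items())


def describe(accept, status, headers):
    """Format one line comparing what was requested with what came back."""
    # Header names are case-insensitive, so normalise before looking one up.
    lowered = {name.lower(): value for name, value in headers.items()}
    content_type = lowered.get("content-type", "(none)")
    return f"status={status} sent Accept={accept!r} got Content-Type={content_type!r}"


def probe_repeatedly(url, accept="*/*", attempts=5, delay_seconds=15):
    """Probe the same URL several times, since the errors are intermittent."""
    lines = []
    for attempt in range(attempts):
        if attempt:
            time.sleep(delay_seconds)  # pause between attempts, not before the first
        status, headers = probe(url, accept)
        lines.append(describe(accept, status, headers))
    return lines
```

            Calling probe_repeatedly with the URL of one of the PDF files that failed, once with accept="*/*" and once with a narrower value such as "application/pdf", would show whether the 406 depends on the Accept header or turns up at random regardless of it.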