

Too many requests


  • Too many requests

    Hi,

    I did get a complaint from my internet provider that one site had received 150,000 requests. This filled up some memory on their server, and as a result the site was automatically shut down. I have read on the forum that one can slow down the spidering, but will that help? The hits per second don't seem to have been the problem, but rather the total number of hits. I run from one PC over a not very fast broadband connection. I used two threads, minimum delay and "Reload all files" (I realize that this is not so good).
    What can be done to avoid this problem? Is it enough to slow down the spidering?

  • #2
    I did get a complaint from my Internet provider
    Was this from your ISP, or was this from your hosting company? I assume the hosting company?

    150,000 hits is fairly small. No hosting company should complain about such a thing. Our server gets about 300,000 hits per day (about 15GB of data per day). This happens every day of the week.

    Did you sign a contract with them saying your site was only allowed X visitors per day? Or do you have a contract that specifies a maximum amount of data per day? Or is it one of these deceitful hosting companies that promise 'unlimited' data, but then shut your site down if it becomes busy?

    Also, getting a lot of requests over a period of time should never "use up some memory on their server". There is probably a problem on the server if this is really happening.

    What can be done to avoid this problem?
    Yes, you can slow down indexing. But this is not a technical issue. It is really a contractual issue. We can't tell you what your hosting company will allow or disallow.

    What I can tell you however is that it is time to get a new hosting company.



    • #3
      ISP

      It's my ISP complaining, or rather a website owner who has complained to my ISP. Perhaps that website has inadequate hosting. I am indexing websites that run on their own servers and some that use free hosting.

      How can I best avoid any more complaints (DoS)? Will slowing down help?

      I have a log file (on paper). Is there any way to find out which site couldn't handle the hits, so I can exclude it?



      • #4
        So this isn't your own web site that you are indexing?

        If you are indexing external sites then you do need to be polite. I would suggest,

        1) Make sure you enable robots.txt support. (On the Scan options tab of the Zoom configuration window). The default is to have this on.

        2) Refer any web site (or ISP) that complains to this page about robots.txt and user agent strings, in particular the part about "Crawl-delay". This allows web site owners to control the indexing speed, so then you wouldn't need to worry any more.

        3) If the web site owner is too lazy to make a robots.txt file, then you should probably still do the right thing and reduce the indexing speed for this site.

        4) Use incremental indexing where possible. For many servers this will put less load on the server.

        If the person complaining didn't identify their site, nor supply their contact details, then it would seem slightly unreasonable to expect you to take much action. (We can't tell you who it was that contacted your ISP).
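
        To illustrate how a "Crawl-delay" entry in robots.txt can control the speed of a spider, here is a minimal Python sketch using the standard library's urllib.robotparser. It is not how Zoom itself is implemented. The user agent name "ExampleSpider", the sample robots.txt content and the polite_delay helper are all made up for this example.

```python
from urllib.robotparser import RobotFileParser


def polite_delay(robots_txt: str, user_agent: str, default_delay: float = 1.0) -> float:
    """Return the number of seconds to wait between requests.

    Uses the site's Crawl-delay if robots.txt specifies one for this
    user agent, otherwise falls back to default_delay (an assumed value).
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default_delay


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits this user agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that a site owner might publish.
EXAMPLE_ROBOTS = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""

if __name__ == "__main__":
    print(polite_delay(EXAMPLE_ROBOTS, "ExampleSpider"))
    print(allowed(EXAMPLE_ROBOTS, "ExampleSpider", "http://example.com/private/page.html"))
```

        A spider that respects this would sleep for the returned number of seconds between requests to that site, and skip any URL for which allowed() returns False.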



        • #5
          More information

          I did get some more information from my ISP. They said that it was the type of request that is the problem. The requests make that server create a new session for each request. Then the server gets an "out of memory exception". I don't understand this, but perhaps it is of interest to you.

          My ISP informed me which URL had the problem with the requests, so I have excluded that one.



          • #6
            Many web sites do not use sessions. For those that do, it is a decision of the application running on the web site whether or not to create a new session for each request. Typically a session is allocated after a user has logged into a site. There is really only one type of request that is used by Zoom, which is an HTTP GET request.

            It is the responsibility of the applications running on the web server to manage the sessions they create. The web application should expire sessions if the server is short of RAM. If the server can be crashed as a result of the traffic from one user, then there is something wrong with the server (and/or the applications running on the server).
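
            As a rough illustration of the point about expiring sessions, here is a toy Python model of a session store. It is only a sketch under assumptions and does not represent any particular web server or framework. The SessionStore class, the max_sessions limit and the "evict the oldest session" policy are all invented for this example. It shows that a client which never sends a session ID back causes a new session on every request, and that a bounded store keeps memory use flat anyway.

```python
import itertools
from collections import OrderedDict


class SessionStore:
    """Toy in-memory session store with a hard cap on live sessions."""

    def __init__(self, max_sessions: int):
        self.max_sessions = max_sessions
        self._sessions = OrderedDict()   # session id -> session data
        self._ids = itertools.count(1)   # simple source of new session ids

    def handle_request(self, session_id=None):
        """Return the session id used for this request.

        A request carrying a known id reuses that session. Any other
        request (for example from a spider that sends no cookies) gets
        a brand new session.
        """
        if session_id in self._sessions:
            self._sessions.move_to_end(session_id)  # mark as recently used
            return session_id
        new_id = next(self._ids)
        self._sessions[new_id] = {}
        # Expire the oldest sessions instead of growing without limit.
        while len(self._sessions) > self.max_sessions:
            self._sessions.popitem(last=False)
        return new_id

    def __len__(self):
        return len(self._sessions)


if __name__ == "__main__":
    store = SessionStore(max_sessions=1000)
    for _ in range(150000):      # 150,000 requests, none carrying a session id
        store.handle_request()
    print(len(store))            # number of sessions still held in memory
```

            Without the eviction loop, the same 150,000 requests would leave 150,000 sessions in memory, which is the kind of growth that could lead to an out of memory error.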
