Forbidden Access Error

  • Forbidden Access Error

    When indexing a web site I am seeing a single file not being indexed because the search engine says access was forbidden. It is never the same file. So far, it is always a single file. When I access these files from the web site they download without error. What might be the cause of this?

  • #2
    Can you give us a URL to a file where you see this happening? And perhaps an extract of the index log with the actual error message you see?

    There can be a multitude of reasons, depending on your website. Do you have authentication enabled? Is it HTTP or cookie based? Cookies may expire, or the spider may "log out" by following a log out link (in which case, you need to update your Skip List).
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine

    • #3
      The two times I have seen it, the file URLs were:

      http://www.chipps.com/5/Cellular.pptx

      and

      http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx

      When http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx returned the forbidden error,
      http://www.chipps.com/5/Cellular.pptx did not; and likewise when http://www.chipps.com/5/Cellular.pptx did, http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx did not.

      Authentication is not enabled.

      It is not cookie based.

      I am indexing now. Once it finishes I will report back.

      • #4
        Here is the part of the index log that shows the error.

        20:44:41 - Index Thread got ready buffer for http://www.chipps.com/5/Cellular.pptx (Content-type: Office 2007 file)
        20:44:41 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Cellular.pptx
        20:44:41 - DL Thread #2, got URL (http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx) off queue
        20:44:41 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx (154543 bytes)
        20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Cellular.pptx
        20:44:43 - Index Thread got ready buffer for http://www.chipps.com/5/Compare%20Sensitivity%20of%20Radios.docx (Content-type: Office 2007 file)
        20:44:43 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Compare%20Sensitivity%20of%20Radios.docx
        20:44:43 - DL Thread #2, got URL (http://www.chipps.com/5/Create%20a%20Wireless%20LAN.docx) off queue
        20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Create%20a%20Wireless%20LAN.docx (10977 bytes)
        20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Compare%20Sensitivity%20of%20Radios.docx
        20:44:43 - Index Thread got ready buffer for http://www.chipps.com/5/Compute%20a%20Link%20Budget.docx (Content-type: Office 2007 file)
        20:44:43 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Compute%20a%20Link%20Budget.docx
        20:44:43 - DL Thread #1, got URL (http://www.chipps.com/5/Demonstrate%20Polarization.docx) off queue
        20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Demonstrate%20Polarization.docx (11789 bytes)
        20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Compute%20a%20Link%20Budget.docx
        20:44:43 - Index Thread got ready buffer for http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx (Content-type: Office 2007 file)
        20:44:43 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx
        20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx
        20:44:43 - DL Thread #2, got URL (http://www.chipps.com/5/) off queue
        20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/ (377 bytes)
        20:44:43 - Index Thread got ready buffer for http://www.chipps.com/5/Create%20a%20Wireless%20LAN.docx (Content-type: Office 2007 file)
        20:44:43 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Create%20a%20Wireless%20LAN.docx
        20:44:43 - DL Thread #1, got URL (http://www.chipps.com/5/Do%20Some%20War%20Driving.docx) off queue
        20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Do%20Some%20War%20Driving.docx (10849 bytes)
        20:44:43 - [WARNING] Could not download file: http://www.chipps.com/5/ (Forbidden)
        20:44:43 - DL Thread #2, got URL (http://www.chipps.com/5/Examine%20Wireless%20Frames.docx) off queue
        20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Examine%20Wireless%20Frames.docx (11249 bytes)
        20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Create%20a%20Wireless%20LAN.docx
        20:44:43 - Index Thread got ready buffer for http://www.chipps.com/5/Demonstrate%20Polarization.docx (Content-type: Office 2007 file)
        20:44:43 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Demonstrate%20Polarization.docx
        20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Demonstrate%20Polarization.docx

        • #5
          The only error in the above log is this one:

          Originally posted by Ken Chipps:
          20:44:43 - [WARNING] Could not download file: http://www.chipps.com/5/ (Forbidden)
          And this seems to be a legitimate error. The URL http://www.chipps.com/5/ does indeed return an HTTP 403 Forbidden error when I access it from IE.

          Perhaps I'm missing something?
          --Ray
          Wrensoft Web Software
          Sydney, Australia
          Zoom Search Engine

          • #6
            I am reading the error as saying it could not download and index the file listed just above the error line. In this example, that is the file at http://www.chipps.com/5/Do%20Some%20War%20Driving.docx, which you can download. The other point is that every time I index I get this same error, but with a different file name listed just above the error line. Do you read the error line as saying it could not access a link named

            http://www.chipps.com/5

            instead? If so, there is no such link on the site.

            • #7
              No, the error is telling you that it was unable to download the URL specified on the same line of that error, that is http://www.chipps.com/5/
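To make the "same line" reading concrete, here is a minimal Python sketch that pulls the failed URL out of each warning line. It assumes the log format matches the excerpt posted earlier in this thread; the regular expression and the function name `failed_downloads` are illustrative and not part of Zoom.

```python
import re

# Pattern modelled on the warning line in the log excerpt from this thread.
# The exact log format is an assumption based on that excerpt only.
WARNING_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) - \[WARNING\] Could not download file: "
    r"(?P<url>\S+) \((?P<reason>[^)]*)\)$"
)

def failed_downloads(log_text):
    """Return (time, url, reason) for every download warning in the log."""
    results = []
    for line in log_text.splitlines():
        match = WARNING_RE.match(line.strip())
        if match:
            results.append((match["time"], match["url"], match["reason"]))
    return results

sample = """\
20:44:43 - DL Thread #1, got URL (http://www.chipps.com/5/Do%20Some%20War%20Driving.docx) off queue
20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Do%20Some%20War%20Driving.docx (10849 bytes)
20:44:43 - [WARNING] Could not download file: http://www.chipps.com/5/ (Forbidden)
"""

for when, url, reason in failed_downloads(sample):
    print(f"{when} {url} -> {reason}")
```

The URL reported is the one inside the warning line itself; the lines before it belong to other downloads.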

              Are you sure there is no such link? My guess is that there is a broken link somewhere on your site. You can track this down by analysing the index log. Turn on "Queued" messages in the Configuration window (by checking the option for "Spider" related message types on the "Index Log" tab). Run the indexer in single thread mode (select this on the "General" tab of the Config window). Now when you index your site, you will see (in a linear progression) where the link was found.

              I'm doing this just now, and I've located the link on your site. It is on this page: http://www.chipps.com/wirelesslab.html

              View the HTML source code on that page:

              <td><h5><a href="../../5/Demonstrate Polarization.docx">Demonstrate Polarization</a><a href="../../5/"></a></h5></td>
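              Immediately after the "Demonstrate Polarization" link, there is an empty link with no visible text. Its href, "../../5/", is resolved relative to http://www.chipps.com/wirelesslab.html. Because the "../" segments cannot climb above the site root, it ends up pointing to the "5" folder itself, that is, http://www.chipps.com/5/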
              --Ray
              Wrensoft Web Software
              Sydney, Australia
              Zoom Search Engine
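An empty link like the one described above does not show up in a GUI editor, so it can help to scan the HTML source for it. The following is a rough sketch using only the Python standard library; the class name `EmptyLinkFinder` is made up for this example. It treats any anchor without text as empty, so an anchor that wraps only an image would also be reported.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class EmptyLinkFinder(HTMLParser):
    """Collect <a href> elements that contain no visible text."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.empty_links = []  # resolved URLs of anchors with no text
        self._href = None      # href of the anchor currently open, if any
        self._text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = ""

    def handle_data(self, data):
        if self._href is not None:
            self._text += data

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if not self._text.strip():
                # Resolve the relative href the same way a spider would.
                self.empty_links.append(urljoin(self.base_url, self._href))
            self._href = None

html = (
    '<td><h5><a href="../../5/Demonstrate Polarization.docx">'
    'Demonstrate Polarization</a><a href="../../5/"></a></h5></td>'
)
finder = EmptyLinkFinder("http://www.chipps.com/wirelesslab.html")
finder.feed(html)
print(finder.empty_links)
```

Here `urljoin` applies the standard relative-reference rules, which is why the empty anchor resolves to the folder URL that appears in the warning.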

              • #8
                I see. I'll check the code then. It does not show up in the GUI view. Thanks.

                • #9
                  You're right. That was the problem. It is not important, but I wonder why the error showed up after a different file each time?

                  By the way, you should handle all of the support questions. David Wren is much too grumpy for this type of work.

                  • #10
                    Originally posted by Ken Chipps:
                    You're right. That was the problem. It is not important, but I wonder why the error showed up after a different file each time?
                    This is because you have multiple threads downloading at the same time. Each time you index, files are downloaded at different speeds due to variations in your network connection, so the order in which files are downloaded and indexed changes. The warning for http://www.chipps.com/5/ therefore lands after a different file in the log on each run.
                    --Ray
                    Wrensoft Web Software
                    Sydney, Australia
                    Zoom Search Engine
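The timing effect can be imitated with a toy simulation. This is only a sketch and not how Zoom is implemented: the function `crawl`, the random sleep that stands in for network latency, and the simplified log lines are all assumptions made for illustration.

```python
import queue
import random
import threading
import time

def crawl(urls, forbidden, workers=2):
    """Simulate download threads and return log lines in completion order."""
    todo = queue.Queue()
    for url in urls:
        todo.put(url)
    log = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                url = todo.get_nowait()
            except queue.Empty:
                return  # nothing left to download
            time.sleep(random.uniform(0.001, 0.01))  # stand-in for network latency
            if url in forbidden:
                line = f"[WARNING] Could not download file: {url}"
            else:
                line = f"[DOWNLOAD] Downloaded file {url}"
            with lock:
                log.append(line)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log

urls = [
    "http://www.chipps.com/5/Cellular.pptx",
    "http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx",
    "http://www.chipps.com/5/",
    "http://www.chipps.com/5/Do%20Some%20War%20Driving.docx",
]

for run in range(3):
    log = crawl(urls, forbidden={"http://www.chipps.com/5/"})
    i = next(n for n, line in enumerate(log) if "[WARNING]" in line)
    before = log[i - 1] if i > 0 else "(nothing)"
    print(f"run {run}: line before warning: {before}")
    print(f"run {run}: warning itself:      {log[i]}")
```

The line before the warning may differ from run to run, while the warning always names the same URL. Running the real indexer in single thread mode removes this variation.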
