When indexing a web site, I am seeing a single file not being indexed because the search engine reports that access was forbidden. It is never the same file, but so far it has always been a single file. When I access these files from the web site, they download without error. What might be the cause of this?
Forbidden Access Error
-
Can you give us a URL to a file where you see this happening? And perhaps an extract of the index log with the actual error message you see?
There can be a multitude of reasons, depending on your website. Do you have authentication enabled? Is it HTTP or cookie based? Cookies may expire, or the spider may "log out" by following a logout link (in which case, you need to update your Skip List).
-
The two times I have seen it, the file URLs were:
http://www.chipps.com/5/Cellular.pptx
and
http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx
When http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx returned the forbidden error,
http://www.chipps.com/5/Cellular.pptx did not; and likewise when http://www.chipps.com/5/Cellular.pptx did, http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx did not.
Authentication is not enabled.
It is not cookie based.
I am indexing now. Once it finishes I will report back.
-
Here is the part of the index file that shows the error.
20:44:41 - Index Thread got ready buffer for http://www.chipps.com/5/Cellular.pptx (Content-type: Office 2007 file)
20:44:41 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Cellular.pptx
20:44:41 - DL Thread #2, got URL (http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx) off queue
20:44:41 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx (154543 bytes)
20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Cellular.pptx
20:44:43 - Index Thread got ready buffer for http://www.chipps.com/5/Compare%20Sensitivity%20of%20Radios.docx (Content-type: Office 2007 file)
20:44:43 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Compare%20Sensitivity%20of%20Radios.docx
20:44:43 - DL Thread #2, got URL (http://www.chipps.com/5/Create%20a%20Wireless%20LAN.docx) off queue
20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Create%20a%20Wireless%20LAN.docx (10977 bytes)
20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Compare%20Sensitivity%20of%20Radios.docx
20:44:43 - Index Thread got ready buffer for http://www.chipps.com/5/Compute%20a%20Link%20Budget.docx (Content-type: Office 2007 file)
20:44:43 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Compute%20a%20Link%20Budget.docx
20:44:43 - DL Thread #1, got URL (http://www.chipps.com/5/Demonstrate%20Polarization.docx) off queue
20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Demonstrate%20Polarization.docx (11789 bytes)
20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Compute%20a%20Link%20Budget.docx
20:44:43 - Index Thread got ready buffer for http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx (Content-type: Office 2007 file)
20:44:43 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx
20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Compute%20the%20Fresnel%20Zone.docx
20:44:43 - DL Thread #2, got URL (http://www.chipps.com/5/) off queue
20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/ (377 bytes)
20:44:43 - Index Thread got ready buffer for http://www.chipps.com/5/Create%20a%20Wireless%20LAN.docx (Content-type: Office 2007 file)
20:44:43 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Create%20a%20Wireless%20LAN.docx
20:44:43 - DL Thread #1, got URL (http://www.chipps.com/5/Do%20Some%20War%20Driving.docx) off queue
20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Do%20Some%20War%20Driving.docx (10849 bytes)
20:44:43 - [WARNING] Could not download file: http://www.chipps.com/5/ (Forbidden)
20:44:43 - DL Thread #2, got URL (http://www.chipps.com/5/Examine%20Wireless%20Frames.docx) off queue
20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Examine%20Wireless%20Frames.docx (11249 bytes)
20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Create%20a%20Wireless%20LAN.docx
20:44:43 - Index Thread got ready buffer for http://www.chipps.com/5/Demonstrate%20Polarization.docx (Content-type: Office 2007 file)
20:44:43 - [PLUGIN] Processing Office 2007 file http://www.chipps.com/5/Demonstrate%20Polarization.docx
20:44:43 - [INDEXED] Indexing http://www.chipps.com/5/Demonstrate%20Polarization.docx
-
The only error in the above log is this one:
Originally posted by Ken Chipps:
20:44:43 - [WARNING] Could not download file: http://www.chipps.com/5/ (Forbidden)
Perhaps I'm missing something?
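For anyone wanting to pick such warnings out of a long index log, here is a minimal Python sketch. It assumes the warning lines have exactly the shape shown in the log above; the function name `failed_downloads` and the regular expression are illustrative only and are not part of the indexer.

```python
import re

# Matches lines shaped like the warning in the log above:
# "20:44:43 - [WARNING] Could not download file: <url> (<reason>)"
# The pattern is an assumption based on that single example.
WARNING_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) - \[WARNING\] Could not download file: "
    r"(?P<url>\S+) \((?P<reason>[^)]*)\)\s*$"
)


def failed_downloads(log_lines):
    """Return (time, url, reason) for each download warning in the log."""
    failures = []
    for line in log_lines:
        match = WARNING_RE.match(line.strip())
        if match:
            failures.append((match["time"], match["url"], match["reason"]))
    return failures


# Three lines copied from the log extract above.
sample = [
    "20:44:43 - [DOWNLOAD] Downloading file http://www.chipps.com/5/Do%20Some%20War%20Driving.docx (10849 bytes)",
    "20:44:43 - [WARNING] Could not download file: http://www.chipps.com/5/ (Forbidden)",
    "20:44:43 - DL Thread #2, got URL (http://www.chipps.com/5/Examine%20Wireless%20Frames.docx) off queue",
]

for time, url, reason in failed_downloads(sample):
    print(f"{time}  {url}  ->  {reason}")
```

The URL is taken from the warning line itself, not from the lines around it, so the only failure reported for this extract is http://www.chipps.com/5/.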
-
I am reading the error as saying that it could not download and index the file listed just above the error line; in this example, that is the file at http://www.chipps.com/5/Do%20Some%20War%20Driving.docx, which you can download. The other point is that every time I index I get this same error, but with a different file name listed just above the error line. Do you read the error line as saying it could not access a link named
http://www.chipps.com/5
If so, there is no such link on the site.
-
No, the error is telling you that it was unable to download the URL specified on the same line as the error, that is, http://www.chipps.com/5/
Are you sure there is no such link? My guess is that there is a broken link somewhere on your site. You can track this down by analysing the index log. Turn on "Queued" messages in the Configuration window (by checking the option for "Spider" related message types on the "Index Log" tab). Run the indexer in single thread mode (select this on the "General" tab of the Config window). Now when you index your site, you will see (in a linear progression) where the link was found.
I'm doing this just now, and I've located the link on your site. It is on this page: http://www.chipps.com/wirelesslab.html
View the HTML source code on that page:
<td><h5><a href="../../5/Demonstrate Polarization.docx">Demonstrate Polarization</a><a href="../../5/"></a></h5></td>
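If you would rather check a page for stray links like this without re-indexing, a rough Python sketch follows. It is only an illustration using the standard library; the names `AnchorCollector` and `empty_text_links` are made up for this example. It lists anchors that have no visible text and resolves their `href` against the page URL the way a spider would.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collect (href, link text) for every <a href=...> element in a page."""

    def __init__(self):
        super().__init__()
        self.anchors = []   # finished (href, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def empty_text_links(page_url, html):
    """Return absolute URLs of anchors that have an href but no visible text."""
    collector = AnchorCollector()
    collector.feed(html)
    return [urljoin(page_url, href) for href, text in collector.anchors if not text]


# The table cell quoted above, checked against the page it was found on.
cell = (
    '<td><h5><a href="../../5/Demonstrate Polarization.docx">'
    'Demonstrate Polarization</a><a href="../../5/"></a></h5></td>'
)
print(empty_text_links("http://www.chipps.com/wirelesslab.html", cell))
```

On Python 3.5 or later, `urljoin` drops the surplus `..` segments, so the empty anchor resolves to http://www.chipps.com/5/, the same URL reported in the warning. Because the anchor has no text, it is invisible in a browser, but a spider still follows it.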
-
Originally posted by Ken Chipps:
You're right. That was the problem. It is not important, but I wonder why the error showed up after a different file each time?