Finding broken link errors uncovered during indexing 404


  • Finding broken link errors uncovered during indexing 404

    It's been a while since I made a thread here, but I'm back.

    I noticed that today I indexed and had 10 errors. 3 of them were files being too big to download, so no problem there.

    So I had 7 other errors. In the indexing list, the errors refer to the document/page that went wrong, but it doesn't show on which page the faulty link was found.

    I have been reading through all the verbose output, but I can't see a logical way of working out how it happens.

    Is there a way to report on which pages the error occurred?
    Suppose I have a page with file downloads. One of the downloads is not there anymore, but the link is. Can I get a notification that the link on the downloads page doesn't work anymore, instead of what I get now, which only says that the download doesn't work?

    That would save me a lot of time looking for the right page(s).

    Unless I misread something in the output, I can't find the connection between the error and the page it was found on.
    If i think as i thought, i will do as i did and if i do what i did i will think as i thought....

  • #2
    There are two methods:

    1) Right click on the error message and copy the URL to the clipboard. Then, using the Windows search function, search a local copy of your web site files for the URL (or part of the URL).

    2) Turn on verbose mode in Zoom. Save the entire log & open it in a text editor (e.g. Notepad). If you have a line like,
    Code:
    Error downloading file http://www.wrensoft.com/zoom/missingfile.htm
    then do a search from the top of the log for the file name and you will find lines like,
    Code:
    Scanning http://www.wrensoft.com/zoom/filelist.html
    Queued URL: http://www.wrensoft.com/zoom/otherfile.htm
    Queued URL: http://www.wrensoft.com/zoom/missingfile.htm
    Queued URL: http://www.wrensoft.com/zoom/yetanotherfile.htm
    So in this example, the broken link to missingfile.htm was on the filelist.html page.
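For a long log, the search in method 2 can be scripted. The following is only a rough sketch in Python, assuming the saved log uses exactly the three line formats shown above ("Scanning ...", "Queued URL: ..." and "Error downloading file ..."). The function name `find_referrers` is made up for illustration and is not part of Zoom.

```python
import re
from collections import defaultdict

# Line formats assumed from the log excerpts quoted in this thread.
SCANNING = re.compile(r"Scanning (\S+)")
QUEUED = re.compile(r"Queued URL: (\S+)")
ERROR = re.compile(r"Error downloading file (\S+)")


def find_referrers(log_lines):
    """Map each URL that failed to download to the pages it was queued from."""
    current_page = None             # page named by the latest "Scanning" line
    queued_on = defaultdict(list)   # queued URL -> pages being scanned at the time
    failed = []                     # failed URLs, in order of first appearance

    for line in log_lines:
        if m := SCANNING.search(line):
            current_page = m.group(1)
        elif m := QUEUED.search(line):
            pages = queued_on[m.group(1)]
            if current_page is not None and current_page not in pages:
                pages.append(current_page)
        elif m := ERROR.search(line):
            if m.group(1) not in failed:
                failed.append(m.group(1))

    return {url: queued_on.get(url, []) for url in failed}
```

Passing the lines of a saved log, e.g. `find_referrers(open("zoomlog.txt"))` (the file name is hypothetical), returns a dictionary from each failed URL to the page or pages being scanned when it was queued. If a URL is only logged the first time it is queued, which I have not verified, only the first linking page will show up. An error URL that does not match the queued URL exactly gets an empty list.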

    We are planning to look at making this process easier in V5 of Zoom.
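Method 1 (searching a local copy of the site for the URL) can also be done with a short script instead of the Windows search function. This is a minimal sketch in Python. The function name `pages_containing` and the default list of file extensions are assumptions for illustration, not anything provided by Zoom.

```python
from pathlib import Path


def pages_containing(site_root, fragment, suffixes=(".htm", ".html")):
    """Return the files under site_root whose text contains fragment."""
    hits = []
    for path in sorted(Path(site_root).rglob("*")):
        # Only look inside page files; skip directories and other file types.
        if path.is_file() and path.suffix.lower() in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if fragment in text:
                hits.append(path)
    return hits
```

Searching for just the file name (e.g. `missingfile.htm`) rather than the full URL is usually the safer choice, since links inside the pages are often relative.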

    ----
    David



    • #3
      Thanks, David, for the explanation.
      If i think as i thought, i will do as i did and if i do what i did i will think as i thought....



      • #4
        Originally posted by Wrensoft
        We are planning to look at making this process easier in V5 of Zoom.
        ---
        David
        What is the timeline for the release of version 5? Can I help beta test it?

        Scott
        Austin, Texas



        • #5
          We do not have a fixed date, but we're looking at around June / July.

          We are not currently at a stage where it is suitable for public beta testing as there are still many features to be added. But thank you for the interest. We will make beta testing available closer to the release date.
          --Ray
          Wrensoft Web Software
          Sydney, Australia
          Zoom Search Engine



          • #6
            Just a further comment on searching your site for broken URLs that you become aware of after indexing.

            For option 2) above, only the last 5000 lines are held in the log (to keep RAM usage low). So if you have a big site, some of the earlier log lines might be overwritten. There is, however, a procedure for keeping the entire log file.

            Once again, we are planning on making this easier in V5.

            -------
            David
