
Spidering Javascript and DHTML



    We've had some questions recently regarding the use of spider mode to crawl Javascript or DHTML links, and we thought we should spend a minute here to explain it, should anyone else be wondering.

    If you are not aware of this issue, the FAQ is here:

    DHTML is essentially client-side Javascript which produces dynamic HTML output. It requires a Javascript interpreter to process the script, and it also often requires user interaction to trigger certain events.

    First of all, there is no single Javascript or DHTML standard that every browser follows. Each browser implements its own version, so the attributes and properties available often differ from one browser to the next. This makes using an external Javascript interpreter or script engine fairly futile, since many scripts written for a particular browser would not work in it.

    In addition to that, even if we had a script engine embedded into Zoom, there is no way that a spider can "guess" the user interaction required to trigger certain links to be output as HTML. For example, you may have to place your mouse over a menu item before a cascading menu is formed and its links are produced. Another page might need you to click on a certain button, or wait 20 seconds. The possibilities are endless.
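    To make the problem concrete, here is a minimal sketch in Python of what a spider without a script engine sees. It is not Zoom's actual spider code. The page, the `showMenu` function, the URLs, and the `LinkCollector` and `extract_links` names are all hypothetical, chosen only for illustration. The link to the widgets page exists only after a mouseover runs the script, so a parser that reads the static HTML never finds it.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags found in the static HTML only."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Script bodies are passed to the parser as plain text, so only
        # real <a> tags in the markup ever reach this method.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical page: one plain link, plus a menu whose link is only
# created when the user moves the mouse over the "Products" item.
PAGE = """
<html><body>
<a href="/about.html">About</a>
<div id="menu" onmouseover="showMenu()">Products</div>
<script>
function showMenu() {
  var link = document.createElement("a");
  link.href = "/products/widgets.html";
  link.appendChild(document.createTextNode("Widgets"));
  document.getElementById("menu").appendChild(link);
}
</script>
</body></html>
"""


def extract_links(html):
    """Return the links a script-unaware spider can discover in html."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


print(extract_links(PAGE))  # ['/about.html']
```

    Only the plain link is found. The widgets page stays invisible because nothing in the static markup points to it, and the spider has no way of knowing that a mouseover is what produces the link.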

    It is generally accepted that Javascript and DHTML cannot be relied on as the standard method of website navigation. Due to the above issues, these elements are considered spider-unfriendly, and this applies equally to external search engines such as Google, Yahoo!, MSN, etc. For this reason, a common rule of search engine optimization is to not rely on Javascript links on your site.

    In addition to this, if DHTML is the only method to access parts of your site, then your website would be inaccessible to any user with Javascript disabled, or anyone using a browser which is incompatible with your Javascript. Simply put, relying on DHTML and JS navigation will cause a myriad of accessibility problems, and there's not much we can do about it. DHTML and Javascript can really only ever be complementary navigational methods at best, and you should always have non-Javascript-dependent methods available to access the site.

    If you do not want to alter your layout too much, you can always use the <noscript> tag. This is the standard HTML way to provide alternative output only when Javascript is not supported. This would allow spiders to find the pages, as well as any users with unsupported or incompatible browsers. For more information on this, please see:
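    As a rough illustration of the <noscript> approach, the sketch below (in Python, with a hypothetical page, `menu.js` file, URLs, and `LinkCollector` / `extract_links` helper names) shows a script-driven menu with plain links repeated inside a <noscript> block. A parser that does not run scripts treats the <noscript> content as ordinary markup, so the links become discoverable. This is only a sketch of the idea, not a description of how Zoom or any particular search engine handles the tag.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, as a script-unaware spider would."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical page: the menu is built by an external script, and the same
# destinations are repeated as plain links inside <noscript>.
PAGE_WITH_FALLBACK = """
<html><body>
<div id="menu" onmouseover="showMenu()">Products</div>
<script src="menu.js"></script>
<noscript>
  <a href="/products/widgets.html">Widgets</a>
  <a href="/products/gadgets.html">Gadgets</a>
</noscript>
</body></html>
"""


def extract_links(html):
    """Return the links found in the static markup of html."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


print(extract_links(PAGE_WITH_FALLBACK))
# ['/products/widgets.html', '/products/gadgets.html']
```

    The scripted menu can stay exactly as it is for browsers that support it. The fallback links only need to cover the same destinations.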
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine