How to not index pages with a query string

  • How to not index pages with a query string

    I have robots.txt support enabled and this is in my robots.txt file:
    Code:
    User-agent: *
    Disallow: /*?
    But Zoom still indexes pages with query strings. How do I set Zoom up so it ignores all pages with a query string?

    Example:
    /product.php -> Good
    /product.php?Currency=USD -> Bad/Don't index

    PS: I did a search but it came up with nothing.
    CP

  • #2
    Wildcard characters in the Disallow field are not standard syntax for "robots.txt" files, and their use is generally not recommended. Google added support for this, but it is not part of the standard and behaves inconsistently across crawlers.

    From the Wikipedia article:
    The first version of the Robot Exclusion standard does not mention anything about the "*" character in the Disallow: statement. Modern crawlers like Googlebot and Slurp recognize strings containing "*", while MSNbot and Teoma interpret it in different ways.
    If you want to skip the file in your example, you can use this:
    Code:
    Disallow: /product.php?
    This would skip all query-string variations of the product.php page.
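    For context, a Disallow rule only takes effect inside a User-agent block, so a minimal sketch of the resulting robots.txt (assuming you keep the same catch-all User-agent line from your existing file) would look like this:
    Code:
    # Applies to all crawlers that honour robots.txt
    User-agent: *
    # Skip product.php whenever it is requested with a query string,
    # but still allow the plain /product.php page to be indexed
    Disallow: /product.php?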

    If you want to skip ALL links that contain a query string (regardless of page), you can do so by adding a Skip Option in the Zoom Configuration window. Simply add an entry under "Page and folder skip list" as "?" (by itself), and you will skip any URLs containing a question mark.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine
