Announcement

**David** · Jun-04-2006, 01:23 AM

If your pages don't have identical content and don't have identical URLs, they are just similar pages. Not duplicates.

A few different scripts share this problem of generating a near-infinite number of different pages. Filtering on the title for uniqueness is not the best solution. Many sites don't have unique titles, so its usefulness is limited. There is also a secondary problem: filtering on the title still means a large number of unnecessary pages must be downloaded (near infinite again for some scripts), because we don't know what a page's title will be until after the page has been downloaded.
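To illustrate the download-cost point, here is a minimal Python sketch, not the indexer's actual implementation. It contrasts a URL-based skip rule, which can reject a page before fetching it, with title filtering, which can only run after the fetch. The names `SKIP_SUBSTRINGS`, `SKIP_QUERY_PARAMS`, `should_skip`, and `crawl`, along with the example script names and parameters, are assumptions made for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical skip rules; a real list depends on the forum script in use.
SKIP_SUBSTRINGS = ("printthread.php", "newreply.php")
SKIP_QUERY_PARAMS = {"sort", "order", "highlight"}


def should_skip(url):
    """Decide from the URL alone, so no download is needed."""
    parts = urlsplit(url)
    if any(s in parts.path for s in SKIP_SUBSTRINGS):
        return True
    params = parse_qs(parts.query, keep_blank_values=True)
    return any(name in SKIP_QUERY_PARAMS for name in params)


def crawl(urls, fetch_title, use_skip_list):
    """Return (kept_urls, download_count).

    fetch_title is a caller-supplied function standing in for
    "download the page and read its title".
    """
    seen_titles = set()
    kept = []
    downloads = 0
    for url in urls:
        if use_skip_list and should_skip(url):
            continue  # rejected before any download happens
        title = fetch_title(url)  # the title is only known after this
        downloads += 1
        if title in seen_titles:
            continue  # title filter: the page was downloaded for nothing
        seen_titles.add(title)
        kept.append(url)
    return kept, downloads
```

With the skip list enabled, sorted or printable variants of a thread are never fetched at all. With title filtering alone, each variant is downloaded first and discarded afterwards.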

We have a FAQ question that covers this issue and a better solution.
"Q. How should I index my site if it features a message board, forum, or calendar and other similarly complex scripts?"

-------
David

**nikkie** · Jun-04-2006, 12:42 PM

My vBulletin Solution

Originally posted by Wrensoft

We have a FAQ question that covers this issue and a better solution.
"Q. How should I index my site if it features a message board, forum, or calendar and other similarly complex scripts?"
David

Thanks for that - I missed it in the search!

I finally gave up on the notion of a one-size-fits-all search. I've excluded the vBulletin forum. I just couldn't find a balance with my skip list: either I got nothing, or I got infinity.

After thinking about this in greater depth, I now believe the solution is to write a custom PHP script that exports each post from the database into its own HTML page, removing all URLs from each of those pages.
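A rough sketch of that idea, written in Python rather than PHP purely for illustration. It assumes a hypothetical `post` table with `postid`, `title`, and `pagetext` columns and uses SQLite as a stand-in for the forum's real database. The helper names `strip_urls` and `export_posts` are invented here, and the URL pattern is deliberately simple.

```python
import html
import re
import sqlite3
from pathlib import Path

# Simple pattern for bare links; an assumption, not a complete URL grammar.
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def strip_urls(text):
    """Remove anything that looks like a link so a crawler has nothing to follow."""
    return URL_RE.sub("", text)


def export_posts(conn, out_dir):
    """Write one static HTML page per post and return the paths written.

    Assumes a hypothetical table: post(postid, title, pagetext).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    rows = conn.execute(
        "SELECT postid, title, pagetext FROM post ORDER BY postid"
    )
    for post_id, title, body in rows:
        safe_title = html.escape(strip_urls(title))
        safe_body = html.escape(strip_urls(body))
        page = (
            f"<html><head><title>{safe_title}</title></head>\n"
            f"<body><p>{safe_body}</p></body></html>\n"
        )
        path = out / f"post-{post_id}.html"
        path.write_text(page, encoding="utf-8")
        written.append(path)
    return written
```

Because the generated pages contain no links, the indexer would need to be given the list of files directly, or pointed at the output folder, rather than left to discover them by spidering.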

Thanks again!

Duplicate Pages Revisited
