I know there's CRC, but this really doesn't work for much unless you exclude every part of the dynamic content, which in some cases is a negative.
Example: /shoes/casual/sandals.html and /shoes/summer/sandals.html are the same page. But CRC cannot exclude one or the other, because all the images and product links on those pages are dynamically generated, linking to /shoes/casual/sandals/cork-soled-sandles.html and /shoes/summer/sandals/cork-soled-sandles.html respectively.
This means that unless I exclude those vital H1 product link tags with ZOOMSTOP/ZOOMRESTART (which I don't want to do, as they make up a large percentage of the page content on an ecommerce store and give the Zoom spider relevant content), these pages will always be duplicated in the search results, and I don't want them to be.
/shoes/casual/sandals.html and /shoes/summer/sandals.html both have canonical tags to /sandals.html
/shoes/casual/sandals/cork-soled-sandles.html and /shoes/summer/sandals/cork-soled-sandles.html both have canonical tags to /cork-soled-sandles.html
So with a basic 'respect canonical links' function, Zoom would only index /sandals.html and /cork-soled-sandles.html.
Instead, my search results currently are:
/cork-soled-sandles.html
/shoes/casual/sandals/cork-soled-sandles.html
/shoes/summer/sandals/cork-soled-sandles.html
/sandals.html
/shoes/casual/sandals.html
/shoes/summer/sandals.html
- six results when there should only be two.
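To make the expected outcome concrete, here is a minimal Python sketch (not anything Zoom does today, as far as I know) that groups the six URLs above by their canonical target. The `canonical_of` mapping is written by hand from the canonical tags I described, and I'm assuming the two canonical pages point to themselves. The name `collapse_by_canonical` is made up for illustration.

```python
# Hypothetical mapping: crawled URL -> URL named in its canonical tag.
# Assumption: /sandals.html and /cork-soled-sandles.html are self-canonical.
canonical_of = {
    "/cork-soled-sandles.html": "/cork-soled-sandles.html",
    "/shoes/casual/sandals/cork-soled-sandles.html": "/cork-soled-sandles.html",
    "/shoes/summer/sandals/cork-soled-sandles.html": "/cork-soled-sandles.html",
    "/sandals.html": "/sandals.html",
    "/shoes/casual/sandals.html": "/sandals.html",
    "/shoes/summer/sandals.html": "/sandals.html",
}


def collapse_by_canonical(mapping):
    """Return one canonical URL per group, in first-seen order."""
    seen = []
    for _url, canonical in mapping.items():
        if canonical not in seen:
            seen.append(canonical)
    return seen


results = collapse_by_canonical(canonical_of)
print(results)  # ['/cork-soled-sandles.html', '/sandals.html']
```

That is the two-result list I would like to see in place of the six above.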
I understand there *is* a way for me to stop this, by 'zoomstop'-ing pretty much everything, but this would ruin the natural weighting of the pages, as my weighting is very content-based.
I strongly urge you to consider this request seriously. I don't think it would be a difficult function to implement: all it needs to do is replicate the CRC function on a much more basic level, by looking for a canonical tag, checking whether it matches the current URL, and dealing with it accordingly (skip the page, or index the canonical page if it's not already in the index).
I really hope you can do something with this idea.
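As a rough illustration of the logic I mean, here is a sketch in Python using only the standard library. It is not Zoom's code and I don't know how the indexer is structured internally; `CanonicalFinder`, `decide`, the action names and the `example.com` host are all hypothetical names I've invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonical = attr["href"]


def decide(page_url, html, indexed):
    """Return (action, url) for a crawled page.

    indexed is the set of absolute URLs already in the index.
    """
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return ("index", page_url)  # no canonical tag: index as normal
    target = urljoin(page_url, finder.canonical)  # resolve relative hrefs
    if target == page_url:
        return ("index", page_url)  # page is its own canonical
    if target in indexed:
        return ("skip", target)  # canonical page already indexed
    return ("index_canonical", target)  # index the canonical page instead


page = '<html><head><link rel="canonical" href="/sandals.html"></head></html>'
url = "https://example.com/shoes/casual/sandals.html"
print(decide(url, page, set()))
print(decide(url, page, {"https://example.com/sandals.html"}))
```

The first call asks for the canonical page to be indexed in place of the duplicate; the second skips the duplicate because the canonical page is already in the index. The page content itself is never touched, so the content-based weighting stays intact.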
Many thanks,
Jack