Scenario:
Live Site: Hosts 100% of the content.
Mirror Site: Hosts 99% of the same content as the Live Site. The remaining 1% consists of applications developed by another company (for the same client that owns the Live site), which we are not able to host on our Mirror server.
We would like to run the indexer on the Mirror site and use relative URLs in the indexing process, so that the index files will work correctly with either the Mirror or the Live site. (Obviously, the Mirror and Live sites have different domain names.)
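To illustrate what we mean by relative URLs, here is a minimal Python sketch. It is not the indexer's own code; the function name to_relative, the http:// scheme, and the /products/page.html path are made up for the example. The idea is simply that the same page on either domain reduces to the same host-less entry in the index.

```python
from urllib.parse import urlsplit


def to_relative(url: str) -> str:
    """Drop the scheme and host, keeping only the path and query string.

    Hypothetical helper for illustration; it assumes the URL includes a scheme.
    """
    parts = urlsplit(url)
    relative = parts.path or "/"  # a bare domain maps to the site root
    if parts.query:
        relative += "?" + parts.query
    return relative


# The same page on either server yields the same index entry.
mirror_entry = to_relative("http://www.mirror.com/products/page.html")
live_entry = to_relative("http://www.live.com/products/page.html")
print(mirror_entry, live_entry)
```

Entries for www.live.com/remaining-content would be stored the same way, so they would only resolve correctly when the search is run from the Live site, which is all we need.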
However, we want the indexer to ALSO index the remaining 1% of content that appears only on the Live site, so that this remaining content is searchable from the Live site. (It does not have to be searchable from the Mirror.)
I believe we can do this by simply adding those specific URLs as additional starting points in the indexing process. So our list of URLs to be indexed would look something like:
www.mirror.com
www.live.com/remaining-content
The mirror DOES contain an initial link to the remaining 1% of content on the live site. For example, the Mirror site contains the link www.live.com/remaining-content.
We will be using (for other reasons) the "index and follow internal links and external links" option. So, as I understand it, the indexer will pick up the first page of www.live.com/remaining-content (but not any subsequent pages) when indexing the Mirror -- www.mirror.com.
Will the indexer then pick up www.live.com/remaining-content and all sub-linked pages because this URL appears next in the list of URLs to be indexed?
Are we going to run into any conflicts, since this URL (www.live.com/remaining-content) will be spidered in two different ways?
(Yeah, it would be great if the two servers matched each other completely, but that isn't an option.)
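To make the question concrete, here is a rough Python sketch of the behaviour we are hoping for. This is only our mental model of a generic spider, not the indexer's actual logic. The link graph, the /about and /app1 pages, and the rule that a page is "internal" when its URL begins with the current start point are all assumptions made for the example.

```python
from collections import deque

# Hypothetical link graph standing in for the two sites.
LINKS = {
    "http://www.mirror.com/": [
        "http://www.mirror.com/about",
        "http://www.live.com/remaining-content",
    ],
    "http://www.mirror.com/about": [],
    "http://www.live.com/remaining-content": [
        "http://www.live.com/remaining-content/app1",
    ],
    "http://www.live.com/remaining-content/app1": [],
}


def crawl(start_points):
    """Index each URL once; follow links only from pages under the current start point."""
    indexed = []      # URLs in the order they were first indexed
    expanded = set()  # URLs whose outgoing links have already been followed
    for start in start_points:
        queue = deque([start])
        while queue:
            url = queue.popleft()
            if url not in indexed:
                indexed.append(url)  # index the page itself, never twice
            if url in expanded:
                continue  # links already followed, avoid loops
            if not url.startswith(start):
                continue  # external page: indexed, but its links are not followed
            expanded.add(url)
            queue.extend(LINKS.get(url, []))
    return indexed


result = crawl(["http://www.mirror.com/", "http://www.live.com/remaining-content"])
print(result)
```

In this model the page at www.live.com/remaining-content is first indexed as a single external page while spidering the Mirror, and is then followed in full when it comes up as the second start point, without being indexed twice. Whether the real indexer tracks "already indexed" separately from "already followed" like this is exactly what we are asking.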