Woohoo!
Just got Zoom Search running at http://www.dadsworksheets.com and it's awesome! Thank you!
I do seem to be having some trouble getting the indexer/spider to skip certain classes of URLs. In particular, it seems that buried somewhere on the site I have URLs that use the mixed-case form "DadsWorksheets.com", whereas everything else is all lower-case "dadsworksheets.com". As a result, search results list the same page twice. For example, a search for 'Multiplication Worksheets' returns the same page as two distinct results, one with the mixed-case URL variant and one with the lower-case variant. You can see this right now by trying the site search form on the upper-right part of the home page, or just hit this link with a multiplication worksheet example.
I've added "www.DadsWorksheets.com" to the 'Skip Options' and re-spidered the entire site, but no luck. The relevant bit of my config file looks like this, where I'm skipping the /v1/ directory and also trying to skip the mixed-case URLs...
Code:
#SKIPPAGES_START
www.dadsworksheets.com/v1/
www.DadsWorksheets.com
#SKIPPAGES_END
I also tried a rewrite rule, which looks like this in my config...
Code:
#REWRITELINKS:1
#REWRITEFIND:DadsWorksheets
#REWRITEWITH:dadsworksheets
I even went as far as adding a 301 redirect in my Apache configuration to point the mixed-case URL to the all-lower-case version, restarted Apache, re-spidered with Zoom, and still got the same thing.
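To be concrete about the behavior I'm hoping for, here is a rough Python sketch of the normalization: lower-case only the host part of each URL, then treat URLs that match afterwards as the same page. This is just my illustration of the idea, not anything Zoom does internally. The function names and the sample paths in the usage lines are made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_host_case(url: str) -> str:
    """Lower-case the scheme and host of a URL; leave path, query, fragment alone.

    Host names are case-insensitive, so the two variants name the same page.
    (Hypothetical helper for illustration only.)
    """
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(),
         parts.path, parts.query, parts.fragment)
    )


def dedupe_by_host_case(urls):
    """Return URLs in first-seen order, dropping host-case-only duplicates."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_host_case(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


if __name__ == "__main__":
    # Sample paths are invented; only the host names come from my site.
    results = [
        "http://www.dadsworksheets.com/sample/page.html",
        "http://www.DadsWorksheets.com/sample/page.html",
    ]
    print(dedupe_by_host_case(results))
```

With the two sample URLs above I would expect a single entry back, which is what I'd like the search results to look like.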
I did install Zoom in a subdirectory, but the search form is in the site root. I've been careful to keep the zcfg file in the root as well, and to check where it writes the index files. I verified there is only one set of index files on the site, so it's not something dumb like spidering into one set of files while the search settings read a different set from some temp directory. I think, anyway.
Running out of ideas and could use another set of eyes on this.
Thanks for your help!
Jim