We're working on indexing travel websites. The biggest problem is the sheer number of city and town names (hundreds of thousands) around the world. Each name is a unique word that has to be indexed on top of all the other terms.
It looks like we might be best off creating multiple indexes for different geographical regions (Europe, Asia, etc.) or countries. Or should we treat those divisions as categories within a single index?
To help limit the number of pages, we will likely restrict scanning to each site's top page, except for sites we feel are excellent, which we would manually allow to be scanned to a certain depth. We are also using +/- filtering for travel terms.
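For what it's worth, the depth rule we have in mind looks roughly like the sketch below. It is generic Python rather than Zoom configuration, and the host names, depth values, and function names are all made-up placeholders.

```python
from urllib.parse import urlparse

# Hypothetical allow-list: host -> how many link levels below the
# top page may be scanned. Hosts and depths are placeholders only.
TRUSTED_DEPTH = {
    "www.example-travel-guide.com": 3,
    "www.example-city-info.org": 2,
}

DEFAULT_DEPTH = 0  # every other site: top page only


def max_depth_for(url: str) -> int:
    """Return the deepest link level allowed for the site hosting `url`."""
    host = (urlparse(url).hostname or "").lower()
    return TRUSTED_DEPTH.get(host, DEFAULT_DEPTH)


def should_crawl(url: str, depth: int) -> bool:
    """True if a page found `depth` links below the top page is in scope."""
    return depth <= max_depth_for(url)
```

With this rule, an unlisted site contributes only its top page (depth 0), while a hand-picked site is followed down to its configured depth.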
I almost forgot: the above are all external sites, but we'll also want to merge in our own site's pages (which we will weight more heavily).
Any suggestions/tips would be appreciated.
PS: We are using the current Zoom Search Enterprise edition, with MasterNode in mind for the future.