Couple of pre-sales questions

  • Couple of pre-sales questions

    I would like to use Zoom as a search engine for a couple of third-party dynamic (blog-style) web sites. If I have understood the documentation correctly, the search engine consists of two "parts": the crawler, which runs from my desktop, and the search part, which runs from a web server. Before I decide to buy, I would like to ask a couple of questions:

    1. How "intelligent" is the crawler? I will be indexing about 20 to 30 dynamic sites (approx. 150K pages combined). Does this mean I have to reindex every document on a regular basis, or is there some sort of intelligent crawling (new documents more often, older ones less often)?

    2. Can I already customize the user-agent string? I found an older thread on the board about this, but I believe the most recent release is still 5.0x, right? Since I plan on indexing third-party websites, it would be nice if their owners knew whom to contact in case of trouble (I would not like to have my IP banned because of nasty crawlers).

    3. Can I set some kind of domain multipliers? As far as I have read, the ranking is purely text-based with some meta values (no links or anything like that). However, if I decide that I like domain A better than domain B, can I add this preference somewhere without adding something to the actual domains (since they will be third-party)?

    4. How about updates? What if I decide to buy the Enterprise edition this week to experiment with? Is the 5.x release a free update, a paid upgrade, or do I have to buy it all over again?

    Thanks a lot for taking the time to answer these questions.

  • #2
    1) There is an incremental indexing option. This means that only new pages will be downloaded and indexed, which speeds up the indexing process. But incremental indexing requires that the web server and the scripts that create the pages correctly return a valid update date for each page (otherwise the spider doesn't know what is old and what is new).

    2) Yes, V5.0 is the current release. And yes, a customizable user-agent string is one of the things we are looking at for V5.1.

    3) The ranking takes into account many factors, including: number of word occurrences, word density, URL length, weightings applied to the page, meta data, headings, and the text of links to the page from other pages. Depending on the search, the document update date, recommended links, wildcards and negative words can also play a part. But you can't boost an entire domain if you don't control the domain. Again, this is something we might do for a future release.

    4) Yes, all 5.x releases will be free for anyone who purchases V5.0.
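
To illustrate point 1, here is a minimal sketch of the kind of check a spider could make when deciding whether a page is "old" or "new". It is not Zoom's actual implementation. It assumes the "update date" arrives as an HTTP `Last-Modified` header, and the function name `needs_reindex` and its parameters are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def needs_reindex(last_modified_header: Optional[str], last_indexed: datetime) -> bool:
    """Decide whether a page should be downloaded and indexed again.

    last_modified_header: raw Last-Modified value sent by the server (assumed
        source of the page's update date), or None if it was missing.
    last_indexed: timezone-aware time of the previous indexing run.
    """
    if not last_modified_header:
        # No update date at all: the spider cannot tell old from new,
        # so the only safe choice is to fetch the page again.
        return True
    try:
        modified = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        # An unparseable date is treated the same as a missing one.
        return True
    if modified.tzinfo is None:
        # Treat a date without zone information as UTC.
        modified = modified.replace(tzinfo=timezone.utc)
    return modified > last_indexed
```

A dynamic site whose scripts stamp every response with the current time would make this check return True on every run, which is why the server has to report a valid date for incremental indexing to save any work.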


    • #3
      Cool, you pretty much say it all with your last answer.

      I think I'll have to buy myself a test license.
