Questions about SDK update and dictionary sort order

    Hi,

    First, we like the Zoom Indexer and associated products. Ever since purchasing the SDK, we have been very happy with the results we have gotten.

    We would like to upgrade to the newest indexer and incorporate the newest changes/fixes into our builds from the SDK. Because of the SDK, we are still using indexer 5.0.1004. I would like to ask for a refresh of the SDK to the newest CGI source. Is this possible?

    Thanks.

    Also, we have been looking at the .zdat files quite a bit and we really like the efficiency and speed. We have noticed one thing, though: the zoom_dictionary.zdat file seems to be a list. When looking for long search terms in a dictionary of almost a million words, the search tends to be a little slow. Since we are a few indexer versions behind, my question may already have been addressed:

    Do you have plans to change zoom_dictionary.zdat to be more of a search tree instead of a list? This would increase searching tremendously, even if the search tree was done by simple alphabetization.

    Once again, thanks.

    D

    P.S. Since we have the SDK, I don't know how much I should reveal.

  • #2
    We don't have any automated way of sending out new versions of the SDK code, so for the moment you just need to e-mail us. We make free SDK updates available for 12 months after a purchase.

    There is no easy solution to the dictionary issue. It would take me half a page to explain why, but in brief, if the words appeared in a different order (e.g. sorted), then the search context would end up a garbled mess.

    However, there are complex solutions that would allow a sorted dictionary, but they involve performance and complexity trade-offs during the indexing process.
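
    To illustrate the kind of trade-off involved, here is a minimal sketch of one such approach. It assumes (this is an assumption, not a description of the real .zdat format) that other index data refers to each word by its position in the dictionary, so the list itself cannot be reordered. Instead, a separate sorted side index is built once at indexing time and binary-searched at query time. The names `build_sorted_index` and `find_word_id` are hypothetical.

```python
def build_sorted_index(words):
    """Return the word IDs (positions) ordered alphabetically by their word.

    The original list is left untouched, so any data that refers to a word
    by its position stays valid. This is the extra indexing-time cost.
    """
    return sorted(range(len(words)), key=words.__getitem__)


def find_word_id(words, sorted_ids, term):
    """Binary-search the side index; return the original word ID, or -1."""
    lo, hi = 0, len(sorted_ids)
    while lo < hi:
        mid = (lo + hi) // 2
        if words[sorted_ids[mid]] < term:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sorted_ids) and words[sorted_ids[lo]] == term:
        return sorted_ids[lo]
    return -1


# Hypothetical dictionary in indexing order: position == word ID.
words = ["zoom", "indexer", "search", "dictionary", "apple"]
sorted_ids = build_sorted_index(words)

print(sorted_ids)                                   # alphabetical order of IDs
print(find_word_id(words, sorted_ids, "search"))    # original position of "search"
print(find_word_id(words, sorted_ids, "missing"))   # -1 when the term is absent
```

    The lookup drops from a linear scan to a logarithmic number of comparisons, at the price of an extra structure to build, store, and read from disk.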

    "This would increase searching tremendously..."

    In our benchmarking and code profiling, we determined that most of the search time was not spent doing a linear search in the dictionary, so the gain is not going to be enormous in most cases. In most scenarios, the bulk of the time went to disk access.

    So an easy way to get more speed would be to put the entire set of index files on a high-speed RAM disk.

    Another (not quite as easy) solution is to use FastCGI to keep the search code resident in RAM, which would also let us keep the dictionary resident in RAM. However, this requires code changes in the CGI and some server reconfiguration.
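
    The idea of keeping the dictionary resident can be sketched as follows. This is only an illustration of the caching pattern inside a long-lived process, not the actual CGI code and not a FastCGI binding: the one-word-per-line text file stands in for the real dictionary format, and `load_dictionary` and `handle_request` are hypothetical names.

```python
import os
import tempfile

# Module-level cache: survives across requests as long as the process lives,
# which is what a FastCGI-style persistent worker provides.
_cache = {}  # path -> (mtime, {word: word_id})


def load_dictionary(path):
    """Load the word list once; reload only if the file changed on disk."""
    mtime = os.path.getmtime(path)
    cached = _cache.get(path)
    if cached is None or cached[0] != mtime:
        with open(path, encoding="utf-8") as f:
            words = {line.rstrip("\n"): i for i, line in enumerate(f)}
        _cache[path] = (mtime, words)
    return _cache[path][1]


def handle_request(path, term):
    """Per-request work: an in-memory lookup, with no dictionary file read."""
    return load_dictionary(path).get(term, -1)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dictionary.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("zoom\nindexer\nsearch\n")
        print(handle_request(path, "search"))   # first call reads the file
        print(handle_request(path, "indexer"))  # later calls reuse the cache
```

    With plain CGI, a new process starts for every search, so the module-level cache would be empty each time. The benefit only appears once the server is configured to keep the process alive between requests.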


    • #3
      I have sent you the SDK for V5.1 build 1001.
