Preliminary V6 feature list overview & progress


    Preliminary V6 feature list

    We thought you would be interested in knowing where we are up to in the development of V6 of Zoom. We are about 6 months into development now and expect development and testing to continue for several more months, with a final release in the third quarter of 2008.

    So we are now in a position to list out the features that will definitely be in V6 and those that are still under consideration.

    Features definitely in V6
    • New, overhauled User Interface.
    • Improved index logging, featuring on-the-fly index log filtering. You can switch views to show or hide skip messages, etc. This makes it much easier to find errors, broken links, etc.
    • More detailed indexing status window. You can see what each thread is doing at a glance.
    • User specifiable file types: You can specify how each file extension will be handled. For example, you can specify that a .JPZ file be treated as a jpeg file.
    • Native ASP.NET search script option
    • Office 2007 plugin (indexes document types such as DOCX, PPTX, etc.)
    • MHT plugin for indexing IE MHT web pages
    • Support for optional ZOOMTITLE and ZOOMDESCRIPTION meta tags, which allow each web page to have custom metadata that is used only by Zoom and not by other search engines, like Google. This is useful for SEO work where you need to optimise pages in different ways for different engines.
    • Configurable truncate title length option for super long page titles.
    • Option to "Open all plugin file formats in a new window" so that you can have HTML files open in the same window, and PDF files open in a new window. At the moment in V5 all documents open in the same window, or they all open in a new window.
    • Spider image maps. The Zoom spider will now crawl image map links.
    • New spider option: URLTYPE_FOLLOW_ALL (follow all pages to one level for a start point without indexing the start page).
    • Checks for changes made to the configuration, and prompts the user to save the config before quitting if changes have been made. This helps avoid accidentally losing changes.
    • Check Thumbnail Exists: Option to check that a thumbnail image exists on the web server before using the link. This avoids broken links to images that don't exist.
    • New and improved "Jump to highlighting" script which will be more compatible with other scripts and also exclude highlighting within ZOOMSTOP sections. This can be used to avoid highlighting some sections of your page like the navigation menu.
    • New, improved method of CRC duplicate page detection: the CRC comparison is now made after stripping out HTML and ZOOMSTOP sections. This means that a page with ads excluded using ZOOMSTOP will now be recognized as being duplicate, despite having different dynamic ads on the page.
    • Improved link finding methods in Spider Mode, so more pages are crawled. In particular, there is now more chance that non-HTML links within JavaScript script tags are picked up by the spider. We would still suggest using normal HTML links on your pages where possible, however.
    • Zoom will now reload the last ZCFG configuration file used by default.
    • Improved Vista support (better compatibility with UAC user account control, folder permissions, etc.). Note that we fully support Vista in the existing V5 of the software, but these V6 changes remove a few Vista quirks.
    • Improved compatibility and tolerance of antivirus software and the Windows Indexing Service. Zoom will now deal with cases where the Windows Indexing Service or other 3rd party software (like Antivirus software) is locking Zoom's files.
    • Custom Meta Fields: specify arbitrary meta fields to be indexed and made searchable. For example, index and search a real estate website by "Number of bedrooms", "Suburb", "Price", "Property type", etc. This is a big feature, as it effectively means you can build simple custom databases with a multi-criteria search using Zoom without actually having a database.
    • 64-bit edition of the Zoom indexer. This can greatly extend the available RAM and capacity of the indexer.
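The CRC duplicate detection item in the list above can be illustrated with a rough sketch. This is not Zoom's actual implementation: the function name `content_crc`, the regular expressions, and the use of `<!--ZOOMSTOP-->` / `<!--ZOOMRESTART-->` comment markers to delimit excluded sections are assumptions made for the example. The idea is simply to drop the excluded sections and the HTML markup before computing the checksum, so two pages that differ only in their excluded ads compare as equal.

```python
import re
import zlib

# Assumed marker syntax for excluded sections; adjust to match your pages.
ZOOMSTOP_BLOCK = re.compile(
    r"<!--\s*ZOOMSTOP\s*-->.*?<!--\s*ZOOMRESTART\s*-->",
    re.DOTALL | re.IGNORECASE,
)
HTML_TAG = re.compile(r"<[^>]+>")


def content_crc(html: str) -> int:
    """Return a CRC32 of the page text with excluded sections and tags removed."""
    text = ZOOMSTOP_BLOCK.sub(" ", html)   # drop excluded sections first
    text = HTML_TAG.sub(" ", text)         # then strip the remaining markup
    text = " ".join(text.split())          # normalise whitespace
    return zlib.crc32(text.encode("utf-8"))


# Two hypothetical pages that differ only in the excluded ad block.
page_a = ("<html><body><p>Same article</p>"
          "<!--ZOOMSTOP--><div>Ad 1</div><!--ZOOMRESTART--></body></html>")
page_b = ("<html><body><p>Same article</p>"
          "<!--ZOOMSTOP--><div>Ad 2</div><!--ZOOMRESTART--></body></html>")

if content_crc(page_a) == content_crc(page_b):
    print("pages treated as duplicates")
```

A regex-based tag stripper is crude and only meant to show the order of operations; a real indexer would use a proper HTML parser.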

  • #2
    Features hopefully in V6


    This is a list of features that will probably get into V6 assuming we get time to implement them and we still think there is demand for them.
    • A more flexible template page (greater control of specific elements to be displayed and their location).
    • Boosting of entire domains
    • Tag cloud generator
    • Keyword listing generator
    • Category results summary (x results found for category A, y results for category B, etc.)
    • More advanced authentication methods (e.g. configure Zoom to POST login details), in addition to the existing methods like HTTP authentication.
    • Improved cookie support (e.g. store cookies sent to spider)
    • Return x results per domain sorting option. This can be useful if you are indexing dozens of different domains into a single index.
    • Configurable maximum word length
    • Improvements to exact phrase searches
    • FastCGI support. This feature might speed up the CGI search times by 100% or 200%. But not many servers support FastCGI, and not many customers are asking for more speed in the CGI. So demand is low.
    • Restructure of the low-level index format to support higher capacity (something around 5 million to 10 million pages with the 64-bit edition). This sounds great in theory, but in practice not many of our customers have even reached the 100,000 pages mark. So demand isn't high for this.
    • Stemming
    We will post more details and some screenshots of what is in development over the coming weeks.



    • #3
      Any chance of getting exact phrase searching in JavaScript mode?

      The Results per Category will be sweet!


      Leon



      • #4
        Originally posted by MergeThis
        Any chance of getting exact phrase searching in JavaScript mode?
        Unfortunately, no. JavaScript is still the technically limited scripting platform it has always been, and it is not suitable for implementing such a feature. More details are in an older post.
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine



        • #5
          Assign a thumbnail based on category?



          • #6
            I think that's on our list of things we're adding. Note that the above list is not the complete list of planned changes.
            --Ray
            Wrensoft Web Software
            Sydney, Australia
            Zoom Search Engine



            • #7
              Actually, I had another question that came up from my team. This may be beyond the scope of the product, but what would it take to index emails within a .pst file?



              • #8
                We have looked at indexing several mail formats (EML for Microsoft Windows Mail, DBX for Outlook Express and PST for Outlook) but there are some limitations that make this impractical. For example, there doesn't seem to be a nice way of asking Outlook to open a specific email message from a link on a web page (as opposed to just linking to the PST file, which contains all your emails). While we will continue to look into this, at this point, we do not know whether it is technically possible to do this reasonably.

                Google Desktop provides links to emails, but they actually run a web server on your computer, and their links call their own web server - which in turn, launches Outlook. I don't think this would be a reasonable solution for our designed usage.
                --Ray
                Wrensoft Web Software
                Sydney, Australia
                Zoom Search Engine



                • #9
                  I see that sending different content types to different windows is included. Excellent news, as currently I have to maintain a hack to the PHP code to do this in V5.

                  Would love to see the style elements of the search form moved out to an external .css file.

                  Ability to rollover the search log files on a daily or weekly or monthly basis would be nice, eg. specify a base name like "logfile" and have the search script then actually log to "logfile_20_04_2008.log" based on the server time.
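The log rollover request above could work along these lines. This is only a sketch of the naming scheme, not anything in the Zoom search scripts: the function name `rolled_log_name` and the `period` parameter are made up for illustration, and only the daily pattern ("logfile_20_04_2008.log") comes from the post itself; the weekly and monthly patterns are assumptions.

```python
from datetime import datetime


def rolled_log_name(base: str, when: datetime, period: str = "daily") -> str:
    """Build a log file name from a base name and the (server) time."""
    if period == "daily":
        stamp = when.strftime("%d_%m_%Y")            # e.g. 20_04_2008
    elif period == "weekly":
        iso_year, iso_week, _ = when.isocalendar()   # ISO week numbering
        stamp = f"week{iso_week:02d}_{iso_year}"
    elif period == "monthly":
        stamp = when.strftime("%m_%Y")               # e.g. 04_2008
    else:
        raise ValueError(f"unknown period: {period!r}")
    return f"{base}_{stamp}.log"


# The search script would call this with the current server time on each search
# and append the log entry to whichever file name comes back.
print(rolled_log_name("logfile", datetime.now()))
```

Because the name is derived from the time of each search, no separate rotation job is needed; a new file simply starts when the date stamp changes.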

                  I saw "A more flexible template page (greater control of specific elements to be displayed and their location)." It depends what that means, but I do find the current scripts heavy on </p> and <br> elements that I edit out. It is much easier to control spacing these days using CSS, which we can now treat as pretty much universal.

                  With the number of Zoom configurations I maintain, being able to point the configurations at a common external synonyms file would be nice, rather than having to make changes in one, export the file and then import it into all the others. Similarly for recommended sites.

                  Continuing the excellent work with an excellent product - wasn't on your list but maybe I should take that one as read. My requests are all 'would be nice' requests, the product and the support are already brilliant.
                  Mark Gallagher



                  • #10
                    Originally posted by Ray
                    We have looked at indexing several mail formats (EML for Microsoft Windows Mail, DBX for Outlook Express and PST for Outlook) but there are some limitations that make this impractical. For example, there doesn't seem to be a nice way of asking Outlook to open a specific email message from a link on a web page (as opposed to just linking to the PST file, which contains all your emails). While we will continue to look into this, at this point, we do not know whether it is technically possible to do this reasonably.

                    Google Desktop provides links to emails, but they actually run a web server on your computer, and their links call their own web server - which in turn, launches Outlook. I don't think this would be a reasonable solution for our designed usage.
                    Yeah, that's what I figured. I have tried saving each email as an HTML file using a converter, but when you have emails going back 10 years, Zoom runs into issues with indexing too many files. We really push the limits of Zoom here, but I must say it always performs very well and we love it.



                    • #11
                      Question regarding the proposed tag cloud generator.

                      The way I would personally like to see this implemented is to collect the meta keywords found during indexing and generate the cloud based on just those words, meaning the category and <title> would not be taken into consideration. In addition, it should use the .desc files to create the cloud as well. It is probably also wise to convert the meta keywords to lowercase during the index process. Is this the intended implementation?

                      Thanks



                      • #12
                        The idea is that it would be populated with the most popular words found in the index. This includes words in <title> tags. However, this should not be a problem even if your titles are a bit "busy" for SEO purposes (e.g. every page title on our own site starts with "Wrensoft - ", even though it is unnecessary for internal purposes) because V6 supports ZOOMTITLE meta tags, which means that you can specify an alternative title that Zoom will use, and it will ignore the normal title tag when it is present.

                        But yes, meta keywords, and words indexed from .DESC files will be included.
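To make the "most popular words in the index" idea concrete, here is a minimal sketch of how a tag cloud could be derived from word counts. It is an assumption about the general technique, not the planned V6 implementation: the function `tag_cloud`, its parameters, and the font-size scaling are invented for the example. It also folds words to lowercase before counting, as suggested in the question above.

```python
from collections import Counter


def tag_cloud(word_counts, top_n=5, min_size=10, max_size=32):
    """Return (word, font_size) pairs for the top_n most frequent words."""
    counts = Counter()
    for word, n in word_counts.items():
        counts[word.lower()] += n          # merge "Zoom" and "zoom"

    top = counts.most_common(top_n)
    if not top:
        return []

    highest, lowest = top[0][1], top[-1][1]
    span = highest - lowest
    cloud = []
    for word, n in top:
        if span == 0:
            size = max_size                # all words equally frequent
        else:
            # Scale linearly between min_size and max_size.
            size = min_size + (n - lowest) * (max_size - min_size) // span
        cloud.append((word, size))
    return cloud


# Hypothetical counts gathered from titles, meta keywords and .DESC files.
sample = {"Zoom": 3, "zoom": 2, "search": 5, "index": 1}
for word, size in tag_cloud(sample, top_n=3):
    print(f'<span style="font-size:{size}px">{word}</span>')
```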
                        --Ray
                        Wrensoft Web Software
                        Sydney, Australia
                        Zoom Search Engine



                        • #13
                          Support for title = "something" in <a href> tags

                          Is there any chance of including support for searching title statements in <a href> tags ?

                          This would be really useful as the title text can include keywords (eg help text) for a page that aren't necessarily contained elsewhere on that page.



                          • #14
                            We can probably add that. I'll put it on our list to consider.
                            --Ray
                            Wrensoft Web Software
                            Sydney, Australia
                            Zoom Search Engine



                            • #15
                              Hello, I would like to see the following if possible:

                              1. Possibility to edit the title and description of search results from a site that you have no control over (I mean the ability to change the title and/or description as you choose after indexing)

                              2. Ability to add a rel="nofollow" attribute

                              3. Use the category name as a searched tag
                              Thank you
                              ewa

