Wish List for Category Search in ver 5.0


  • Wish List for Category Search in ver 5.0

    Hi,

    Here is my wish list for Category search in ver 5.0 -

    1. Facility to allow wild-cards in URL while defining categories

    E.g. -
    • Category: DOG
      Pattern: w3.mysite.com/dog/*; w3.mysite.com/info/dog*.htm
      Description: Info on Dogs


    2. Allow pages to be defined under 2 or more categories

    E.g.-
    • Category: Animals
      Pattern: w3.mysite.com/dog/*; w3.mysite.com/animals/*
      Description: Info on Animals

      Category: Pets
      Pattern: w3.mysite.com/dog/*; w3.mysite.com/cat/*
      Description: Info on Pets


    Note: the page "w3.mysite.com/dog/*" has been indexed in 2 categories, but when searching in "All" categories, this page must NOT be returned twice

    3. Facility to list all pages in a category
    Maybe a URL (it could also be an HTTP POST) that returns a page listing all pages in the selected category

    4. Facility to specify the categories in the HTML
    It would be nice if the category could be specified using special keywords embedded in the HTML as comments. The web page could then be categorised by its creator and added to the category automatically

    5. Search in multiple categories
    Allow a user to search in 2 or more categories. For example, enter the search terms and ask the engine to return only pages under the categories "Pets" and "Animals" (see the example above)

    6. Support Sub-Categories...
    Nice, but I guess my wish list is asking for too much
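
To make wishes 1, 2, 3 and 5 concrete, here is a minimal Python sketch of the behaviour being asked for. It is not how Zoom works internally; the names `CATEGORIES`, `categories_for`, `build_index` and `pages_in` are hypothetical and exist only for illustration. It assumes shell-style wild-cards where `*` matches any run of characters, including `/`.

```python
from fnmatch import fnmatchcase

# Hypothetical category table: category name -> wild-card URL patterns.
# The patterns are taken from the examples in the wish list above.
CATEGORIES = {
    "Animals": ["w3.mysite.com/dog/*", "w3.mysite.com/animals/*"],
    "Pets": ["w3.mysite.com/dog/*", "w3.mysite.com/cat/*"],
}


def categories_for(url):
    """Return every category with at least one pattern matching the URL."""
    # fnmatchcase treats '*', '?' and '[' as wild-card characters, so a
    # real implementation would need to escape literal '?' in URLs.
    return [
        name
        for name, patterns in CATEGORIES.items()
        if any(fnmatchcase(url, pattern) for pattern in patterns)
    ]


def build_index(urls):
    """File each URL under every category it matches (wish 2)."""
    index = {name: [] for name in CATEGORIES}
    for url in urls:
        for name in categories_for(url):
            index[name].append(url)
    return index


def pages_in(index, selected=None):
    """List the pages in the selected categories, each page only once.

    selected=None stands for "All" categories. A page filed under two
    categories is still returned a single time.
    """
    names = list(index) if selected is None else selected
    seen = set()
    results = []
    for name in names:
        for url in index[name]:
            if url not in seen:
                seen.add(url)
                results.append(url)
    return results
```

With this layout, `pages_in(index, ["Pets"])` lists one category (wish 3), `pages_in(index, ["Pets", "Animals"])` covers several categories (wish 5), and `pages_in(index)` covers "All" without returning the dog pages twice.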

    Is there an estimated date for when ver 5.0 will ship?


    have a nice day,

    Kurian Thomas
    http://www.xtendtech.com

  • #2
    It would also be nice if we could use some kind of variable to set page limits per domain - I know we can use points limits now, but that means all domains would have the same page limit - it would be great if we could set different limits for each domain.


    • #3
      Originally posted by zamolxis
      It would also be nice if we could use some kind of variable to set page limits per domain - I know we can use points limits now, but that means all domains would have the same page limit - it would be great if we could set different limits for each domain.
      You will be able to specify a limit per start point in Version 5.0. This will allow you to set a different limit per domain.

      We are still looking into various improvements or changes that we can make for the categories feature. Thanks for the suggestions.
      --Ray
      Wrensoft Web Software
      Sydney, Australia
      Zoom Search Engine


      • #4
        You can list all pages in a category with a URL like this:
        http://www.yoursite.com/search/searc...=**&zoom_cat=X

        Where X is equal to the category number.

        ----
        David


        • #5
          I don't know if this can be implemented, but I think another great feature would be the ability to save partially completed spidering jobs: pause a job, save it, and the next day, if you wish, reload it in the program and continue from where you left off.

          It would also be brilliant if merging could be added. Let's say I spidered 3 websites yesterday and need to add one more today. As far as I can tell, the only way to do that now is to add the website to the URLs and respider everything. However, if the other websites haven't changed at all, there is no point in respidering everything. It would be much easier if I could spider only the additional site and "merge" it into the already spidered data.


          • #6
            These two requests are really one and the same. In both cases you want to extend an existing index without re-indexing all the content.

            Both requests are almost entirely addressed by incremental indexing, which we have just started to look at.

            There are really two types of incremental indexing. One would be to manually select what is added to or removed from an existing index. The other would be an automatic 'content refresh', where the indexer attempts to determine for itself what needs to be added, deleted or updated.

            As pointed out in the link above, this is a technically hard problem, and it is certain that the solution we implement will not be perfect for everyone, because everyone has slightly different requirements. Nevertheless, it should improve the situation for the majority of people who have large indexes where much of the data is static.
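
To illustrate the automatic 'content refresh' idea only, here is a minimal Python sketch that compares a stored fingerprint per URL with the current content to decide what is added, deleted or updated. It is not a description of how Zoom does or will do this; `fingerprint` and `plan_refresh` are hypothetical names, and it assumes the page contents are already available as strings.

```python
import hashlib


def fingerprint(content):
    """Return a stable hash of a page's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def plan_refresh(old_index, current_pages):
    """Work out what changed since the last indexing run.

    old_index:     url -> fingerprint saved by the previous run
    current_pages: url -> page content as it is now
    Returns (added, deleted, updated, new_index).
    """
    new_index = {url: fingerprint(text) for url, text in current_pages.items()}
    # Pages that exist now but were not in the old index.
    added = sorted(url for url in new_index if url not in old_index)
    # Pages in the old index that no longer exist.
    deleted = sorted(url for url in old_index if url not in new_index)
    # Pages present in both runs whose content hash differs.
    updated = sorted(
        url
        for url in new_index
        if url in old_index and new_index[url] != old_index[url]
    )
    return added, deleted, updated, new_index
```

Only the pages in `added` and `updated` would need to be re-indexed, and those in `deleted` dropped. The hard part mentioned above is untouched by this sketch: the current content still has to be fetched somehow, and pages can differ in trivial ways (dates, counters) without really changing.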

            ----
            David
