Skipping a page, but still follow links

  • Skipping a page, but still follow links

    Is this possible?

    I have an index page and I need Zoom to follow the links on it, but can I exclude the index page itself from appearing in the search results?

  • #2
    Yes. If the page is a start URL, you can click on "More", select the URL, and click "Edit". Here you can change the spider option from "Index and follow links" to "Follow links only".

    Alternatively, if you need to specify this for multiple pages, you can use the <!--ZOOMSTOP--> and <!--ZOOMRESTART--> tags to skip indexing (while still allowing links to be followed). You can also use these tags to exclude only sections of a page, as opposed to an entire page. See the Users Guide for details:
    http://www.wrensoft.com/zoom/usersguide.html
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine
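
As a rough illustration of the behaviour described above, here is a minimal Python sketch of how a spider might treat the skip-indexing tags: text between them is left out of the index, but links found there are still collected. This is a toy model and not Zoom's actual implementation. It assumes the tags are written as HTML comments named ZOOMSTOP and ZOOMRESTART; the class SketchSpider, the helper scan, and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class SketchSpider(HTMLParser):
    """Toy model of one page scan: gathers indexable words and links."""

    def __init__(self):
        super().__init__()
        self.indexing = True   # False while between ZOOMSTOP and ZOOMRESTART
        self.words = []        # text that would be added to the index
        self.links = []        # links that would be followed

    def handle_comment(self, data):
        # Assumption: the tags appear as HTML comments.
        marker = data.strip().upper()
        if marker == "ZOOMSTOP":
            self.indexing = False
        elif marker == "ZOOMRESTART":
            self.indexing = True

    def handle_starttag(self, tag, attrs):
        # Links are collected regardless of the indexing state.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Text is only kept while indexing is switched on.
        if self.indexing:
            self.words.extend(data.split())


def scan(html):
    """Return (indexed_words, followed_links) for a single page."""
    spider = SketchSpider()
    spider.feed(html)
    spider.close()
    return spider.words, spider.links


page = (
    "<html><body><!--ZOOMSTOP-->"
    '<a href="letters/index.html">Letters</a>'
    "<!--ZOOMRESTART--><p>Welcome</p></body></html>"
)
words, links = scan(page)
```

In this model, a page wrapped entirely in the two tags would produce no indexed words at all, yet it is still a page the spider visited, so the tags on their own do not remove its URL.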


    • #3
      zoomstop exclusions

      Will ZOOMSTOP exclude the entire URL from the index while still following its links, or does it exclude only the enclosed text and still list the URL in search results?

      Thanks,

      Jeff


      • #4
        Zoomstop only excludes part of a page. It does not exclude the URL.

        ---
        David


        • #5
          Hmmm. I am using a dynamic .asp directory browser script and pointing the Zoom spider at it. I have set the script to emit <!--ZOOMSTOP--> right after the opening <html> tag and <!--ZOOMRESTART--> right before the closing </html> tag. As a result, I now get completely blank pages in my index, one for each page the script generates as the spider follows its links.

          How can I exclude these URLs after indexing without doing it manually? I can create a category filter to capture them, but then I cannot hide that category in the category list on the search page (without editing it manually).


          Does any of this make sense? Any ideas?

          Thank you so much


          • #6
            More info for the above post:

            Pro version 4.2 build 1006. Here are the sample search results from the above description. The results are for the keyword "letters". "Letters" only appears in the URL of the page (.....\letters\). Is the spider indexing the URL for keywords, given that I have ALL text between ZOOMSTOP tags? Or is there something I cannot see?





            Any ideas? Thank you.


            • #7
              I can't see the content of the page in question. I think you have it password protected.

              However, I think there are a few solutions.

              1) Turn off the indexing of file names in Zoom from the "Indexing options" tab.
              2) Change the spider settings in Zoom. You can change the spider settings from the "More" window in Zoom.
              3) Add some meaningful content to the page.

              Here are the spider settings that you can select per start point.

              · Index page and follow internal links: index the content of the specified page and follow any internal links (links to pages beginning with the base URL) found.

              · Index page and follow internal and external links: index the content of the specified page and follow any internal and external links. However, external links are only followed up to one level. For example, an external page linked from an internal page is scanned, but an external page linked from an external page is not.

              · Index single page only: index the content of the specified page and do not follow any of the links found on it.

              · Follow links only: follow the links found on this page but do not index any of the page content (this is probably what you want).

              ---
              David
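
The four spider settings listed above can be summarised as combinations of two decisions: whether the start page itself is indexed, and which links are followed. The Python sketch below is only a way of tabulating that description. The names SpiderMode, MODES, and modes_matching are hypothetical and not part of Zoom, and the external-link value for "Follow links only" is an assumption, since the post does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpiderMode:
    index_page: bool       # is the start page's content added to the index?
    follow_internal: bool  # are links under the base URL followed?
    follow_external: bool  # are external links followed (one level only)?


# Keys are the setting names as given in the post above.
MODES = {
    "Index page and follow internal links": SpiderMode(True, True, False),
    "Index page and follow internal and external links": SpiderMode(True, True, True),
    "Index single page only": SpiderMode(True, False, False),
    # Assumption: external links are not followed in this mode.
    "Follow links only": SpiderMode(False, True, False),
}


def modes_matching(index_page, follow_links):
    """List the settings with the requested indexing and link-following behaviour."""
    return [
        name
        for name, mode in MODES.items()
        if mode.index_page == index_page and mode.follow_internal == follow_links
    ]


# The original question: follow the links, but keep the page out of the results.
wanted = modes_matching(index_page=False, follow_links=True)
```

Under these assumptions, only "Follow links only" satisfies the request to follow links without indexing the page.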


              • #8
                David:

                Thanks for your help!

                Sorry to keep going in circles on this, but nothing seems to work.

                Unfortunately, I think what I (and a few other posters) really need is this:

                A 5th Additional Spider URL setting of: "Follow Links Only;Don't include page in index"

                or

                Tags (spider would follow links, but not include page in index)

                or

                Something like a skiplist filter to exclude URLs based on some pattern after indexing. Similar to the existing skiplist, but with options for excluding URLs "pre" or "post" indexing for each pattern.

                Maybe these ideas are worth considering in the next version.

                I guess my only current solution is to edit the .zdat files directly to delete these unwanted page links.

                -Jeff
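
To make the third suggestion concrete, here is a small Python sketch of what a pattern-based "post-indexing" URL filter could look like. It only illustrates the requested behaviour and is not an existing Zoom feature; the function split_by_skiplist, the wildcard pattern syntax, and the example.com URLs are all hypothetical.

```python
from fnmatch import fnmatch


def split_by_skiplist(urls, patterns):
    """Split indexed URLs into (kept, dropped) using shell-style wildcard patterns."""
    kept, dropped = [], []
    for url in urls:
        # Compare case-insensitively so the result is the same on every platform.
        if any(fnmatch(url.lower(), pattern.lower()) for pattern in patterns):
            dropped.append(url)
        else:
            kept.append(url)
    return kept, dropped


# Hypothetical crawl result: one script-generated listing page and one document.
indexed_urls = [
    "http://example.com/docs/browse.asp?dir=letters",
    "http://example.com/docs/letters/a.doc",
]
kept, dropped = split_by_skiplist(indexed_urls, ["*browse.asp*"])
```

Here the listing page generated by the script is dropped from the results while the document it links to is kept.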


                • #9
                  "but nothing seems to work"
                  Did you turn off the indexing of file names in Zoom from the "Indexing options" tab?

                  Did you even try using the 'Follow links only' option before saying it won't work?

                  "I think what I (and a few other posters) really need is..."
                  We are not aware of any other user for whom the suggested solutions were not suitable.

                  Editing the .zdat files directly will often cause unexpected and random crashes and incorrect search results.

                  ---
                  David


                  • #10
                    David: thanks again for your help.

                    I switched to offline mode, and all my problems disappeared.
                    I am indexing documents only, so following links on web pages wasn't necessary after all.

                    Overall, your product is simply amazing!

                    Thanks,

                    Jeff
