ZOOMSTOP Tag Question & other questions

  • ZOOMSTOP Tag Question & other questions

    Urgent

    Hi there, I am using ASP.NET to build a page that contains many web controls. I want to prevent my menu from being indexed, but the links on that page must still be crawled. So I put the <!--ZOOMSTOP--> tag on the first line of the page and <!--ZOOMRESTART--> at the end. I want this page not to appear in the search results, but the links inside it must be crawled no matter what.

    But the page (default.aspx) still appears in the search results, even though I used the tags to stop the whole page. Does anyone know why? Can ASP.NET markup (the <%@ directive) interfere with ZOOMSTOP? Or are there rules for using the ZOOMSTOP tag (I only know that it must be closed with a ZOOMRESTART tag)?

    Thanks

  • #2
    The stop and restart tags need to appear in that order and in pairs. As long as they do, they should always work, regardless of the other tags on the page.
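
A quick way to verify the "in that order and in pairs" rule on a saved copy of a page is a small script like the one below. This is only a sketch of a stand-alone helper, not part of Zoom: the function name `check_zoom_tags` is made up for illustration, and it assumes the tags are written exactly as `<!--ZOOMSTOP-->` and `<!--ZOOMRESTART-->`.

```python
import re

# Assumption: tags are written exactly like this, in upper case, with no spaces.
TAG_RE = re.compile(r"<!--(ZOOMSTOP|ZOOMRESTART)-->")


def check_zoom_tags(html):
    """Return a list of problems; an empty list means the tags are ordered and paired."""
    problems = []
    stopped = False  # True while we are between a ZOOMSTOP and its ZOOMRESTART
    for match in TAG_RE.finditer(html):
        line = html.count("\n", 0, match.start()) + 1
        if match.group(1) == "ZOOMSTOP":
            if stopped:
                problems.append(f"line {line}: ZOOMSTOP while already stopped")
            stopped = True
        else:
            if not stopped:
                problems.append(f"line {line}: ZOOMRESTART without a preceding ZOOMSTOP")
            stopped = False
    if stopped:
        problems.append("ZOOMSTOP is never closed by ZOOMRESTART")
    return problems


if __name__ == "__main__":
    page = "<body>\n<!--ZOOMSTOP-->\nmenu\n<!--ZOOMRESTART-->\n</body>"
    print(check_zoom_tags(page))
```

An empty list means every ZOOMSTOP is followed by exactly one ZOOMRESTART before the next ZOOMSTOP.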

    What is the URL to the page in question and for what search word does the page in question appear in the search results?


    • #3
      For example, I have these pages:

      BrowseBookBySubject.aspx
      BookDetail.aspx?bookID=001

      I cannot put BrowseBookBySubject in the skip options (in Zoom), because then BookDetail would not be crawled: the only link to BookDetail from the home page goes via BrowseBookBySubject.

      The problem is that every time I search for a book title, BrowseBookBySubject appears in the search results. I don't want the browse page (BrowseBookBySubject) to appear there; I only need BookDetail.aspx?id=001 (etc).

      I have put the ZoomStop tag on the first line of BrowseBookBySubject and the ZoomRestart tag on the last line, so I expect that no words in the file will be indexed and only the links in the file will be processed. But when I search for a book title (which appears in the Title tag of BookDetail), I still see BrowseBookBySubject in the search results. Am I correct in assuming that if I enclose the whole file in ZoomStop and ZoomRestart tags, no word in the file will be indexed, and so the file will not appear in the search results?


      • #4
        I think the most likely explanations are,
        1) You didn't use the tags correctly. What is the URL to the page?
        2) The old page is being cached. You can turn off caching from the "General" tab in Zoom.
        3) You didn't upload a new set of index files after re-indexing the site with the tags added.

        Am I correct in assuming that if I enclose the whole file in ZoomStop and ZoomRestart tags, no word in the file will be indexed?
        Almost correct. You might get one or two words indexed, e.g. the name of the file (BrowseBookBySubject) and maybe some link text from other pages that link to the BrowseBookBySubject page. But not the content of the page.


        • #5
          @wrensoft
          We have already checked numbers 1 and 3. I think we need to try reloading all files so that no cache is used. I will try that later.

          thank you


          • #6
            Hi, I have solved the ZOOMSTOP and ZOOMRESTART problem: it works correctly now that I have checked the "Reload All Files (do not use cache)" option in the General tab.

            But something very strange is happening now. The title of my page is not displayed in the search results; it shows "No Title", although the actual page has a title when I click through to it. Is there anything I missed?

            FYI, the page name is detailbook.aspx?idbook=001, and the title of the page changes with the idbook, depending on which book the user is viewing. Could this be why "No Title" is displayed in the search results? With the earlier setting, before I checked Reload All Files, the title of each book was displayed correctly.

            Help me


            • #7
              Check the HTML. Perhaps your <title> tag is being specified outside of the <head>... </head> ? If so, this is invalid HTML and is not picked up by Zoom. I think IE might pick it up, though this is non-standard behaviour. If you change the HTML so that the <title> is within the head section of the page, it should work fine.
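
If it helps, the check can be done on the generated HTML (view source, save to a file) with a few lines of Python. This is only a sketch using the standard library's `html.parser`; the names `TitleLocator` and `title_is_in_head` are made up for illustration, and it says nothing about how Zoom itself parses the page.

```python
from html.parser import HTMLParser


class TitleLocator(HTMLParser):
    """Records whether the first <title> tag is opened inside <head>...</head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.title_in_head = None  # None means no <title> was seen at all

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lower case, so <HEAD> also matches.
        if tag == "head":
            self.in_head = True
        elif tag == "title" and self.title_in_head is None:
            self.title_in_head = self.in_head

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def title_is_in_head(html):
    """Return True/False for the first <title>, or None if the page has no <title>."""
    locator = TitleLocator()
    locator.feed(html)
    locator.close()
    return locator.title_in_head


if __name__ == "__main__":
    print(title_is_in_head("<HTML><HEAD><title>A book</title></HEAD><body></body></HTML>"))
```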
              --Ray
              Wrensoft Web Software
              Sydney, Australia
              Zoom Search Engine


              • #8
                Code:
                <!--ZOOMSTOP-->
                
                <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >
                <HTML>
                	<HEAD>
                		<title>Modelling and forecasting the diffusion of innovation – A 25-year review  </title>
                Above is an example of the output. The title tag is inside head, and head is inside html.

                FYI, I set the title in the ASP.NET PreRender method (page pre-render): it looks up the idbook in the database and puts the book title inside the title tag, for which I use an <asp:literal> tag.

                Can anyone help?


                • #9
                  Your ZOOMSTOP tag prevents the <title> from being picked up. ZOOMSTOP is used to exclude any part of the page from being indexed, and in this instance, you are excluding the header and the title as well.
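
The effect can be illustrated with a toy model. The sketch below is not how Zoom works internally; it simply drops everything between the two tags, as described above, and then looks for a title in what is left. The function names and the sample pages are made up for illustration.

```python
import re


def visible_to_indexer(html):
    """Toy model: remove everything between ZOOMSTOP and ZOOMRESTART."""
    return re.sub(r"<!--ZOOMSTOP-->.*?<!--ZOOMRESTART-->", "", html, flags=re.DOTALL)


def find_title(html):
    """Return the <title> text that survives the exclusion, or "No Title"."""
    match = re.search(
        r"<title>(.*?)</title>",
        visible_to_indexer(html),
        flags=re.DOTALL | re.IGNORECASE,
    )
    return match.group(1).strip() if match else "No Title"


# ZOOMSTOP on the first line: the head and the title are excluded too.
whole_page = (
    "<!--ZOOMSTOP-->\n"
    "<HTML><HEAD><title>Book A</title></HEAD><body>menu</body></HTML>\n"
    "<!--ZOOMRESTART-->"
)

# ZOOMSTOP placed after the head: only the body content is excluded.
body_only = (
    "<HTML><HEAD><title>Book A</title></HEAD>"
    "<body><!--ZOOMSTOP-->menu<!--ZOOMRESTART--></body></HTML>"
)

if __name__ == "__main__":
    print(find_title(whole_page))
    print(find_title(body_only))
```

In this model, moving the ZOOMSTOP tag so that it starts after the closing head tag keeps the title while still excluding the page content.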
                  --Ray
                  Wrensoft Web Software
                  Sydney, Australia
                  Zoom Search Engine


                  • #10
                    OK Thanks,
                    Next Question

                    1. I entered two sites (http://localhost/library/books and http://localhost/library/journals) in the Advance spider URL Options setting (the More button in spider mode). But when I search, the results contain only links from the first site. I even searched for a keyword that appears only in journals (the second site), and the result was empty. So I guess the crawl stopped while it was still in the first site. Is there anything I missed and should look at, such as limits?

                    2. Assuming indexing stopped at the first site because it reached the maximum in the Limits configuration, can I change this value and then use Incremental Update? Or must I re-index the whole site after changing any Limits setting?

                    3. For example, say my first indexing run stopped at 5000 files and the Limits configuration allows 1000000, and I use an incremental update for the next run. Does the count for the next crawl start from 0 again or from 5000? In other words, does the limit apply to each indexing run separately, or to the total including previously indexed results?

                    Thanks
                    Last edited by innosia; Feb-20-2008, 08:24 AM.


                    • #11
                      1.) Check your index log. That's the yellow/green/red messages that appear as you are indexing. You can save it to file if it's too long to browse through.

                      If indexing did stop during the first start point, you will be given a reason and error as to why in the index log (e.g. the maximum page limit was already exceeded by the first start point and it never got to the second).

                      2.) You should re-index your whole site. Since you are indexing localhost (so traffic and download is not a concern), there's little benefit in incremental indexing anyway.

                      3.) No, the limits apply to the total data stored in the index. It does not reset per incremental update.
                      --Ray
                      Wrensoft Web Software
                      Sydney, Australia
                      Zoom Search Engine


                      • #12
                        Another question:
                        1. I started an indexing run, but it took too long, so I stopped it, and the zdat file appeared. Can I now use incremental indexing, assuming it will index the files that were not yet indexed in the first run?

                        2. If the answer to number 1 is no, I assume that to use incremental indexing I must first have indexed the whole site without stopping in the middle. So I would have to add start point 1 and index until it finishes, then add start point 2 and do an incremental index, then add start point 3 and do an incremental index again, which is very bothersome. Can I just add all the start points, run a spider index, stop it, and use incremental indexing the next time? My client's site is already live and in use during the day, so I cannot index it then and have to wait until night. The data is quite big, and a whole night is not enough to index it.

                        Thanks


                        • #13
                          Hitting the stop button before the end of indexing will still result in the generation of a valid set of index files. These files can be used for search or can be incrementally added to.


                          • #14
                            However, note that it does not "resume" your previously stopped indexing session.

                            For example, if you use "Incremental Update", new pages will only be added if one or more of the pages already indexed have changed.

                            So let's say you have 10 pages out of 100 indexed, and then you hit Stop. Now if you perform an "Incremental Update", Zoom will check the 10 pages to see if they have changed. If one or more of them has changed, it will re-index those files, and if they have links to pages not yet indexed, it will also follow those links and index more pages.

                            If however, the first 10 pages did not change, then it will not index any new pages at all.
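
The behaviour described above can be written down as a toy model. This is only a sketch of the logic as explained in this post, not Zoom's actual code; the function name and data structures are made up, and it assumes that newly added pages are spidered for further links in the usual way.

```python
def incremental_update(indexed, changed, links):
    """Toy model of an incremental update.

    indexed: set of pages already in the index
    changed: set of pages whose content changed since they were indexed
    links:   dict mapping a page to the list of pages it links to
    Returns the set of pages in the index after the update.
    """
    result = set(indexed)
    # Only already-indexed pages that have changed are re-indexed.
    queue = [page for page in indexed if page in changed]
    while queue:
        page = queue.pop()
        for target in links.get(page, []):
            if target not in result:
                result.add(target)
                # Assumption: a newly indexed page has its links followed too.
                queue.append(target)
    return result


if __name__ == "__main__":
    links = {"page1": ["page11"], "page11": ["page12"]}
    indexed = {"page1", "page2"}
    # Nothing changed: no new pages are picked up.
    print(sorted(incremental_update(indexed, set(), links)))
    # page1 changed: its links are followed and new pages are added.
    print(sorted(incremental_update(indexed, {"page1"}, links)))
```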

                            Originally posted by innosia
                            So I must add start points 1, then index till stop, then I add start points 2, then I do incremental index, then I add start points 3, then I do incremental index again, so it is very bothersome.
                            You can add multiple start points in one go with "Incremental - Add start points to existing index". There is no need to add them one at a time.
                            --Ray
                            Wrensoft Web Software
                            Sydney, Australia
                            Zoom Search Engine
