Handling iframes


  • Handling iframes

    Zoom v5.1 (Build 1017)
    NetObjectsFusion v10
    CushyCMS

    I have a client site with a working Zoom search that was based on a flat structure. I have just converted the site so that the client can edit the content via CushyCMS, and to do this the content is pulled in via iframes on every page. These content pages are held in a different folder from the main site folder, e.g.:

    htdocs/html (for main site)
    htdocs/html-client (for client edited content)

    Two questions:
    1. If I run Zoom against the new site, will the results and their links take the visitor to /html-client, thus losing the surrounding page and navigation?

    2. If this is the case, how can I modify the Zoom parameters to point to the correct pages?

    E.g. the content searched will be found in /html-client/news.html, but I need the search results to link to /html/news.html.

    Or don't I need to worry, as Zoom is so brilliant?

    Paul
    Last edited by Adendum; Feb-12-2010, 04:55 PM. Reason: Typo

  • #2
    Hard to know for sure without looking at the site (or just trying it out).

    When Zoom (in Spider Mode only) indexes a page within an IFRAME, it will index the content within the IFRAME, but it will point the link to the page containing the IFRAME.
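To illustrate the behaviour described above, here is a minimal Python sketch of a spider that reads the content of a framed file but records the containing page as the link target. It is not Zoom's implementation: the `IframeFinder` class, the `attribute_iframe_content` function, the `fetch` callable, and the example.com URLs are all hypothetical names chosen for the illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeFinder(HTMLParser):
    """Collects the src attribute of every <iframe> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def attribute_iframe_content(page_url, page_html, fetch):
    """Return (result_url, framed_url, framed_html) for each iframe.

    `fetch` is a hypothetical callable returning the HTML for a URL.
    result_url is always the containing page, never the framed file.
    """
    finder = IframeFinder()
    finder.feed(page_html)
    results = []
    for src in finder.sources:
        framed_url = urljoin(page_url, src)  # resolve relative src values
        results.append((page_url, framed_url, fetch(framed_url)))
    return results


# Toy in-memory "site" standing in for real HTTP requests (assumed layout).
SITE = {
    "http://example.com/html/news.html":
        '<html><body><iframe src="../html-client/news.html"></iframe></body></html>',
    "http://example.com/html-client/news.html":
        "<html><body><p>Latest news text</p></body></html>",
}

entries = attribute_iframe_content(
    "http://example.com/html/news.html",
    SITE["http://example.com/html/news.html"],
    SITE.__getitem__,
)
```

In this sketch the text comes from /html-client/news.html, while the first element of each tuple, the URL a result would link to, stays /html/news.html.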



    • #3
      Hi,

      The site in question is aaf-eu.org. At the moment it still has the old Zoom files in place (most of that works but the PDFs have moved so links to those all fail). Thus the need to re-index as soon as possible.

      I have overcome the problem of the results page by adding some JavaScript to the client pages that forces the originating page to be shown. So in theory that shouldn't be a problem now.
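The script itself was not posted. Whatever form it takes, it needs a mapping from a content URL to the page that frames it. The sketch below shows that mapping in Python purely for illustration (the real redirect would run as JavaScript in the browser). The folder names come from the first post. The function name `wrapper_url` and the assumption that each content file has a wrapper page with the same filename are mine.

```python
from urllib.parse import urlsplit, urlunsplit

CONTENT_DIR = "/html-client/"  # folder holding the client-edited content
WRAPPER_DIR = "/html/"         # folder holding the full pages with navigation


def wrapper_url(content_url):
    """Return the full-page URL for a content URL, or None if it is not one.

    Assumes a one-to-one filename match between the two folders.
    """
    parts = urlsplit(content_url)
    if not parts.path.startswith(CONTENT_DIR):
        return None
    new_path = WRAPPER_DIR + parts.path[len(CONTENT_DIR):]
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )
```

For example, `wrapper_url("http://example.com/html-client/news.html")` gives the /html/news.html address on the same host, and a URL already under /html/ gives `None`.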

      Just need to get Zoom back up and running

      Here is the last spider log:-

      15:45:36 - Start indexing (spider mode) at Sat Feb 13 15:45:36 2010
      15:45:36 - Maximum number of words: 50000
      15:45:36 - Maximum number of files: 1000
      15:45:36 - Will scan files with extensions
      15:45:36 - .htm
      15:45:36 - .html
      15:45:36 - .cfm
      15:45:36 - .txt
      15:45:36 - .pdf
      15:45:36 - Spider from: http://www.aac-eu.org/index.html
      15:45:36 - Web site URL: http://www.aac-eu.org/
      15:45:36 - Estimated RAM required during index process: 30881 KB
      15:45:36 - Initiating HTTP session (thread #1) ...
      15:45:36 - [DOWNLOAD] Downloading file http://www.aac-eu.org/index.html
      15:45:36 - Initiating HTTP session (thread #2) ...
      15:45:36 - [INDEXED] Indexing http://www.aac-eu.org/index.html
      15:45:36 - [DOWNLOAD] Downloading file http://www.aac-eu.org/html/the_aaf.html
      15:45:36 - [DOWNLOAD] Downloading file http://www.aac-eu.org/html/contact.html
      15:45:36 - [DOWNLOAD] Downloading file http://www.aac-eu.org/html/viewpoint.html
      15:45:36 - [DOWNLOAD] Downloading file http://www.aac-eu.org/html/markets.html
      15:45:36 - [INDEXED] Indexing http://www.aac-eu.org/html/the_aaf.html
      15:45:36 - [INDEXED] Indexing http://www.aac-eu.org/html/contact.html
      15:45:36 - [INDEXED] Indexing http://www.aac-eu.org/html/viewpoint.html
      15:45:36 - [DOWNLOAD] Downloading file http://www.aac-eu.org/html/starch.html
      15:45:36 - [DOWNLOAD] Downloading file http://www.aac-eu.org/html/aaf_news.html
      15:45:36 - [INDEXED] Indexing http://www.aac-eu.org/html/starch.html
      15:45:36 - [INDEXED] Indexing http://www.aac-eu.org/html/aaf_news.html
      15:45:36 - [INDEXED] Indexing http://www.aac-eu.org/html/markets.html
      15:45:36 - [DOWNLOAD] Downloading file http://www.aac-eu.org/html/food.html
      15:45:36 - [DOWNLOAD] Downloading file http://www.aac-eu.org/html/non-food.html
      15:45:36 - [DOWNLOAD] Downloading file http://www.aac-eu.org/html/feed.html
      15:45:36 - [INDEXED] Indexing http://www.aac-eu.org/html/food.html
      15:45:36 - [INDEXED] Indexing http://www.aac-eu.org/html/non-food.html
      15:45:36 - [INDEXED] Indexing http://www.aac-eu.org/html/feed.html
      15:45:36 - [FILEIO] All index files will be written to: C:\Users\Paul\Documents\My Clients\Web Sites\Fusion10\AAF 2009\ZoomSearch
      15:45:36 - [FILEIO] Writing index data for PHP search... (Please wait)
      15:45:36 - [FILEIO] Created pagedata data file (zoom_pagedata.zdat)
      15:45:36 - [FILEIO] Created pagetext data file (zoom_pagetext.zdat)
      15:45:36 - [FILEIO] Created pageinfo data file (zoom_pageinfo.zdat)
      15:45:36 - [FILEIO] Created spelling data file (zoom_spelling.zdat)
      15:45:36 - [FILEIO] Created dictionary data file (zoom_dictionary.zdat)
      15:45:36 - [FILEIO] Created wordmap data file (zoom_wordmap.zdat)
      15:45:36 - [FILEIO] Created script settings file (settings.php)
      15:45:36 - Indexing completed at Sat Feb 13 15:45:36 2010
      15:45:36 - INDEX SUMMARY
      15:45:36 - Files indexed: 10
      15:45:36 - Files skipped: 205
      15:45:36 - Files filtered: 0
      15:45:36 - Files downloaded: 10
      15:45:36 - Unique words found: 300
      15:45:36 - Total words found: 460
      15:45:36 - Avg. unique words per page: 30.00
      15:45:36 - Avg. words per page: 46
      15:45:36 - Start index time: 15:45:36 (2010/02/13)
      15:45:36 - Elapsed index time: 00:00:00
      15:45:36 - Errors: 0
      15:45:36 - URLs visited by spider: 10
      15:45:36 - URLs in spider queue: 0
      15:45:36 - Total bytes scanned/downloaded: 136365
      15:45:36 - File extensions:
      15:45:36 - .htm indexed: 0
      15:45:36 - .html indexed: 10
      15:45:36 - .cfm indexed: 0
      15:45:36 - .txt indexed: 0
      15:45:36 - .pdf indexed: 0
      15:45:36 - No extensions indexed: 0
      15:45:36 - Cleaning up memory used for index data... please wait.
      15:45:36 - Finished cleaning up memory.
      15:45:36 - [FILEIO] Created text sitemap file (urllist.txt)
      15:45:36 - [FILEIO] Created XML sitemap index file (sitemap_index.xml)
      15:45:36 - [FILEIO] Created XML sitemap file (sitemap.xml)
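A log like the one above can be checked quickly by pulling out the integer counters from the summary lines. This is a minimal Python sketch that assumes the "HH:MM:SS - Name: value" layout shown in this log; the regular expression and the `parse_summary` function are hypothetical and not part of Zoom.

```python
import re

# Matches summary lines such as "15:45:36 - Files skipped: 205".
# Lines whose value is not a plain integer (times, averages) are ignored.
SUMMARY_LINE = re.compile(r"^\d{2}:\d{2}:\d{2} - ([^:\[\]]+): (\d+)$")


def parse_summary(log_text):
    """Return a dict of integer counters found in a spider log."""
    counters = {}
    for line in log_text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            counters[match.group(1).strip()] = int(match.group(2))
    return counters


# A few lines copied from the log above.
sample = """\
15:45:36 - Files indexed: 10
15:45:36 - Files skipped: 205
15:45:36 - Files downloaded: 10
15:45:36 - Errors: 0
15:45:36 - Avg. unique words per page: 30.00
"""

summary = parse_summary(sample)
if summary["Files skipped"] > summary["Files indexed"]:
    print("More files were skipped than indexed; check the spider options.")
```

Here 205 files were skipped against 10 indexed, which is the kind of imbalance worth investigating before assuming the index is complete.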


      Paul

      P.S. I don't think the Instant Email Notification is working. Didn't get an email - at least not yet anyway.
      Last edited by Adendum; Feb-13-2010, 03:48 PM. Reason: Added spider log



      • #4
        I think I have found the answer!!! Doh!

        I just realised there is an extra option in the configuration that I never knew was there - "Index page and follow all links"...I was using the default setting.

        The latest spider found the PDF files so it must be working!!!

        I'm a happy bunny now



        • #5
          "I think I have found the answer"
          OK.
          I don't know what question the answer answers, but if it is now working all is good.



          • #6
            Hi!

            I am not sure this topic is still relevant to the latest version of ZSE, but I am also struggling with ZSE not being able to index the content inside an iframe (which is a normal .html file). I can't find the option Adendum is referring to either. Please give me a hint on whether I should provide more info, or whether this option exists elsewhere and could be of help to me.

            Thanks!



            • #7
              When Zoom (in Spider Mode only) indexes a page within an IFRAME, it will index the content within the IFRAME.

              If it isn't working as expected on your site, then you should first check the log file for errors. For example, the IFRAME may point to a file outside of your domain name while you are only indexing files in your domain (i.e. you don't want to let the spider escape and index the entire internet).
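The domain restriction mentioned above can be pictured as a same-host test applied to each URL the spider finds, including iframe targets. The Python sketch below is only an illustration of that idea. Zoom's actual rule may differ (for example, it may compare against a base URL prefix rather than the host), and the function name `in_scope` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def in_scope(url, base_url):
    """True if `url` is an absolute URL on the same host as `base_url`.

    urlsplit lower-cases the hostname, so the comparison is case-insensitive.
    Relative URLs have no hostname and must be resolved before this check.
    """
    url_host = urlsplit(url).hostname
    base_host = urlsplit(base_url).hostname
    return url_host is not None and url_host == base_host
```

With a base of `http://www.example.com/`, an iframe pointing at `http://www.example.com/html-client/news.html` would be followed, while one pointing at `http://cdn.example.net/news.html` would be skipped and should show up as skipped in the log.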

              The indexing option referred to above was probably this one.
              [Image: zoom-indexing-options.png]
