Two questions from a newbie: <!-- in .php pages and ALT text indexing


  • Two questions from a newbie: <!-- in .php pages and ALT text indexing

    Two (possibly stupid) questions from this new user:

    1. I have ticked "ALT text" in the indexing options, but I don't get any of my ALT descriptions when I do a search. What might I be doing wrong?

    2. I want to use <!--ZOOMSTOPFOLLOW--> and <!--ZOOMRESTARTFOLLOW--> but in a .php file rather than a .html file. The Apache (v1.3.27) server seems to be stripping these comments out so that they don't get to the browser. Any idea how I configure it to let them through?

    Rob

  • #2
    1.) Check the following:

    i) Is your ALT text within a <!--ZOOMSTOP--> and <!--ZOOMRESTART--> section of the page?
    ii) Have you re-indexed after turning on the ALT text option, and also re-uploaded the search files?
    iii) Did you add the ALT text after your first index and then try to reindex immediately? If so, turn on the "Reload all pages (do not use cache)" option on the General tab of the configuration window to avoid indexing a cached copy of the page.
    iv) Is your ALT text specified correctly in HTML?

    2.) It makes no difference whether those tags are used in PHP or HTML files, and we are not aware of any default Apache setting that strips out HTML comments. A quick web search turns up some optional Apache modules that offer such functionality, so I would recommend consulting your web host or server administrator if you are certain this is happening.
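
One way to confirm whether the comments really are missing from what the server sends is to look at the delivered HTML rather than the .php source. The following is only a minimal sketch: the helper names `zoom_markers_present` and `fetch` are made up for illustration, and the check is a plain exact-string match on the four marker names mentioned in this thread.

```python
from urllib.request import urlopen

# Marker names as they appear in this thread.
ZOOM_MARKERS = ("ZOOMSTOP", "ZOOMRESTART", "ZOOMSTOPFOLLOW", "ZOOMRESTARTFOLLOW")


def zoom_markers_present(html: str) -> dict:
    """Report, for each marker, whether it occurs as an exact HTML comment."""
    return {name: f"<!--{name}-->" in html for name in ZOOM_MARKERS}


def fetch(url: str) -> str:
    """Download a page as the server delivers it (that is, after PHP has run)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Offline example; replace the string with fetch("<your page URL>") to test a live page.
    served = "<p>intro</p><!--ZOOMSTOPFOLLOW--><a href='x.html'>x</a><!--ZOOMRESTARTFOLLOW-->"
    print(zoom_markers_present(served))
```

If a marker is reported as missing for the live page but is present in the source file, something between the file and the browser is removing it, which is the point at which to involve the web host.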
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      Thanks for the quick reply.

      > i) Is your ALT text within a <!--ZOOMSTOP--> and <!--ZOOMRESTART--> section of the page?

      No.

      > ii) Have you re-indexed after turning on the ALT text option, and also re-uploaded the search files?

      Yes.

      > iii) Did you add the ALT text after your first index, and tried to immediately reindex (if so,
      > turn on the "Reload all pages (do not use cache)" option on the General tab of the
      > configuration window to avoid indexing a cached copy of the page).

      No, it was there from the outset.

      > iv) Is your ALT text specified correctly in HTML?

      Yes: as an example, see http://www.meades.org/moths/moths_13..._13-07-03.html and look at the entry for Scalloped Oak:

      <a href="scalloped_oak_0.jpg" name="Scalloped_Oak"><img border="1" src="scalloped_oak_0.jpg" width="254" height="179" hspace="2" vspace="2" alt="Scalloped Oak"></a>

      When I search for "Scalloped Oak" (using the search box here:
      http://www.meades.org/moths/moths.html) it doesn't list that page.

      What steps do you think I could take to debug the problem?

      > 2.) There is no difference to using those tags in PHP or HTML files ... I would recommend consulting your
      > web host or server administrator if you are certain this is happening.

      Will do.

      Rob



      • #4
        OK, I see why it's not showing up for you now.

        First of all, the main reason is that ALT text is not actually indexed for the HTML page that the text resides on. Currently, Zoom is designed to only use ALT text for the image files indexed. That is, it will only associate the ALT text used to describe an image with the actual JPG file (not the HTML page where the image appears), and only if that JPG file is indexed.

        Since you do not have .JPG files enabled for indexing, this ALT text is not currently indexed at all.

        We realize that this method may not suit everyone, as some people would like the ALT text to be associated with the page instead of the image file. However, there is no easy change to achieve this; it would require more serious data restructuring.

        The ALT text feature was added primarily to retrieve searchable text data for image files, as many image files lack meta data to search on. We can see the potential for confusion though, and will try to update our documentation to make this clearer.

        Having said that, we went ahead and tried to index the page you pointed us to, with ".jpg" enabled for indexing, to see if the ALT text would be indexed (as we would expect). Unfortunately, it still was not, and it took us another second to realize why.

        This following bit is for your information, should you wish to proceed with enabling image indexing (or for anyone else who might be doing image indexing and coming across this issue). This is the HTML for the link to your image file:

        Originally posted by Rob Meades
        <a href="scalloped_oak_0.jpg" name="Scalloped_Oak"><img border="1" src="scalloped_oak_0.jpg" width="254" height="179" hspace="2" vspace="2" alt="Scalloped Oak"></a>
        Note that what you actually have here is a slightly less common scenario, where you are using the exact, full-sized image file as a "thumbnail" (as opposed to a proper thumbnail which is a smaller version of the file to avoid slow downloads), and you are using it as a link which points to the very same file (which the browser will then display in its full size). This is typically not recommended as a web design rule, as the whole point of having thumbnails is to allow many images to load on one page quickly and allow the visitor to select which image they want to download at full size. Instead, your visitors are downloading all the images at full size on your page, and then the browser only (badly) resizes them for displaying purposes when it renders the web page.

        But anyway, the problem here is that Zoom will find the first link (the '<a href="scalloped_oak_0.jpg" name="Scalloped_Oak">' part) and add that to its queue for indexing. Naturally, as an anchor link, it has no ALT text or link text to be used for indexing the JPG file.

        It will then find the '<img border="1" src="scalloped_oak_0.jpg" width="254" height="179" hspace="2" vspace="2" alt="Scalloped Oak">' tag, which contains ALT text, but at this point, the first anchor tag is already queued, and it decides that this is a link to a file we've already processed. This ALT text is then lost.

        This same issue can occur when an image appears on multiple pages of your website, with ALT text on some pages and without it on others. Zoom will only use the ALT information (or the lack thereof) from the first link that it comes across.
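
The "first link wins" behaviour described above can be illustrated with a small toy model. To be clear, this is not Zoom's actual code, just a sketch written for this thread: the class name `FirstSeenWins` and the `analyse` helper are hypothetical, only `.jpg` targets are considered, and link text is ignored for simplicity.

```python
from html.parser import HTMLParser


class FirstSeenWins(HTMLParser):
    """Toy model of the queueing described above; not Zoom's real code."""

    def __init__(self):
        super().__init__()
        self.queued = {}  # file name -> ALT text recorded when first queued (or None)
        self.lost = []    # (file name, ALT text) pairs that arrived after queueing

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            target, alt = attrs.get("href"), None  # an anchor carries no ALT text
        elif tag == "img":
            target, alt = attrs.get("src"), attrs.get("alt")
        else:
            return
        if not target or not target.lower().endswith(".jpg"):
            return
        if target not in self.queued:
            self.queued[target] = alt              # first reference wins
        elif alt and not self.queued[target]:
            self.lost.append((target, alt))        # later ALT text is discarded


def analyse(html: str) -> FirstSeenWins:
    """Feed a page through the toy model and return the populated parser."""
    parser = FirstSeenWins()
    parser.feed(html)
    return parser


SNIPPET = (
    '<a href="scalloped_oak_0.jpg" name="Scalloped_Oak">'
    '<img border="1" src="scalloped_oak_0.jpg" width="254" height="179" '
    'hspace="2" vspace="2" alt="Scalloped Oak"></a>'
)

if __name__ == "__main__":
    result = analyse(SNIPPET)
    print("queued:", result.queued)
    print("lost:  ", result.lost)
```

Run against the markup quoted above, the file is queued by the anchor with no description, and the ALT text from the `<img>` tag ends up in the `lost` list. The same model can be pointed at your own pages to list images whose ALT text would be dropped in this way.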

        Hope that clears things up for you or anyone else having similar issues. It is an area we plan to look into updating in the future (maybe V6?).
        --Ray



        • #5
          I ended up with the construction you highlight because that's what Microsoft Frontpage does if you fill in the "Picture source" and "Default hyperlink" with the same image. This allows me to fit all the pictures on a smaller page (with re-sizing), allows a subsequent click to display the full picture and prevents me having to store thumbnails for everything.

          Looks like this is going to be a problem: I'd specifically looked for ALT text support in the tool as that's the way I name the moths in each picture. Without it the names of the moths aren't going to be picked-up in a search. Are there no other workarounds? When might v6 be due?

          Rob



          • #6
            Originally posted by Rob Meades
            I ended up with the construction you highlight because that's what Microsoft Frontpage does if you fill in the "Picture source" and "Default hyperlink" with the same image.
            Yes, and this is not a recommended method in web design, which is why Frontpage offers you two different fields: it's generally better to have a smaller thumbnail image and a link to the full-size file in such situations.

            Originally posted by Rob Meades
            This allows me to fit all the pictures on a smaller page (with re-sizing), allows a subsequent click to display the full picture and prevents me having to store thumbnails for everything.
            Yes, and the downside to this approach is that all your images are downloaded at full size just by visiting the page they appear on, which makes your pages very slow to load. You may not notice this when viewing the site from your own computer, since the images are already cached. But this page, for example: http://www.meades.org/moths/moths_13..._13-07-03.html took half a minute to load on our side.

            Originally posted by Rob Meades
            Looks like this is going to be a problem: I'd specifically looked for ALT text support in the tool as that's the way I name the moths in each picture. Without it the names of the moths aren't going to be picked-up in a search. Are there no other workarounds? When might v6 be due?
            The solution we would recommend is to change your web pages to use proper thumbnails, and enable image indexing (you will need to add ".jpg" to your scan list, and you will need the Image plugin installed).

            Other solutions would also involve you changing your website, for example, putting each image on a page of its own with a title or description. Or adding text captions next to your images.
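
For hand-built pages, the text-caption option could be partly automated. The following is a rough sketch only, assuming the ALT attributes are written with double quotes as in the markup quoted earlier in this thread; the function name `add_captions` is made up for illustration, and the caption is simply placed after a `<br>` directly following the `<img>` tag.

```python
import re

# Matches an <img> tag that has a double-quoted alt attribute.
IMG_WITH_ALT = re.compile(r'<img\b[^>]*\balt="([^"]*)"[^>]*>', re.IGNORECASE)


def add_captions(page: str) -> str:
    """Append each image's ALT text as visible text right after its <img> tag."""

    def caption(match):
        alt = match.group(1)
        if not alt.strip():
            return match.group(0)  # empty ALT text: leave the tag unchanged
        # The attribute value is already HTML-escaped, so it can be reused as-is.
        return f"{match.group(0)}<br>{alt}"

    return IMG_WITH_ALT.sub(caption, page)


if __name__ == "__main__":
    before = '<a href="scalloped_oak_0.jpg"><img src="scalloped_oak_0.jpg" alt="Scalloped Oak"></a>'
    print(add_captions(before))
```

Note that running it twice on the same file adds the caption twice, so it is best applied to a copy of the original pages. Where the image sits inside a link, as in the example, the caption also becomes part of the link text.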

            Another possibility is adding meta data to each of your image files and using image indexing. See this FAQ for more details on our image indexing features:
            http://www.wrensoft.com/zoom/support...ins_image.html

            There is no date for V6 at present.
            --Ray



            • #7
              Since most of the images are small in any case, the additional download time is a penalty I'm prepared to bear. To get exactly the layout I want, I generate these pages by hand, so I don't relish the prospect of re-editing them to use thumbnails to fit the tool's behaviour. My previous search engine did ALT indexing in the way I wanted, but I'd hit a 250-page limit and hence went for your product.

              Looks like I'll have to re-edit the pages with captions or something similar if I'm going to stick with Zoom. No timescale or spec for v6?

              Rob



              • #8
                Nothing concrete for V6 yet, no. V5 is still our focus at the moment.

                We are considering a change to allow ALT text to be indexed for the HTML page it is on, rather than the image file it is associated with, when images are not enabled for indexing. This might be something we can put into an upcoming build or a future minor release. I'll update when I have more information.
                --Ray



                • #9
                  Many thanks - let me know if you want me to try anything.

                  Rob
