Indexing a site with pages using different charsets

  • Indexing a site with pages using different charsets

    Hi there,
    I'm trying to index www.agamayoga.com and would like to know whether I can index the pages we have in Hebrew (http://agamayoga.com/centres/israel/)?

    ...and is there a way to get the spider to index the whole site without having to add most of the pages individually? The very nice menu system we have is generated with JavaScript, so no URLs appear in the code of the page.

    Thanks,
    james

    ps. great product !

  • #2
    If your website relies on links produced by a JavaScript navigation menu, and there are no plain hypertext links to the other pages of your website, the spider will not be able to find them.

    JavaScript is executed on the client side by the browser. Using JavaScript, it is possible to create new links as the code executes. Some examples:
    1/ A link might be generated only after the user moves the mouse over a particular area of the screen or enters some data.
    2/ The JavaScript code might build the URL for the link using an algorithm that takes into account factors such as the date and time, the size of the browser window, the browser's security settings, or hundreds of other things.
    3/ A link might be generated by a timer in the JavaScript code only 10 seconds after the page is downloaded.

    Zoom does not execute JavaScript. Even if it did execute JavaScript it would fail on the above examples. It is not possible to predict or simulate the user behaviour with the mouse or data entry. So using only JavaScript to generate links will result in those links being invisible to search engines.
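
    To illustrate the point, here is a minimal Python sketch of what a spider that does not run scripts can see. It is not Zoom's actual code; the class name, the sample page, and the "/courses/" URL are made up for the example. Only the plain hypertext link is found, because the timer-generated link never exists in the downloaded markup.

```python
from html.parser import HTMLParser


class StaticLinkExtractor(HTMLParser):
    """Collects href values from <a> tags present in the raw HTML.

    Hypothetical helper for illustration: like a spider that does not
    execute JavaScript, it only sees markup that was downloaded.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Sample page (an assumption): one fixed link, plus one link that
# JavaScript would only create 10 seconds after the page loads.
PAGE = """
<html><body>
<a href="/centres/israel/">Israel</a>
<script>
  setTimeout(function () {
    var a = document.createElement("a");
    a.href = "/courses/" + new Date().getFullYear() + "/";
    document.body.appendChild(a);
  }, 10000);
</script>
</body></html>
"""


def find_static_links(html):
    """Return the links visible without executing any script."""
    parser = StaticLinkExtractor()
    parser.feed(html)
    return parser.links


print(find_static_links(PAGE))  # ['/centres/israel/']
```

    Adding fixed links to the page's HTML makes them show up in this list, which is why duplicate plain links help spiders.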

    It is also risky HTML coding practice to rely only on JavaScript links, because a lot of people browse with JavaScript turned off. It is better practice to also include duplicate fixed links at the bottom of each page to the major sections of the website. This will keep the search engines (including Zoom) and more users happy. I notice that Google also appears to have a problem indexing parts of your site, probably for the same reason.

    See this page for more details.
    http://www.wrensoft.com/zoom/support...spider_finding

    By contrast, links generated dynamically on the server with CGI, PHP or ASP will always be OK. This is because the code has been fully executed before the page gets to the client.

    Hebrew should be OK. However, make sure you select the right character set in the Zoom configuration window (on the languages tab).

    ----
    David

    • #3
      If you are trying to index a site whose pages use several different charsets, this can be a problem. Basically, you will have to use UTF-8 encoding, because a single search index cannot mix two different charsets. If you think about it, the search results page may have to show results from both the English and the Hebrew sections of the site, and the only way it can do this is if the search page uses UTF-8 rather than any one specific charset.

      Alternatively, you can create two separate search indexes, one for the English section and one for the Hebrew section of the site. This means that you will not be able to search across both sections at the same time, but this is often not necessary considering how vastly different the two languages are.
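
      As a rough illustration of why one results page needs UTF-8, the Python sketch below tries to encode a string containing both an accented Latin character and Hebrew letters. The helper name and the sample text are assumptions made for the example, and Latin-1 and Windows-1255 are only stand-ins for whatever charsets the two sections actually use.

```python
def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# "café" followed by the Hebrew word "shalom", written with escapes
# so the source itself stays plain ASCII.
mixed = "caf\u00e9 \u05e9\u05dc\u05d5\u05dd"

for charset in ("latin-1", "cp1255", "utf-8"):
    print(charset, can_encode(mixed, charset))
# latin-1 False  (no Hebrew letters)
# cp1255 False   (no "é")
# utf-8 True
```

      A page that mixes results from both sections runs into the same limits, whereas each section on its own fits its own charset, which is why two separate indexes also work.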
      --Ray
      Wrensoft Web Software
      Sydney, Australia
      Zoom Search Engine
