Language Issue - Greek language & Javascript


  • Language Issue - Greek language & Javascript

    Hi,

    I have been using Search Engine 4.2 to make search pages for CD-ROMs that I distribute to my students, and there was no problem at all. The indexed material was HTML pages written in English.

    However, I recently tried to do the same with HTML pages written in Greek. More specifically, all the HTML pages that I want to index contain the following tag in the HEAD of the page:
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-7">

    In the configuration, under the Languages tab, I have specified the encoding to be ISO-8859-7 but there were problems (explained below). I have also tried every possible combination of the International Searching Options but again no luck. I opened the zoom_index.js and it actually does contain all the Greek words; in other words, I think that indexing works but not the searching.

    The situation is that if I type something in the Search Textbox (e.g. προγραμματισμός), when I click the submit button, no results are found and the response is:

    Search results for: προγραμματισμός
    Of course the contents of the search textbox also contain the string προγραμματισμός. It somehow changes the text προγραμματισμός to the weird looking string. I am obviously doing something wrong since I have read that it does support the Greek language. Could you please assist me with this matter?

    Thank you in advance.

  • #2
    I don't think we've come across this problem before. We have some Greek users but they may not be using this combination of options (Javascript and iso-8859-7). I have just tested this and confirmed that there is a problem with using the Javascript version to search in iso-8859-7.

    However, I was also able to confirm that searching works for a Greek website encoded in UTF-8. So there are two options:

    a) Convert your Greek webpages to use UTF-8, and change to UTF-8 in your Zoom V4.2 configuration as well.

    b) Upgrade to Zoom V5. In Zoom V4.2, the encoding/charset setting in Zoom must match the charset in the web page's meta tag, which is why option (a) requires changing your existing webpages. If you do not want to (or cannot) convert your existing Greek webpages to UTF-8, Zoom V5 will automatically convert your content from the charset of the page being indexed to the charset specified in the Zoom configuration window. This means you can index webpages in iso-8859-7 and produce a UTF-8 search page. We have tested that this also works with Greek searches.
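
For option (a), the conversion can be scripted rather than done page by page. The following is only a rough sketch in Python, not anything that ships with Zoom: it assumes every page really is encoded in ISO-8859-7 and declares it in a meta tag like the one quoted in the first post. The function names and the `site` / `site_utf8` folder names are made up for illustration.

```python
import re
from pathlib import Path

# Assumption: all source pages are encoded in ISO-8859-7, as in the first post.
SOURCE_CHARSET = "iso-8859-7"

# Matches the charset declaration in the meta tag, e.g. charset=ISO-8859-7
CHARSET_DECL = re.compile(r"(charset\s*=\s*[\"']?)iso-8859-7", re.IGNORECASE)


def retarget_charset(html: str) -> str:
    """Rewrite any ISO-8859-7 charset declaration so it says UTF-8."""
    return CHARSET_DECL.sub(r"\g<1>UTF-8", html)


def convert_file(src: Path, dst: Path) -> None:
    """Decode one page from ISO-8859-7 and write it back out as UTF-8."""
    # Read raw bytes so no platform default encoding gets involved.
    html = src.read_bytes().decode(SOURCE_CHARSET)
    dst.parent.mkdir(parents=True, exist_ok=True)
    dst.write_bytes(retarget_charset(html).encode("utf-8"))


def convert_tree(src_root: Path, dst_root: Path) -> int:
    """Convert every .htm/.html file under src_root; return how many were done."""
    count = 0
    for src in sorted(src_root.rglob("*.htm*")):
        if src.is_file():
            convert_file(src, dst_root / src.relative_to(src_root))
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical folder names; the output folder should sit outside the input one.
    print(convert_tree(Path("site"), Path("site_utf8")), "pages converted")
```

The converted copies go to a separate folder so the originals stay untouched. A page containing bytes that are not valid ISO-8859-7 will raise a `UnicodeDecodeError`, which is a hint that the page was not in that charset to begin with. After converting, the Zoom configuration still needs to be set to UTF-8 as described in (a).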

    In either case, make sure you reset all the other "International searching options" that you've played with (e.g. accent insensitivity, single case, substring match). In most cases, you probably do not want these to apply for Greek.

    Hope that helps.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine
