Case-insensitive search for pages encoded with windows-1251

  • Case-insensitive search for pages encoded with windows-1251

    I have a bilingual site encoded with windows-1251. When I indexed the site using spider mode, the search for the Cyrillic pages turned out to be case sensitive, i.e. if I search for a word that appears on the site and differs only in case, the word won't be found. Everything is OK with the English part of the site: there the search is not case sensitive. Interestingly, the Cyrillic search works OK on my local testing server with all the configuration files taken from the spider search, but not on the remote server. The difference between the two is the OS: the remote server runs Linux and mine runs Windows XP. Has anyone encountered such a problem? Thanks.

  • #2
    This may be due to a different locale setting on your remote server. Your Windows machine is most likely set to windows-1251 / Cyrillic as its default locale, whereas your Linux server is not.

    You can either change the default locale of your remote server accordingly, or you can attempt to set the locale via script. Are you using the PHP, ASP, JavaScript or CGI version?
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine
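
The effect described above can be reproduced outside the search script. The following is a minimal Python sketch, not the Zoom Search Engine code: it assumes the search lowercases raw windows-1251 bytes without knowing the code page, which is roughly what happens when the locale is not Cyrillic. The word used is a made-up example.

```python
# Sketch: why case folding can fail for windows-1251 text when the
# Cyrillic code page is not taken into account. Illustration only;
# this is not how search.php is implemented.

word = "ПОИСК"                 # hypothetical indexed word, upper case
raw = word.encode("cp1251")    # the bytes as stored in a windows-1251 page
query = "поиск".encode("cp1251")  # the same word typed in lower case

# Byte-wise lowercasing only changes ASCII A-Z, so the Cyrillic
# bytes pass through unchanged and the comparison fails.
ascii_only = raw.lower()

# Decoding with the correct code page first lets case folding work.
folded = raw.decode("cp1251").lower().encode("cp1251")

print(ascii_only == query)          # Cyrillic match without code page
print(folded == query)              # Cyrillic match with code page
print(b"SEARCH".lower() == b"search")  # English is unaffected either way
```

This matches the symptom in the first post: English words fold correctly in either case, while Cyrillic words only match once the code page is known.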


    • #3
      Thank you for the reply.
      I'm using the PHP version.


      • #4
        In the PHP version, there's a bit of code in search.php under the "Settings" section, regarding foreign language support, which is commented out. If you are familiar with PHP scripting, you may want to see whether uncommenting this function call to set the locale (and changing it to use the corresponding codepage) helps. Otherwise, it may be worth looking at your remote web server configuration.
        --Ray

        • #5
          Thank you for the suggestion. It worked instantly. Great search engine.


          • #6
            By the way, I noticed that the generated reports have encoding problems too (the Cyrillic words in the pie graphs are not legible). The Cyrillic words in the table are fine, though. I just wonder if there is a quick fix for this problem. Thanks.


            • #7
              We just tried this and we were able to get Cyrillic words to appear correctly in the pie chart (the color-coded legend next to the pie graph). Instead, we found problems with the words appearing in the table at the bottom of the report ("Listing the top 100 search words by popularity"). We will have to look into this.

              Can you send us a URL to your search words log file? You may email or PM us if you wish.
              --Ray

              • #8
                We've just looked at the log file you sent us, and have confirmed that it is the same problem (the Cyrillic words appear incorrectly only in the "Listing the top 100... " part of the report).

                This is a bug in the current version of the Indexer. This bug will be fixed for the next version (Version 5.0). Thanks for bringing it to our attention.
                --Ray