Hello everybody,
I want to index web pages that contain Cyrillic text (<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ru">) and get the results as UTF-8. Is that possible somehow? I tried to index them as UTF-8 directly, but that didn’t work, of course.
The problem is that I’m going to put some of the results in a MySQL database, and it would be a lot easier if I got the results in UTF-8, especially since I’m indexing several web pages in different languages.
I do not have access to all the web pages that I’m trying to index (I can’t edit them)...
Does anyone have suggestions on how to solve this problem or how to proceed?
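To make the goal concrete, here is a rough sketch in Python of the kind of conversion I am after: decode each page from whatever charset it declares, then re-encode it as UTF-8 before storing it. The names `sniff_charset` and `to_utf8` are made up for illustration, and the `windows-1251` fallback is only my assumption about what undeclared Russian pages are likely to use. It is not tied to any particular indexer.

```python
import re

# Hypothetical helper: look for a charset/encoding declaration
# (<meta ... charset=...> or <?xml ... encoding="..."?>) near the top of the page.
def sniff_charset(raw: bytes):
    head = raw[:2048].decode("ascii", errors="ignore")
    match = re.search(r'(?:charset|encoding)\s*=\s*["\']?([A-Za-z0-9_\-]+)', head, re.I)
    return match.group(1) if match else None

# Hypothetical helper: return the page re-encoded as UTF-8 bytes.
# header_charset is the charset from the HTTP Content-Type header, if known.
# fallback is an assumption, used only when nothing else decodes cleanly.
def to_utf8(raw: bytes, header_charset=None, fallback="windows-1251") -> bytes:
    declared = header_charset or sniff_charset(raw)
    for charset in (declared, "utf-8", fallback):
        if not charset:
            continue
        try:
            return raw.decode(charset).encode("utf-8")
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name or bytes that do not fit: try the next candidate.
            continue
    # Last resort: keep what can be read and mark the rest as replacement characters.
    return raw.decode("utf-8", errors="replace").encode("utf-8")
```

With something like this, every page would reach the database in the same encoding, provided the MySQL table and connection are also set to a UTF-8 character set such as `utf8mb4`.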
Regards
J Gru