How can I support UTF-8 characters in sitemaps?
I currently have the latest version of Ultimate SEO installed, and at first sight it works fine with Cyrillic characters.
When I generate a sitemap, the indexed link looks something like this: Свещ-Уют-p-183.html
It opens just fine, but the link stays that way in the browser. Will Google have problems with it?
Do I have to manually convert all the files from this contribution to UTF-8?
I'm running version 1.5.1 of Zen Cart.
http://www.rampagehockey.eu/neo/grap...%B0-p-182.html is the link for the problematic item. It shows a Bulgarian SEO URL in the browser but junk in the sitemap.
This is a test installation of 1.5.1 for trying out things like Cyrillic support and the modules I'll be using.
Hi,
You have the wrong address.
For example, this URL
Code:
/neo/graphics-cards-c-1_4/Свещ-Зимна-приказка-p-182.html
must be
Code:
/neo/graphics-cards-c-1_4/%D0%A1%D0%B2%D0%B5%D1%89-%D0%97%D0%B8%D0%BC%D0%BD%D0%B0-%D0%BF%D1%80%D0%B8%D0%BA%D0%B0%D0%B7%D0%BA%D0%B0-p-182.html
See RFC 3986, Uniform Resource Identifier (URI): Generic Syntax, section 2, "Characters":
"A URI is composed from a limited set of characters consisting of
digits, letters, and a few graphic symbols. A reserved subset of
those characters may be used to delimit syntax components within a
URI while the remaining characters, including both the unreserved set
and those reserved characters not acting as delimiters, define each
component's identifying data."
A good example is the URLs in Wikipedia: http://bg.wikipedia.org/wiki/Кирилица
Please note, the URL is
Code:
bg.wikipedia.org/wiki/%D0%9A%D0%B8%D1%80%D0%B8%D0%BB%D0%B8%D1%86%D0%B0
and not
Code:
bg.wikipedia.org/wiki/Кирилица
Of course, I could add extra URL processing to make the URLs valid under RFC 3986.
But:
1. The standard zen_href_link() function needs no additional processing. It returns valid URLs.
2. It will not save you from trouble with indexing if your site uses non-valid URLs.
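To illustrate the encoding discussed above, here is a minimal sketch in Python (not the code used by Zen Cart or the sitemap module). The helper name encode_path is hypothetical; it only shows how a Cyrillic path maps to the percent-encoded form that RFC 3986 expects, and how the encoded form decodes back to the readable one.

```python
from urllib.parse import quote, unquote

def encode_path(path: str) -> str:
    """Percent-encode a URL path (hypothetical helper, for illustration only)."""
    # quote() converts the text to UTF-8 bytes and percent-encodes every
    # byte that is not an unreserved character. "/" is listed as safe so
    # the path separators are left untouched.
    return quote(path, safe="/")

readable = "/wiki/Кирилица"
encoded = encode_path(readable)

print(encoded)           # /wiki/%D0%9A%D0%B8%D1%80%D0%B8%D0%BB%D0%B8%D1%86%D0%B0
print(unquote(encoded))  # /wiki/Кирилица
```

The round trip is the reason a browser can show the readable Cyrillic form in the address bar while the encoded form is what actually travels in the link.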
OK, if I leave it that way it won't be good for the sitemap, but I really need those Cyrillic letters in the URL. What can I do?
OK, I understand that all non-Latin letters get encoded.
But Google understands that and supports Cyrillic searches, and when you search it still gives good 'exact match' results for the links on your site, so most people who write in Cyrillic (in Russia, Bulgaria and a few other countries) use Cyrillic in URLs.
It's common practice to have Cyrillic characters in links (when you copy and paste them they come out encoded), but when you follow the link it becomes readable for the user in the browser address bar.
My problem is that the sitemap doesn't get the proper link (the encoded version you pasted above) but some hieroglyphs, like the ones I pasted in my first post in the thread. If I follow that link the proper page opens, but the address stays the same (it changes neither to the Cyrillic nor to the encoded version of the link).
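For comparison, this is roughly what a correct sitemap entry for such a link could look like. It is a standalone Python sketch, not how the sitemap contribution actually builds its output: the function sitemap_url_entry and the host www.example.com are made up for the example. The sitemap protocol expects the URL in `<loc>` to be percent-encoded and XML-escaped, so the entry ends up as plain ASCII.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

def sitemap_url_entry(base: str, path: str) -> str:
    """Build one <url> element (hypothetical helper, for illustration only)."""
    # Percent-encode the UTF-8 bytes of the path, keeping "/" as a separator.
    loc = base + quote(path, safe="/")
    # Escape XML special characters such as "&" before writing into the file.
    return "<url><loc>" + escape(loc) + "</loc></url>"

entry = sitemap_url_entry(
    "http://www.example.com",  # placeholder host, not the real shop
    "/neo/graphics-cards-c-1_4/Свещ-Зимна-приказка-p-182.html",
)
print(entry)
```

If the generated sitemap shows unreadable characters instead of this kind of %D0... sequence, the path was probably written without being encoded, or with the wrong character set, somewhere along the way.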