Incorrect sorting of the Scandinavian alphabets |
Christian J |
Oct 23 2006, 06:11 PM
Post
#1
|
. Group: WDG Moderators Posts: 9,661 Joined: 10-August 06 Member No.: 7 |
PHP is not the only one that sorts the Swedish letters å, ä and ö incorrectly: I just noticed that JavaScript does the same, and it gets Danish and Norwegian wrong as well. The arrays below are already in the correct order for each language:

CODE
window.onload=function() {
    var se=['å','ä','ö']; // Swedish
    var dk=['æ','ø','å']; // Danish, apparently same as Norwegian
    alert(se.sort());
    alert(dk.sort());
}

Note that Danish and Norwegian use a different order than Swedish. In the sorted JavaScript alerts, however, the Swedish letters are incorrectly sorted as "ä,å,ö", while the Danish and Norwegian letters are (again incorrectly) sorted as "å,æ,ø". The same error appears in IE, Opera and Firefox. At least Opera's Norwegian creators should know their own alphabet, so am I correct in assuming that all three browser vendors deliberately follow some flawed convention?
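For readers trying this in a current JavaScript engine, a minimal sketch of locale-aware sorting with the standard Intl.Collator API is shown here. It was not available in the browsers discussed in this thread, and it only gives the language-specific order if the engine ships collation data for the requested locale. The helper name sortForLocale is made up for this example.

```javascript
// Hypothetical helper: returns a sorted copy of `letters`, ordered by the
// collation rules of `locale` (for example 'sv' or 'da').
function sortForLocale(letters, locale) {
  var collator = new Intl.Collator(locale);
  // collator.compare is a bound function, so it can be passed to sort() directly.
  return letters.slice().sort(collator.compare);
}

// Deliberately shuffled input, so the result is not just the input order.
console.log(sortForLocale(['ö', 'å', 'ä'], 'sv'));
console.log(sortForLocale(['å', 'ø', 'æ'], 'da'));
```

If the engine lacks data for a locale, Intl.Collator silently falls back to its default locale, so it may be worth checking Intl.Collator.supportedLocalesOf(['sv', 'da']) first.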
Darin McGrew |
Oct 23 2006, 06:55 PM
Post
#2
|
WDG Member Group: Root Admin Posts: 8,365 Joined: 4-August 06 From: Mountain View, CA Member No.: 3 |
Does PHP allow you to specify the locale? The default locale is often "C", which sorts characters according to their numeric encoding. Other locales should sort characters as appropriate for that locale.
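To illustrate the "numeric encoding" point, here is a small JavaScript sketch that prints the Unicode code point of each letter. JavaScript's default sort() compares code units, which behaves much like the "C" locale described above, and that explains both orders reported in the first post. The helper name codePoints is made up for this example.

```javascript
// Hypothetical helper: label each letter with its Unicode code point.
function codePoints(letters) {
  return letters.map(function (ch) {
    var hex = ch.codePointAt(0).toString(16).toUpperCase();
    return ch + ' = U+' + hex.padStart(4, '0');
  });
}

var swedish = ['å', 'ä', 'ö'];
var danish = ['æ', 'ø', 'å'];

console.log(codePoints(swedish)); // å = U+00E5, ä = U+00E4, ö = U+00F6
console.log(codePoints(danish));  // æ = U+00E6, ø = U+00F8, å = U+00E5

// Without a comparator, sort() orders by these numbers only.
console.log(swedish.slice().sort()); // ä, å, ö
console.log(danish.slice().sort());  // å, æ, ø
```

So the browsers are not applying a wrong Scandinavian convention; they are applying no language convention at all.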
|
Christian J |
Oct 24 2006, 05:48 AM
Post
#3
|
. Group: WDG Moderators Posts: 9,661 Joined: 10-August 06 Member No.: 7 |
QUOTE Does PHP allow you to specify the locale?

It does, but it seems to be buggy. The entry on http://bugs.php.net/bug.php?id=9671 (10 Mar 2001 1:36pm) suggests something like this, which still sorts in the wrong order (PHP 4.3.3):

CODE
<?php
// Danish letters
$dk = array('ø', 'æ', 'å');
setlocale(LC_COLLATE, "dk_DK");
usort($dk, "strcoll");
print_r($dk); // returns "Array ( [0] => å [1] => æ [2] => ø )"
echo '<br>';

// Norwegian letters
$no = array('ø', 'æ', 'å');
setlocale(LC_COLLATE, "no_NO");
usort($no, "strcoll");
print_r($no); // returns "Array ( [0] => å [1] => æ [2] => ø )"
echo '<br>';

// Swedish letters
$se = array('å', 'ä', 'ö');
setlocale(LC_COLLATE, "sv_SV");
usort($se, "strcoll");
print_r($se); // returns "Array ( [0] => ä [1] => å [2] => ö )"
?>
Liam Quinn |
Oct 24 2006, 07:48 PM
Post
#4
|
WDG Founder Group: Root Admin Posts: 52 Joined: 2-August 06 From: Canada Member No.: 1 |
The user comments in http://ca3.php.net/setlocale may help you determine whether your system has the locales installed. One problem is that you have the Danish and Swedish locale codes wrong: they should be "da_DK" and "sv_SE" (language_COUNTRY).
Christian J |
Oct 25 2006, 07:04 AM
Post
#5
|
. Group: WDG Moderators Posts: 9,661 Joined: 10-August 06 Member No.: 7 |
QUOTE One problem is that you have the Danish and Swedish locale codes wrong: They should be "da_DK" and "sv_SE" (language_COUNTRY).

The locale codes indeed seem to be the problem. As http://ca3.php.net/setlocale says, different systems have different naming schemes for locales, but you can apparently pass several candidate codes (or an array of them). The following works both on my Apache/Windows test server and on my web host's FreeBSD:

CODE
setlocale(LC_COLLATE, "sve", "sv_SE.ISO8859-1");

But while "nor" and "dan" work for Norwegian and Danish on my Apache/Windows, I haven't been able to make any code work for them on my web host yet. E.g., even though the following echoes "da_DK.ISO8859-1" as the preferred locale, it doesn't sort properly:

CODE
<?php
// Danish letters
$dk = array('a', 'b', 'o', 'æ', 'ø', 'å');
setlocale(LC_COLLATE, "dan", "da_DK.ISO8859-1");
echo setlocale(LC_COLLATE, "dan", "da_DK.ISO8859-1").'<br>'; // "da_DK.ISO8859-1"
usort($dk, "strcoll");
print_r($dk); // "Array ( [0] => a [1] => å [2] => æ [3] => b [4] => o [5] => ø )"
?>

I should add that Norwegian and Danish are not an urgent problem; I'm mostly curious.
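When no installed locale gives the right order, one possible workaround is to skip locales entirely and compare against an explicit alphabet table. The sketch below shows the idea in JavaScript; it is not something proposed in this thread, the helper name makeAlphabetComparator is made up, and it assumes lowercase input. The same approach could presumably be ported to a PHP usort() callback.

```javascript
// Hypothetical helper: build a comparator that ranks characters by their
// position in `alphabet`. Characters not in the alphabet sort after it,
// ordered by code point.
function makeAlphabetComparator(alphabet) {
  var rank = new Map();
  Array.from(alphabet).forEach(function (ch, i) { rank.set(ch, i); });

  function rankOf(ch) {
    return rank.has(ch) ? rank.get(ch) : alphabet.length + ch.codePointAt(0);
  }

  return function (a, b) {
    var as = Array.from(a);
    var bs = Array.from(b);
    var n = Math.min(as.length, bs.length);
    for (var i = 0; i < n; i++) {
      var diff = rankOf(as[i]) - rankOf(bs[i]);
      if (diff !== 0) return diff;
    }
    // All shared positions are equal: the shorter string sorts first.
    return as.length - bs.length;
  };
}

// Assumption: lowercase Danish/Norwegian alphabet, with æ, ø, å at the end.
var danishAlphabet = 'abcdefghijklmnopqrstuvwxyzæøå';
var letters = ['å', 'o', 'æ', 'a', 'ø', 'b'];
letters.sort(makeAlphabetComparator(danishAlphabet));
console.log(letters); // a, b, o, æ, ø, å
```

For Swedish, the alphabet string would end in 'åäö' instead. A table like this ignores case, accents on other letters, and digraph rules, so it is only a fallback for simple word lists.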