Incorrect sorting of the Scandinavian alphabets |
Christian J |
Oct 23 2006, 06:11 PM
Post
#1
|
. Group: WDG Moderators Posts: 9,661 Joined: 10-August 06 Member No.: 7 |
Not only does PHP sort the Swedish letters å, ä and ö incorrectly; I have now noticed that JavaScript does the same, and with the Danish and Norwegian letters too. The arrays below are already in the correct order for each language:

CODE
window.onload=function() {
    var se=['å','ä','ö']; // Swedish
    var dk=['æ','ø','å']; // Danish, apparently same as Norwegian
    alert(se.sort());
    alert(dk.sort());
}

Note that Danish and Norwegian use a different order than Swedish. But in the sorted JavaScript alerts the Swedish letters are incorrectly sorted as "ä,å,ö", while the Danish and Norwegian ones are (again incorrectly) sorted as "å,æ,ø". The same error appears in IE, Opera and Firefox. At least Opera's Norwegian creators should know their own alphabet, so am I correct in assuming that all three browser vendors deliberately follow some flawed convention? |
Liam Quinn |
Oct 23 2006, 08:03 PM
Post
#2
|
WDG Founder Group: Root Admin Posts: 52 Joined: 2-August 06 From: Canada Member No.: 1 |
The default sort algorithm in JavaScript is based purely on the Unicode code point. If you want a locale-sensitive sort order, you can use this:
CODE
function localeSort(string1, string2) {
    return string1.toString().localeCompare(string2.toString());
}
var se=['å','ä','ö']; // Swedish
var dk=['æ','ø','å']; // Danish, apparently same as Norwegian
alert(se.sort(localeSort));
alert(dk.sort(localeSort));

That should use the locale configured on the user's system. If you want to use a specific locale regardless of the user's locale, I think you're stuck with writing the code for the locale-specific rules yourself in the function you pass to sort(). |
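As a rough illustration of writing the locale-specific rules yourself, a comparator can be built from an explicit alphabet string. This is only a sketch: the helper name makeAlphabetComparator and the two alphabet strings are made up for this example, and it assumes single-character letters only (it ignores special cases such as Danish "aa").

```javascript
// Hypothetical helper (not a built-in): returns a comparator that ranks
// characters by their position in the given alphabet string.
function makeAlphabetComparator(alphabet) {
    var rank = {};
    for (var i = 0; i < alphabet.length; i++) {
        rank[alphabet.charAt(i)] = i;
    }
    function rankOf(ch) {
        // Characters outside the alphabet sort after it, by code unit.
        return rank.hasOwnProperty(ch) ? rank[ch] : alphabet.length + ch.charCodeAt(0);
    }
    return function (a, b) {
        a = String(a).toLowerCase();
        b = String(b).toLowerCase();
        var n = Math.min(a.length, b.length);
        for (var i = 0; i < n; i++) {
            var diff = rankOf(a.charAt(i)) - rankOf(b.charAt(i));
            if (diff !== 0) return diff;
        }
        // A shorter string sorts before a longer one sharing its prefix.
        return a.length - b.length;
    };
}

// Assumed alphabet orders, as described in the first post.
var swedish = 'abcdefghijklmnopqrstuvwxyzåäö';
var danish  = 'abcdefghijklmnopqrstuvwxyzæøå'; // apparently same as Norwegian

var se = ['ö', 'ä', 'å'].sort(makeAlphabetComparator(swedish));
var dk = ['å', 'ø', 'æ'].sort(makeAlphabetComparator(danish));
```

Because the order comes from the string passed in, the result does not depend on the visitor's OS or browser language settings.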
Christian J |
Oct 24 2006, 07:46 AM
Post
#3
|
. Group: WDG Moderators Posts: 9,661 Joined: 10-August 06 Member No.: 7 |
QUOTE
The default sort algorithm in JavaScript is based purely on the Unicode code point.

According to Wikipedia, the first 256 code points are identical to ISO 8859-1, and there you can indeed find "ä" before "å".

QUOTE
If you want a locale-sensitive sort order, you can use this:
CODE
function localeSort(string1, string2) {
    return string1.toString().localeCompare(string2.toString());
}
var se=['å','ä','ö']; // Swedish
var dk=['æ','ø','å']; // Danish, apparently same as Norwegian
alert(se.sort(localeSort));
alert(dk.sort(localeSort));
That should use the locale configured on the user's system.

Do you mean the user's OS or browser language settings? On my Swedish Win XP it seems to work in IE6 and Firefox, but Opera sorts as before (despite claiming to support the localeCompare() method since Opera 7).

QUOTE
If you want to use a specific locale regardless of the user's locale...

Regarding usability: what if a non-Swedish user reads a Swedish web page? Wouldn't they (as I believe) expect letters to be sorted according to their own habits? E.g., wouldn't a typical English-speaking user expect "å" and "ä" to be treated as "a", and "ö" to be treated as "o"? |
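If the "English-speaking reader" behaviour described above is what you want, one possible approach is to strip the diacritics before comparing. The sketch below assumes an engine that supports String.prototype.normalize (which the browsers discussed in this thread do not have); the function names foldDiacritics and foldedCompare are invented for the example.

```javascript
// Hypothetical helper: decompose each letter (NFD) and drop the combining
// marks, so that "å" and "ä" become "a" and "ö" becomes "o".
// Note: "æ" and "ø" have no such decomposition and are left unchanged.
function foldDiacritics(s) {
    return String(s).normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Compare on the folded, lower-cased form; fall back to the original
// strings so that ties still sort in a stable, predictable way.
function foldedCompare(a, b) {
    var fa = foldDiacritics(a).toLowerCase();
    var fb = foldDiacritics(b).toLowerCase();
    if (fa < fb) return -1;
    if (fa > fb) return 1;
    a = String(a);
    b = String(b);
    return a < b ? -1 : (a > b ? 1 : 0);
}

var words = ['öl', 'apa', 'zebra', 'år', 'orm', 'äpple'];
words.sort(foldedCompare);
```

Here "äpple" and "år" end up among the a-words and "öl" among the o-words, which is the opposite of what a Swedish reader would expect, so which comparator to use depends on the audience.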
pandy |
Oct 25 2006, 06:02 PM
Post
#4
|
🌟Computer says no🌟 Group: WDG Moderators Posts: 20,733 Joined: 9-August 06 Member No.: 6 |
QUOTE
According to wikipedia the first 256 code points are identical to ISO 8859-1, and there you can indeed find "ä" before "å".

It's only ASCII characters that are encoded the same in Unicode, isn't it? ÅÄÖ are 0197, 0196 and 0214 in Unicode so indeed Ä comes first. |
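The numbers above can be checked directly with charCodeAt(), which also shows why the default sort() gives the results reported in the first post. A small sketch (the variable names are only for illustration):

```javascript
// Decimal code values of the upper-case letters mentioned above.
var upper = {
    'Å': 'Å'.charCodeAt(0), // 197
    'Ä': 'Ä'.charCodeAt(0), // 196
    'Ö': 'Ö'.charCodeAt(0)  // 214
};

// Lower-case letters used in the original arrays.
var lower = {
    'ä': 'ä'.charCodeAt(0), // 228
    'å': 'å'.charCodeAt(0), // 229
    'æ': 'æ'.charCodeAt(0), // 230
    'ö': 'ö'.charCodeAt(0), // 246
    'ø': 'ø'.charCodeAt(0)  // 248
};

// With no comparator, sort() orders strings by these code values.
var se = ['å', 'ä', 'ö'].sort(); // ä, å, ö
var dk = ['æ', 'ø', 'å'].sort(); // å, æ, ø
```

So the "wrong" orders are simply ascending code value order, not a rule any browser vendor chose for these alphabets.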