The Web Design Group


Incorrect sorting of the Scandinavian alphabets
Christian J
post Oct 23 2006, 06:11 PM
Post #1



Group: WDG Moderators
Posts: 9,630
Joined: 10-August 06
Member No.: 7



Not only does PHP sort the Swedish letters å, ä and ö incorrectly; I have now noticed that JavaScript does the same, and that it also missorts the Danish and Norwegian letters. The arrays below list the letters in the correct order for each language:

CODE
window.onload=function()
{
    var se=['å','ä','ö']; // Swedish
    var dk=['æ','ø','å']; // Danish, apparently same as Norwegian

    alert(se.sort()); // shows "ä,å,ö"
    alert(dk.sort()); // shows "å,æ,ø"
};


Note that Danish and Norwegian use a different order than Swedish. In the sorted JavaScript alerts, however, the Swedish letters come out incorrectly as "ä,å,ö", while the Danish and Norwegian letters come out (again incorrectly) as "å,æ,ø". The same error appears in IE, Opera and Firefox. At least Opera's Norwegian creators should know their own alphabet, so am I correct in assuming that all three browser vendors deliberately follow some flawed convention?
Replies
Liam Quinn
post Oct 23 2006, 08:03 PM
Post #2


WDG Founder

Group: Root Admin
Posts: 52
Joined: 2-August 06
From: Canada
Member No.: 1



The default sort algorithm in JavaScript is based purely on the Unicode code point. If you want a locale-sensitive sort order, you can use this:

CODE

function localeSort(string1, string2) {
  return string1.toString().localeCompare(string2.toString());
}

var se=['å','ä','ö']; // Swedish
var dk=['æ','ø','å']; // Danish, apparently same as Norwegian

alert(se.sort(localeSort));
alert(dk.sort(localeSort));


That should use the locale configured on the user's system. If you want to use a specific locale regardless of the user's locale, I think you're stuck with writing the code for the locale-specific rules yourself in the function you pass to sort().
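A hand-written comparator of the kind described above could look roughly like this. It is only a minimal sketch: the helper name makeAlphabetComparator is hypothetical, the alphabets are passed in as plain strings, and it deliberately ignores finer rules (for example the Danish "aa" digraph), so treat it as an illustration rather than a complete collation routine.

```javascript
// Hypothetical helper (not part of any library): builds a comparator
// from an explicit alphabet string, one character per letter.
function makeAlphabetComparator(alphabet) {
  var rank = {};
  for (var i = 0; i < alphabet.length; i++) {
    rank[alphabet.charAt(i)] = i;
  }
  // Characters missing from the alphabet sort after all known letters.
  function rankOf(ch) {
    return rank.hasOwnProperty(ch) ? rank[ch] : alphabet.length;
  }
  return function (a, b) {
    a = String(a).toLowerCase();
    b = String(b).toLowerCase();
    var n = Math.min(a.length, b.length);
    for (var i = 0; i < n; i++) {
      var diff = rankOf(a.charAt(i)) - rankOf(b.charAt(i));
      if (diff !== 0) return diff;
    }
    return a.length - b.length; // on a shared prefix, the shorter string sorts first
  };
}

// Assumption: the letters are stored as single precomposed characters.
var swedish = makeAlphabetComparator('abcdefghijklmnopqrstuvwxyzåäö');
var danish = makeAlphabetComparator('abcdefghijklmnopqrstuvwxyzæøå');

var se = ['ö', 'å', 'ä'].sort(swedish); // å, ä, ö
var dk = ['å', 'ø', 'æ'].sort(danish);  // æ, ø, å
```

Because the order is spelled out in the alphabet string, the result does not depend on the visitor's browser or system locale.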
Christian J
post Oct 24 2006, 07:46 AM
Post #3





QUOTE(Liam Quinn @ Oct 24 2006, 03:03 AM)

The default sort algorithm in JavaScript is based purely on the Unicode code point.

According to Wikipedia, the first 256 Unicode code points are identical to ISO 8859-1, and there "ä" does indeed come before "å".
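The numeric values can be checked directly. The small sketch below assumes the letters are stored as single precomposed characters; it prints each letter's code value and shows that the default sort simply follows those numbers.

```javascript
var letters = ['ä', 'å', 'æ', 'ö', 'ø'];
var codes = [];
for (var i = 0; i < letters.length; i++) {
  // charCodeAt(0) returns the numeric code of the first character
  codes.push(letters[i] + '=' + letters[i].charCodeAt(0));
}
console.log(codes.join(', ')); // ä=228, å=229, æ=230, ö=246, ø=248

// With no comparator, sort() orders by these numbers, not by any alphabet.
var defaultSe = ['å', 'ä', 'ö'].sort(); // ä, å, ö
var defaultDk = ['æ', 'ø', 'å'].sort(); // å, æ, ø
```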

QUOTE
If you want a locale-sensitive sort order, you can use this:
CODE

function localeSort(string1, string2) {
  return string1.toString().localeCompare(string2.toString());
}

var se=['å','ä','ö']; // Swedish
var dk=['æ','ø','å']; // Danish, apparently same as Norwegian

alert(se.sort(localeSort));
alert(dk.sort(localeSort));


That should use the locale configured on the user's system.

Do you mean the user's OS or browser language settings? On my Swedish Win XP it seems to work in IE6 and Firefox, but Opera still sorts as before (despite claiming to support the localeCompare() method since Opera 7).

QUOTE
If you want to use a specific locale regardless of the user's locale...

Regarding usability: if a non-Swedish user reads a Swedish web page, wouldn't they expect letters to be sorted according to their own conventions (as I believe they would)? E.g., wouldn't a typical English-speaking user expect "å" and "ä" to be treated as "a", and "ö" to be treated as "o"?
Liam Quinn
post Oct 24 2006, 08:05 PM
Post #4





QUOTE(Christian J @ Oct 24 2006, 08:46 AM)

QUOTE
If you want a locale-sensitive sort order, you can use this:
CODE

function localeSort(string1, string2) {
  return string1.toString().localeCompare(string2.toString());
}

var se=['å','ä','ö']; // Swedish
var dk=['æ','ø','å']; // Danish, apparently same as Norwegian

alert(se.sort(localeSort));
alert(dk.sort(localeSort));


That should use the locale configured on the user's system.

Do you mean the user's OS or browser language settings?


I think that's up to the browser implementation.
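One way to see what a given browser actually picked is to ask it. The sketch below assumes an engine that implements the ECMAScript Internationalization API (the Intl object), which postdates this thread and is not available in the browsers discussed here. The printed value depends entirely on the environment.

```javascript
// With no locale argument, the collator uses the implementation's default locale.
var collator = new Intl.Collator();
var resolved = collator.resolvedOptions().locale;
console.log('Default collation locale: ' + resolved);

// Which of the Scandinavian locales does this engine have collation data for?
var supported = Intl.Collator.supportedLocalesOf(['sv', 'da', 'nb']);
console.log('Supported: ' + supported.join(', '));
```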

QUOTE

Regarding usability: if a non-Swedish user reads a Swedish web page, wouldn't they expect letters to be sorted according to their own conventions (as I believe they would)? E.g., wouldn't a typical English-speaking user expect "å" and "ä" to be treated as "a", and "ö" to be treated as "o"?


If the page is in Swedish, I think you should assume that the reader knows Swedish and that Swedish sorting rules are appropriate.
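The difference between the two expectations can be sketched with Intl.Collator, assuming an engine that implements the ECMAScript Internationalization API and ships Swedish and English collation data (neither was available when this thread was written). The word list is made up for illustration.

```javascript
var words = ['orm', 'öl', 'and', 'ål'];

// Swedish rules: å, ä and ö are separate letters placed after z.
var sv = words.slice().sort(new Intl.Collator('sv').compare);
console.log(sv.join(', ')); // and, orm, ål, öl

// English rules: å is treated as an accented a, ö as an accented o.
var en = words.slice().sort(new Intl.Collator('en').compare);
console.log(en.join(', ')); // ål, and, öl, orm
```

Passing the page's language explicitly, rather than relying on the visitor's settings, keeps a Swedish page sorted by Swedish rules for every reader.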