How to properly sort an array of strings with special characters alphabetically in JavaScript

Developers in countries where English is not the native language often run into problems with special characters and accented letters, such as:

  • The cédille (cedilla) Ç ...
  • The accent aigu (acute accent) é ...
  • The accent circonflexe (circumflex) â, ê, î, ô, û ...
  • The accent grave (grave accent) à, è, ù ...
  • The accent tréma (dieresis/umlaut) ë, ï, ü

In JavaScript, for example, sorting an array of words is pretty easy when the strings don't start with such characters:

['Bogotá', 'Bucaramanga', 'Cali', 'Santa Marta', 'Cartagena'].sort();
// This will sort as
//  ["Bogotá", "Bucaramanga", "Cali", "Cartagena", "Santa Marta"]

JavaScript's sort function does the trick automatically for you. Unfortunately, when you sort words that start with one of the characters mentioned above, for example German words, you get strange results:

['Bären', 'küssen', 'Käfer', 'Ähnlich', 'Äpfel'].sort();
// This will sort as
// ["Bären", "Käfer", "küssen", "Ähnlich", "Äpfel"]

Poor Germans ... this isn't what anyone expects from "alphabetical" order. We would rather get ["Ähnlich", "Äpfel", "Bären", "Käfer", "küssen"]. The solution is pretty simple though, and it still relies on the native sort function of JavaScript: we pass it a custom compare function. The first (optional) argument of sort is a function that defines the sort order. If it is omitted, the array elements are converted to strings and sorted by their UTF-16 code unit values, which is why "Ä" (196) ends up after "k" (107).
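To make the default behaviour more tangible, here is a small sketch that prints the code unit of each word's first character and shows that a plain code unit comparison gives the same order as calling sort() without arguments. The helper names firstCodeUnit and byCodeUnit are made up for this illustration, and the values assume the accented letters are stored in their precomposed form.

```javascript
// Hypothetical helper, for illustration only:
// returns the UTF-16 code unit of the first character of a word.
function firstCodeUnit(word) {
    return word.charCodeAt(0);
}

const words = ['Bären', 'küssen', 'Käfer', 'Ähnlich', 'Äpfel'];

// Pair each word with the code unit of its first character.
// "B" is 66, "K" is 75, "k" is 107 and a precomposed "Ä" is 196.
const firstUnits = words.map(function (word) {
    return word + ': ' + firstCodeUnit(word);
});
console.log(firstUnits);

// Roughly what sort() does when no compare function is given:
// the < and > operators on strings also compare code units.
function byCodeUnit(a, b) {
    if (a < b) { return -1; }
    if (a > b) { return 1; }
    return 0;
}

const defaultOrder = words.slice().sort();
const explicitOrder = words.slice().sort(byCodeUnit);

// Both approaches produce the same (non-alphabetical) order.
console.log(defaultOrder.join() === explicitOrder.join()); // true
```

Since every uppercase ASCII letter has a smaller code unit than every lowercase one, and accented letters come after both, the default order is only "alphabetical" for plain ASCII words that share the same casing.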

Solution using localeCompare

The first option to sort an array of strings properly is to use the localeCompare method of strings as the comparator. It returns a negative number, a positive number or 0, depending on whether the reference string comes before, after or is equivalent to the given string in sort order. When no locale is passed, the default locale of the runtime is used. For example:

['Bären', 'küssen', 'Käfer', 'Ähnlich', 'Äpfel'].sort(function (a, b) {
    return a.localeCompare(b);
});

// This sorts as:
// ["Ähnlich", "Äpfel", "Bären", "Käfer", "küssen"]
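Because the result depends on the locale, it may be worth passing one explicitly instead of relying on the default of the machine that runs the code. The following is a minimal sketch that assumes German ('de') is the language you want; localeCompare accepts the locale as its second argument and an options object as its third.

```javascript
const words = ['Bären', 'küssen', 'Käfer', 'Ähnlich', 'Äpfel'];

// Pass the locale explicitly so the order does not depend on
// the default language of the machine. 'de' is an assumption here.
const sortedGerman = words.slice().sort(function (a, b) {
    return a.localeCompare(b, 'de');
});
console.log(sortedGerman);
// ["Ähnlich", "Äpfel", "Bären", "Käfer", "küssen"]

// With sensitivity "base", accents and case are ignored,
// so "a" and "ä" are treated as the same letter.
const sameBaseLetter = 'a'.localeCompare('ä', 'de', { sensitivity: 'base' });
console.log(sameBaseLetter); // 0
```

The slice() call copies the array first, so the original words array keeps its order; drop it if you want to sort in place like the examples above.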

Solution using Intl.Collator

The second option to sort an array of strings with special characters is to use an Intl.Collator object as the comparator. Intl.Collator is a constructor for collators, objects that enable language-sensitive string comparison. For example:

['Bären', 'küssen', 'Käfer', 'Ähnlich', 'Äpfel'].sort(new Intl.Collator().compare);
// This will sort as:
// ["Ähnlich", "Äpfel", "Bären", "Käfer", "küssen"]

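According to some tests, the Intl.Collator approach ends up being considerably faster than localeCompare when comparing a large number of strings, most likely because the collator is created once and its compare function is reused for every comparison.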
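To benefit from this, the collator should be created once and reused rather than rebuilt for every sort. The sketch below assumes a German collator and uses two made-up word lists; the compare function of a collator is already bound to it, so it can be passed straight to sort.

```javascript
// Create the collator once. 'de' is an assumed locale for this example.
const germanCollator = new Intl.Collator('de');

// Made-up sample data for illustration.
const verbs = ['küssen', 'öffnen', 'ändern', 'zahlen', 'essen'];
const nouns = ['Käfer', 'Äpfel', 'Bären', 'Zucker', 'Öl'];

// Reuse the same compare function for every array that needs sorting.
const sortedVerbs = verbs.slice().sort(germanCollator.compare);
const sortedNouns = nouns.slice().sort(germanCollator.compare);

console.log(sortedVerbs);
// ["ändern", "essen", "küssen", "öffnen", "zahlen"]
console.log(sortedNouns);
// ["Äpfel", "Bären", "Käfer", "Öl", "Zucker"]
```

Keeping the collator in a variable also gives you a single place to change the locale or its options later on.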

Happy coding!
