Explain Codes LogoExplain Codes Logo

How to sort a list of strings?

python
natural-sorting
unicode-sorting
advanced-sorting-techniques
Nikita BarsukovbyNikita Barsukov·Dec 13, 2024
TLDR

In Python, sort strings using sorted(my_list) or my_list.sort(). Example:

my_list.sort() # Ascending

For a case-insensitive sort, use:

my_list.sort(key=str.lower) # I see no case

And for descending order:

my_list.sort(reverse=True) # Reverse, reverse!

For more complex situations, like locale-aware or natural sorting, tweak your sorting code accordingly.

Advanced sorting techniques

When dealing with multilingual data, certain characters might be tricky to sort. To overcome this, use the locale module.

import locale locale.setlocale(locale.LC_ALL, 'en_US.UTF-8') sorted_list = sorted(my_list, key=locale.strxfrm)

Above code ensures characters from various languages are sorted out just right. For example, 'ä' is snugly fitted between 'a' and 'z' in German as it should. "Vorsprung durch Technik" anyone? 😄

Dealing with Unicode

For Unicode strings, sorted() and my_list.sort() are your friends, but keep in mind they operate based on Unicode code points.

Natural (human-like) sorting

Natural sorting makes Python behave like us humans. For instance, sorting the strings "img12.png" before "img100.png" makes more sense.

import re def natural_sort_key(s): return [int(text) if text.isdigit() else text.lower() for text in re.split('(\d+)', s)] my_list.sort(key=natural_sort_key) # Let Python think like a human

Stability and efficiency of sorting

Python offers stable sorting with its built-in functions, ensuring no wrongful shuffling of equal items. Plus, the Timsort Python uses is pretty spot-on with varying types of data.

Why .lower() isn't enough

Just when you think everything is sorted (pun intended), the .lower() method reminds you that it only works with ASCII characters. For an improved multilingual experience, avoid relying solely on .lower().

Those special special characters

When sorting strings with special characters or even accents, a nifty trick is to apply unicode.normalize() before sorting for consistency.

Length-wise sorting

To sort strings by their length, coupled with alphabetic order, a lambda function can do wonders:

my_list.sort(key=lambda item: (len(item), item)) # Size does matter!

This sorts the list by length first, then alphabetically if lengths happen to be the same.

Practical applications of sorting

Whether it's web development or data science, string sorting is everywhere. It's like sorting usernames or tags alphabetically for a smoother UX or arranging categorical data for analysis — the world is a chaos without sorting! Keep "file management" scripts user-friendly with alphabetized and natural sorting of filenames.

Handling exceptions

Not all lists will play nice. Beware of null or empty strings. Depending on your use-case, pre-process the list to remove or manage such values before sorting.

For a case-preserving but case-insensitive sort, the key parameter along with custom functions will keep your original case intact.