What Is Stable Sorting?
Stable sorting preserves the relative order of elements that compare as equal with respect to the chosen key or comparator. If two items tie on the comparison, a stable sort guarantees that they appear in the output in the same order they appeared in the input.
Why Stability Matters
- Multi-key sorting: Sort by a secondary attribute, then stably sort by a primary attribute. Stability ensures ties on the primary attribute retain the correct secondary order.
- Order as metadata: When input order carries meaning (e.g., time received), stability preserves that information for items with equal keys.
- Pipeline safety: In multi-stage processing, stable sorts let you add or change sort criteria without unintended reordering of ties.
Stable vs. Unstable Algorithms
Commonly stable (typical implementations):
- Merge sort (standard, comparison-based)
- Insertion sort
- Bubble sort
- Counting sort and Radix sort (non-comparison, when implemented with stable passes)
Commonly unstable (typical implementations):
- Quicksort
- Heapsort
- Selection sort
Note: Many algorithms can be made stable with additional memory or different data structures, but this may change constants or implementation complexity.
Library and Language Behavior
- Python:
list.sort() and sorted() are stable (Timsort).
- Java:
Arrays.sort() and Collections.sort() are stable for object types (Timsort); primitive array sorts may differ in behavior.
- C++:
std::stable_sort is stable; std::sort is not required to be stable.
- JavaScript: Modern engines (per ECMAScript 2019+) require a stable
Array.prototype.sort.
How to Achieve Stability
- Use a stable sort: Prefer library calls that guarantee stability (e.g., Python
sorted, C++ std::stable_sort).
- Decorate–sort–undecorate (DSU): Attach a tie-breaker (like original index) to each item, sort by (key, index), then remove the decoration. This simulates stability.
- Stable passes in non-comparison sorts: Radix sort depends on stable passes so that lower-significance digits remain in order when sorting higher-significance digits.
Complexity and Trade-offs
- Time complexity: Stability does not inherently change big-O. For example, merge sort remains O(n log n) whether stable or not.
- Space usage: Stable implementations often use extra memory (e.g., merge sort’s auxiliary array). In-place stability is possible but more intricate.
- Constants matter: Some unstable algorithms (like heapsort) are in-place with small constants and may be preferable when memory is tight and stability is not required.
Examples
Python: Multi-key sort via stability
# Suppose we want to sort by dept (asc) then by score (desc)
records = [
{"id": 1, "dept": "A", "score": 90},
{"id": 2, "dept": "A", "score": 90},
{"id": 3, "dept": "B", "score": 85},
{"id": 4, "dept": "B", "score": 92},
]
# Stable approach: sort by secondary key first, then by primary key.
records = sorted(records, key=lambda r: -r["score"]) # secondary
records = sorted(records, key=lambda r: r["dept"]) # primary (stable)
# Equal dept items keep the score-desc order from the previous pass.
Python: DSU (decorate–sort–undecorate)
items = ["b", "a", "B", "A"]
# Case-insensitive sort that preserves original order of case-insensitive ties
decorated = [(s.lower(), i, s) for i, s in enumerate(items)]
decorated.sort() # not necessarily stable, but index breaks ties deterministically
result = [s for _, _, s in decorated]
C++: Using std::stable_sort
#include <algorithm>
#include <vector>
#include <string>
using Item = std::pair<int, std::string>;
int main() {
std::vector<Item> v = {{90, "A"}, {90, "A2"}, {85, "B"}};
std::stable_sort(v.begin(), v.end(), [](const Item& x, const Item& y){
return x.first < y.first; // ties maintain input order
});
}
Testing Stability
- Sort items containing a key and an original index. After sorting by the key alone, ensure that items with equal keys appear in ascending order of their original index.
Common Pitfalls
- Comparator inconsistencies: Non-transitive or inconsistent comparators can cause surprising results regardless of stability.
- Hidden ties: When keys are derived (e.g., rounding numbers), more ties occur than expected; stability then becomes more important.
- Memory constraints: Choosing a stable algorithm that requires extra space may be impractical for very large datasets; consider external or multi-pass strategies.