This article will explore popular sorting algorithms, their principles, and their Python implementations. We will also compare their efficiency in sorting elements in a list.
As a general example, let's consider sorting numbers in ascending order. However, keep in mind that these methods can be easily adapted to suit your specific needs.
Bubble Sort is a simple algorithm that repeatedly steps through a list, comparing adjacent elements in pairs and swapping them, so that larger elements gradually "bubble up" toward the end of the list and smaller ones sink toward the beginning.
First, the first two elements of the list are compared. If the first element is larger, they are swapped. If they are already in the right order, they remain as is. Then, we move on to the next pair of elements, compare their values, and swap them if necessary. This process continues until the last pair of elements in the list.
When the end of the list is reached, the whole pass is repeated. This is highly inefficient when only a single swap is needed: a naive implementation performs all n passes, on the order of n² comparisons, even if the list is already sorted.
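The naive variant described above might be sketched like this (the name bubble_sort_naive is hypothetical; note that it always performs n - 1 full passes regardless of the input order):

```python
def bubble_sort_naive(nums):
    n = len(nums)
    # Always repeat the full pass n - 1 times, even if no swaps occur
    for _ in range(n - 1):
        for i in range(n - 1):
            if nums[i] > nums[i + 1]:
                # Swap adjacent elements that are out of order
                nums[i], nums[i + 1] = nums[i + 1], nums[i]
```

The optimized version below avoids the wasted passes by stopping as soon as a pass completes without any swaps.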
To optimize the algorithm, you need to know when to stop it, i.e., when the list is sorted. For this, use a flag variable: whenever two values are swapped, set the flag to True so that the pass is repeated. If no swaps occur during a pass, the flag remains False and the algorithm stops.
```python
def bubble_sort(nums):
    # Set swapped to True so that the loop runs at least once
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(nums) - 1):
            if nums[i] > nums[i + 1]:
                # Swap elements
                nums[i], nums[i + 1] = nums[i + 1], nums[i]
                # Set swapped to True for the next iteration
                swapped = True
```
To check if it works:
```python
random_list_of_nums = [5, 2, 1, 8, 4]
bubble_sort(random_list_of_nums)
print(random_list_of_nums)
```
The algorithm runs in a while loop and breaks when the elements are no longer swapped. Initially, we set swapped to True to ensure the algorithm runs at least once.
In the worst case (initially sorted in descending order), the time complexity is O(n²), where n is the number of elements in the list.
Selection Sort segments the list into two parts: sorted and unsorted. It repeatedly selects the smallest element from the unsorted part and adds it to the sorted part.
In practice, there's no need to create a new list for sorted elements. The leftmost part of the list serves as the sorted part. The algorithm finds the smallest element and swaps it with the first element.
As the value of i increases, fewer elements need to be checked.
```python
def selection_sort(nums):
    # The value of i corresponds to the number of values already sorted
    for i in range(len(nums)):
        # Initially consider the first unsorted element the smallest
        lowest_value_index = i
        # This loop iterates over the unsorted elements
        for j in range(i + 1, len(nums)):
            if nums[j] < nums[lowest_value_index]:
                lowest_value_index = j
        # Swap the smallest unsorted element with the first unsorted one
        nums[i], nums[lowest_value_index] = nums[lowest_value_index], nums[i]
```
To check if it works:
```python
random_list_of_nums = [12, 8, 3, 20, 11]
selection_sort(random_list_of_nums)
print(random_list_of_nums)
```
The average time complexity for selection sort is O(n²), where n is the number of elements in the list.
Insertion Sort, like selection sort, segments the list into two parts: sorted and unsorted. The algorithm iterates over the unsorted segment and inserts the current element into the correct position in the sorted segment.
The first element of the list is assumed to be sorted. Moving on to the next element, denoted as x: if x is greater than the first element, it stays in place. If it is smaller, the first element is copied to the second position and x becomes the first element.
As we move to other elements in the unsorted segment, larger elements in the sorted segment are shifted up the list until we find an element less than x or reach the end of the list. In the first case, x is placed in the correct position.
```python
def insertion_sort(nums):
    # Start from the second element; the first is already considered sorted
    for i in range(1, len(nums)):
        item_to_insert = nums[i]
        # Keep a reference to the index of the previous element
        j = i - 1
        # Shift elements of the sorted segment forward if they are larger
        # than the element being inserted
        while j >= 0 and nums[j] > item_to_insert:
            nums[j + 1] = nums[j]
            j -= 1
        # Insert the element
        nums[j + 1] = item_to_insert
```
To check if it works:
```python
random_list_of_nums = [9, 1, 15, 28, 6]
insertion_sort(random_list_of_nums)
print(random_list_of_nums)
```
The average time complexity for insertion sort is O(n²), where n is the number of elements in the list.
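A related idea from the standard library: the bisect module implements the same insert-into-a-sorted-prefix approach, using binary search to find each insertion point (shifting elements still costs O(n) per insert, so the overall complexity stays O(n²)). A sketch, with the hypothetical name insertion_sort_bisect:

```python
import bisect

def insertion_sort_bisect(nums):
    # Build a sorted result by inserting each element at the position
    # found by binary search; returns a new list
    result = []
    for x in nums:
        bisect.insort(result, x)
    return result
```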
Heap sort is a popular algorithm that, like insertion and selection sorts, segments a list into two parts: sorted and unsorted. The algorithm converts the unsorted segment of the list into a heap data structure so that the largest element can be efficiently determined.
First, we transform the list into a Max Heap - a binary tree whose largest element is the root. Then we place this element at the end of the list. Next, we rebuild the Max Heap and place the new largest element just before the last one. This process is repeated until the heap is empty.
Let's create a helper function heapify() to implement this algorithm:
```python
def heapify(nums, heap_size, root_index):
    # Assume the root is the largest element
    largest = root_index
    left_child = (2 * root_index) + 1
    right_child = (2 * root_index) + 2

    # If the left child of the root is a valid index and its element is
    # greater than the current largest, update the largest element
    if left_child < heap_size and nums[left_child] > nums[largest]:
        largest = left_child

    # Same for the right child of the root
    if right_child < heap_size and nums[right_child] > nums[largest]:
        largest = right_child

    # If the largest element is no longer the root, swap them
    if largest != root_index:
        nums[root_index], nums[largest] = nums[largest], nums[root_index]
        # Heapify the subtree the old root was swapped into
        heapify(nums, heap_size, largest)

def heap_sort(nums):
    n = len(nums)

    # Build a Max Heap from the list
    # Only nodes with children need heapifying, so iterate backwards
    # from the last parent node (index n // 2 - 1) down to the root
    for i in range(n // 2 - 1, -1, -1):
        heapify(nums, n, i)

    # Repeatedly move the root of the Max Heap to the end of the list
    for i in range(n - 1, 0, -1):
        nums[i], nums[0] = nums[0], nums[i]
        heapify(nums, i, 0)
```
Check that it works:

```python
random_list_of_nums = [35, 12, 43, 8, 51]
heap_sort(random_list_of_nums)
print(random_list_of_nums)
```
On average, heap sort time complexity is O(n log n), which is already significantly faster than previous sorting algorithms.
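As a point of comparison, Python's standard library module heapq provides a binary min-heap. A heap-based sort built on it might be sketched as follows (the name heapq_sort is hypothetical; unlike the in-place version above, it returns a new ascending list):

```python
import heapq

def heapq_sort(nums):
    # Build a min-heap in O(n), then pop the smallest element n times,
    # each pop costing O(log n)
    heap = list(nums)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]
```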
Merge sort belongs to the "divide and conquer" family of algorithms. It splits the list into two parts, splits each of those into two more, and so on, until only single elements remain.
Adjacent elements become sorted pairs. These pairs are then combined and sorted with other pairs. This process continues until all elements are sorted.
The list is split in half recursively until the end result is lists of one element in size. An array of one element is considered ordered. Adjacent elements are compared and joined together. This continues until a complete sorted list is obtained.
Sorting is done by comparing the smallest elements of each subarray. The first elements of each subarray are compared first. The smallest element is moved into the resulting array. The counters of the resulting array and the subarray from which the element was taken are increased by 1.
```python
def merge(left_list, right_list):
    sorted_list = []
    left_list_index = right_list_index = 0

    # The list lengths are used often, so store them in variables
    left_list_length, right_list_length = len(left_list), len(right_list)

    for _ in range(left_list_length + right_list_length):
        if left_list_index < left_list_length and right_list_index < right_list_length:
            # Compare the first unmerged elements of each list
            # If the element of the left sublist is smaller, add it
            # to the sorted list
            if left_list[left_list_index] <= right_list[right_list_index]:
                sorted_list.append(left_list[left_list_index])
                left_list_index += 1
            # Otherwise add the element of the right sublist
            else:
                sorted_list.append(right_list[right_list_index])
                right_list_index += 1
        # If the end of the left list is reached, append the remaining
        # elements of the right list
        elif left_list_index == left_list_length:
            sorted_list.append(right_list[right_list_index])
            right_list_index += 1
        # If the end of the right list is reached, append the remaining
        # elements of the left list
        elif right_list_index == right_list_length:
            sorted_list.append(left_list[left_list_index])
            left_list_index += 1

    return sorted_list

def merge_sort(nums):
    # A list of one element or fewer is already sorted
    if len(nums) <= 1:
        return nums

    # Use floor division to get an integer index for the middle
    mid = len(nums) // 2

    # Recursively sort each half
    left_list = merge_sort(nums[:mid])
    right_list = merge_sort(nums[mid:])

    # Merge the sorted halves into the resulting list
    return merge(left_list, right_list)
```
Check that it works:
```python
random_list_of_nums = [120, 45, 68, 250, 176]
random_list_of_nums = merge_sort(random_list_of_nums)
print(random_list_of_nums)
```
Please note that the merge_sort() function, unlike previous algorithms, returns a new list rather than sorting an existing one. Therefore, this sort requires more memory to create a new list of the same size as the input list.
On average, merge sort takes O(n log n) time.
Quicksort also belongs to the "divide and conquer" family, and it is used more often than the other algorithms described in this article. When configured correctly, it is extremely efficient and, unlike merge sort, requires no additional memory. The list is partitioned into two parts on either side of a pivot element: during sorting, elements smaller than the pivot are placed before it, and equal or larger elements behind it.

Quicksort begins by selecting one of the elements as the pivot, then moves the rest so that the pivot lands in its final position: all elements smaller than it are moved to the left, and equal or larger elements to the right.
```python
def partition(nums, low, high):
    # Choose the middle element as the pivot
    # The first, last, or an arbitrary element can also be chosen
    pivot = nums[(low + high) // 2]
    i = low - 1
    j = high + 1
    while True:
        i += 1
        while nums[i] < pivot:
            i += 1

        j -= 1
        while nums[j] > pivot:
            j -= 1

        if i >= j:
            return j

        # An element at index i is >= the pivot while the element at
        # index j is <= the pivot, so swap them
        nums[i], nums[j] = nums[j], nums[i]

def quick_sort(nums):
    # A helper function that is called recursively
    def _quick_sort(items, low, high):
        if low < high:
            # split_index is the boundary where the list is partitioned
            split_index = partition(items, low, high)
            _quick_sort(items, low, split_index)
            _quick_sort(items, split_index + 1, high)

    _quick_sort(nums, 0, len(nums) - 1)
```
Check that it works:
```python
random_list_of_nums = [22, 5, 1, 18, 99]
quick_sort(random_list_of_nums)
print(random_list_of_nums)
```
On average, the execution time of quicksort is O(n log n).
Note that quicksort runs slowly when the chosen pivot keeps ending up as the smallest or largest element of the partition. Under these conditions, unlike heap and merge sorts, both of which have a worst-case sort time of O(n log n), quicksort degrades to O(n²).
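A common mitigation is to choose the pivot at random, so that no particular input reliably triggers the worst case. Sketched here as a separate out-of-place variant rather than a change to the implementation above (the name quick_sort_random is hypothetical, and this version returns a new list):

```python
import random

def quick_sort_random(nums):
    # Lists of one element or fewer are already sorted
    if len(nums) <= 1:
        return nums
    # Pick a random pivot, then recurse on the three partitions
    pivot = random.choice(nums)
    smaller = [x for x in nums if x < pivot]
    equal = [x for x in nums if x == pivot]
    larger = [x for x in nums if x > pivot]
    return quick_sort_random(smaller) + equal + quick_sort_random(larger)
```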
Built-in Sorting Functions in Python
Sometimes it is useful to know the algorithms listed above, but in most cases a developer will simply reach for Python's built-in sorting functions.
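In everyday code, the built-in sorted() function and the list.sort() method cover most needs; both accept key and reverse arguments:

```python
nums = [22, 5, 1, 18, 99]

# sorted() returns a new list and leaves the original unchanged
print(sorted(nums))                # [1, 5, 18, 22, 99]
print(sorted(nums, reverse=True))  # [99, 22, 18, 5, 1]

# list.sort() sorts in place and returns None
nums.sort()
print(nums)                        # [1, 5, 18, 22, 99]

# key sorts by a derived value, e.g. strings by length
words = ["banana", "fig", "cherry"]
print(sorted(words, key=len))      # ['fig', 'banana', 'cherry']
```

Both are implemented in C and are stable (elements that compare equal keep their relative order), so they are almost always the right choice in practice.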