Top Ten Sorting Algorithms (Python Implementation)

I. Introduction to Algorithms and Interpretation of Related Concepts

Algorithmic Classification

The ten common sorting algorithms can be divided into two categories:

Nonlinear time comparison sort: The relative order of elements is determined by comparison. Because its time complexity can not break through O(nlogn), it is called non-linear time comparison sort sort.

Linear time non-comparative sort: It can break through the lower bound of time based on comparative sort and run in linear time without comparing to determine the relative order of elements. So it is called linear time non-comparative sort sort.

Relevant concepts

Stability: If A is in front of B and a=b, then a is still in front of B after sorting.

Instability: If A is in front of B and a=b, a may appear behind B after sorting.

Time complexity: Total number of operations on sorted data. Reflect the rule of operation times when n changes.

Spatial complexity: It refers to the measurement of the storage space required for the algorithm to execute in the computer. It is also a function of the data size n.

1. Exchange sort

1.1 Bubble Sort

Compare adjacent elements. If the first one is bigger than the second one, exchange the two.

Do the same work for each pair of adjacent elements, from the first pair at the beginning to the last pair at the end, so that the last element should be the largest number.

Repeat the above steps for all elements except the last one.

Repeat steps 1 to 3 until the sorting is complete.

Bubble sorting operates on n-1 rounds of data to find a maximum (minimum) value for each round.

Operations compare and exchange only two adjacent numbers, each round will exchange a maximum value to the beginning (end) of the data column, like bubbles.

O(n) operations per round, a total of O(n) rounds, time complexity O(n^2).

The extra space costs the transition space when exchanging data, and the space complexity O(1)

def BubbleSort(lst): n=len(lst) if n<=1: return lst for i in range (0,n): for j in range(0,n-i-1): if lst[j]>lst[j+1]: (lst[j],lst[j+1])=(lst[j+1],lst[j]) return lst x=input("Please enter the column to be sorted:\n") y=x.split() arr=[] for i in y: arr.append(int(i)) arr=BubbleSort(arr) #print(arr) print("The sequence is as follows:") for i in arr: print(i,end=' ')

1.2 Quick Sort

Pick out an element from a sequence called pivot.

Reordering sequence, all elements smaller than the benchmark are placed in front of the benchmark, and all elements larger than the benchmark are placed behind the benchmark (the same number can reach either side). After the partition exits, the benchmark is in the middle of the sequence. This is called partition operation.

Recursively, the subordinate sequence of the elements less than the reference value and the subordinate sequence of the elements larger than the reference value are sorted.

Quick sorting is based on selection partition, which is the optimization of simple selection sorting.

Each partition selects the data on both sides of the reference value, and divides the data on both sides in a circular way, similar to the dichotomy.

The overall performance of the algorithm depends on the average degree of partition, i.e. the selection of benchmark values. Many optimization schemes for quick sorting are derived here, and can even be divided into multiple blocks.

If the benchmark value can divide the data into two average blocks, the number of partitions O(logn), each partition traversal compares O(n), time complexity O(nlogn).

The extra space overhead lies in the temporary benchmark value, O(logn) subdivision requires O(logn), space complexity O(logn)

def QuickSort(lst): # This function completes the partition operation def partition(arr, left, right): key = left # Dividing Reference Index,The default number is the first benchmark, which can be optimized. while left < right: # If the number behind the list,Larger or equal to the base number,Then move forward one bit until a number smaller than the base number appears. while left < right and arr[right] >= arr[key]: right -= 1 # If the number at the front of the list,Less or equal to the base number,Then move one bit back until a number larger than the benchmark appears. while left < right and arr[left] <= arr[key]: left += 1 # At this point, a Book larger than the benchmark has been found, and a number smaller than the benchmark has been found, and they will be swapped places. (arr[left], arr[right]) = (arr[right], arr[left]) # When approaching from both sides until the two positions are equal, the left side is exchanged with the same benchmark. (arr[left], arr[key]) = (arr[key], arr[left]) # Returns the index of the current benchmark location return left def quicksort(arr, left, right): if left >= right: return # Zoning from baseline mid = partition(arr, left, right) # Recursive call # print(arr) quicksort(arr, left, mid - 1) quicksort(arr, mid + 1, right) # Principal function n = len(lst) if n <= 1: return lst quicksort(lst, 0, n - 1) return lst print("<<< Quick Sort >>>") x = input("Please enter the column to be sorted:\n") y = x.split() arr = [] for i in y: arr.append(int(i)) arr = QuickSort(arr) # print(arr) print("The sequence is as follows:") for i in arr: print(i, end=' ')

2. Insertion sort

2.1 Insert Sort

Starting with the first element, the element can be considered sorted.

Remove the next element and scan backwards and forwards in the ordered sequence of elements.

If the element (sorted) is larger than the new element, move the element to the next location.

Repeat step 3 until you find that the sorted element is less than or equal to the new element.

After inserting the new element into the location;

Repeat steps 2 to 5.

Simple insertion sort also operates on n-1 rounds, inserting an unsorted tree into the arranged sequence in each round.

At the beginning, the first number is ordered by default, and the remaining n-1 numbers are inserted one by one. The insertion operation includes: comparing and determining the insertion position, and data shifting to vacate the appropriate space.

O(n) operations per round, a total of O(n) rounds, time complexity O(n^2).

The extra space costs the transition space when the data is shifted, and the space complexity is O(1).

def InsertSort(lst): n=len(lst) if n<=1: return lst for i in range(1,n): j=i target=lst[i] #A number to be inserted for each cycle while j>0 and target<lst[j-1]: #Compare and move back. target give room lst[j]=lst[j-1] j=j-1 lst[j]=target #hold target Insert into the vacancy return lst x=input("Please enter the column to be sorted:\n") y=x.split() arr=[] for i in y: arr.append(int(i)) arr=InsertSort(arr) #print(arr) print("The sequence is as follows:") for i in arr: print(i,end=' ')

2.2 Shell Sort

Firstly, the whole sequence of records to be sorted is divided into several subsequences for direct insertion and sorting.

Select an incremental sequence t1, t2,... tk, where ti > tj, tk=1;

According to the incremental sequence number k, the sequence is sorted by K times.

According to the corresponding incremental ti, the columns to be sorted are divided into several sub-sequences of m length, and the sub-tables are sorted directly by insertion. When the incremental factor is only 1, the whole sequence is treated as a table, and the length of the table is the length of the whole sequence.

Hill sorting is an efficient implementation of insert sorting (you can compare the code of insert sorting and Hill sorting). Simple insert sorting is optimized to reduce the number of moves.

Simple insertion sort moves a lot of data every time it is inserted. Many of the moves before and after insertion are repetitive operations, and the efficiency of moving from one step to displacement is much higher.

If the sequence is basically ordered, simple insertion sort does not need to do many mobile operations, and it is very efficient.

Hill sort divides the sequence into several subsequences according to fixed interval, and inserts the sequence simply in the subsequence. First, long distance movement is made to make the sequence basically orderly; then the interval is reduced gradually and repeated operation is done, and finally the simple insert sort is done at 1:00 interval.

Hill sort divides the sequence into O(n) times, simple insert sort O(logn) times and time complexity O(nlogn)

The extra space overhead is a temporary storage required for data movement during insertion, and the spatial complexity O(1)

def ShellSort(lst): def shellinsert(arr,d): n=len(arr) for i in range(d,n): j=i-d temp=arr[i] #Record the number of entries and exits while(j>=0 and arr[j]>temp): #Look back and forth to find the place where the number is smaller than that. arr[j+d]=arr[j] #Move back j-=d if j!=i-d: arr[j+d]=temp n=len(lst) if n<=1: return lst d=n//2 while d>=1: shellinsert(lst,d) d=d//2 return lst x=input("Please enter the column to be sorted:\n") y=x.split() arr=[] for i in y: arr.append(int(i)) arr=ShellSort(arr) #print(arr) print("The sequence is as follows:") for i in arr: print(i,end=' ')

The core of Hill sort is the setting of interval sequence. It can not only set the interval sequence in advance, but also dynamically define the interval sequence. The algorithm for dynamically defining interval sequences was proposed by Robert Sedgewick, co-author of Algorithms (4th edition).

3. Selective Sorting

3.1 Select Sort

Initial state: the disordered region is R[1.n], and the ordered region is empty.

The order of the first trip (i=1,2,3... At the beginning of n-1, the current ordered region and disordered region are R [1.i-1] and R (i.n.) respectively. The row sorting selects the record R[k] with the smallest keyword from the current disordered region, and exchanges it with the first record R of the disordered region, so that R[1.i] and R[i+1.n) can be changed into a new ordered region and a new disordered region with an increase in the number of records and a decrease in the number of records, respectively.

At the end of n-1, the array is ordered.

Simple selection sort also operates on n-1 rounds of data, finding a maximum (minimum) value in each round.

Operations refer to the selection, i.e. the unsorted number is compared and exchanged one by one to compete for the maximum position, and the number on an unsorted position is exchanged into the sorted number in each round, i.e. one maximum value is selected in each round.

O(n) operations per round, a total of O(n) rounds, time complexity O(n^2).

The extra space costs the transition space when exchanging data, and the space complexity is O(1).

def SelectSort(lst): n=len(lst) if n<=1: return lst for i in range(0,n-1): minIndex=i for j in range(i+1,n): #Compare once, record index is not exchanged if lst[j]<lst[minIndex]: minIndex=j if minIndex!=i: #Exchange by index (lst[minIndex],lst[i])=(lst[i],lst[minIndex]) return lst x=input("Please enter the column to be sorted:\n") y=x.split() arr=[] for i in y: arr.append(int(i)) arr=SelectSort(arr) #print(arr) print("The sequence is as follows:") for i in arr: print(i,end=' ')

One of the most stable sorting algorithms is O(n2) time complexity, so the smaller the data size, the better. The only advantage may be that it doesn't take up extra memory space. In theory, the choice of sorting is probably the most common sorting method that ordinary people think of.

Heap Sort

Heapsort is a sort algorithm designed by utilizing the data structure of heap. The heap is a nearly complete binary tree structure, and it satisfies the heap property at the same time: that is, the key value or index of the child node is always less than (or greater than) its parent node.

The initial sequence of keywords to be sorted (R1,R2... Rn) is constructed into a large top heap, which is the initial disordered region.

By exchanging the top element R [1] with the last element R [n], a new disordered region (R1,R2,...) is obtained. Rn-1 and new ordered region (Rn) satisfy R[1,2... N-1] <=R[n];

Because the new top R[1] after switching may violate the nature of the heap, it is necessary to deal with the current disordered region (R1,R2,...). Rn-1 is adjusted to a new heap, and then R[1] is exchanged with the last element of the disordered region to get a new disordered region (R1,R2.. Rn-2) and new ordered regions (Rn-1,Rn). Repeat this process until the number of elements in the ordered region is n-1, then the whole ordering process is completed.

The initial build-up process of heap sorting is very complex. The O(n) level non-leaf nodes are treated as O(logn), time complexity O(nlogn), and then each heap adjustment operation determines the order of a number, time complexity O(nlogn). Combined Time Complexity O(nlogn)

The extra space overhead is a temporary space when the heap process is adjusted and the root node is swapped down. The space complexity O(1)

def HeapSort(lst): def heapadjust(arr,start,end): #Will take start Adjust the heap for the root node to the large top heap temp=arr[start] son=2*start+1 while son<=end: if son<end and arr[son]<arr[son+1]: #Find out if the left and right children have larger nodes son+=1 if temp>=arr[son]: #Judgment of Big Top Reactor break arr[start]=arr[son] #Subnode Upward Moving start=son #Continue to compare downwards son=2*son+1 arr[start]=temp #Insert the top of the original reactor into the correct position ####### n=len(lst) if n<=1: return lst #Establishment of Big Top Reactor root=n//2-1 #The last non-leaf node (in a complete binary tree) while(root>=0): heapadjust(ls,root,n-1) root-=1 #Pinch off the top of the heap and adjust the heap i=n-1 while(i>=0): (lst[0],lst[i])=(lst[i],lst[0]) #Put the top of the big heap at the end heapadjust(lst,0,i-1) #Adjust the heap of residues i-=1 return lst ######### x=input("Please enter the column to be sorted:\n") y=x.split() arr=[] for i in y: arr.append(int(i)) arr=HeapSort(arr) #print(arr) print("The sequence is as follows:") for i in arr: print(i,end=' ')

4. Merge Sort

4.1 Two-way Merge Sort

Merge sorting is an effective sorting algorithm based on merge operation. This algorithm is a very typical application of Divide and Conquer. By merging the existing ordered subsequences, a completely ordered sequence can be obtained, that is, each subsequence is ordered first, and then the subsequence segments are ordered. If two ordered tables are merged into one ordered table, it is called 2-path merging.

The input sequence with length n is divided into two sub-sequences with length n/2.

The two subsequences are sorted by merging.

The two sorted subsequences are merged into a final sorted sequence.

def MergeSort(lst): #Merging left and right subsequence functions def merge(arr,left,mid,right): temp=[] #Intermediate array i=left #Left Sequence Initiation j=mid+1 #Right Segment Subsequence Initiation while i<=mid and j<=right: if arr[i]<=arr[j]: temp.append(arr[i]) i+=1 else: temp.append(arr[j]) j+=1 while i<=mid: temp.append(arr[i]) i+=1 while j<=right: temp.append(arr[j]) j+=1 for i in range(left,right+1): # !Notice here, not directly. arr=temp,They are not necessarily the same size. arr[i]=temp[i-left] #Recursive call merge sort def mSort(arr,left,right): if left>=right: return mid=(left+right)//2 mSort(arr,left,mid) mSort(arr,mid+1,right) merge(arr,left,mid,right) n=len(lst) if n<=1: return lst mSort(lst,0,n-1) return lst x=input("Please enter the column to be sorted:\n") y=x.split() arr=[] for i in y: arr.append(int(i)) arr=MergeSort(arr) #print(arr) print("The sequence is as follows:") for i in arr: print(i,end=' ')

Merge sort is a stable sort method. Like selection sort, merge sort performance is not affected by input data, but it performs much better than selection sort because it is always O(nlogn) time complexity. The cost is additional memory space.

5. Linear time non-comparative sort

Counting Sort

Find the largest and smallest elements in the array to be sorted.

The number of occurrences of each element whose value is i in the array is counted and stored in item i of the array C.

Accumulate all counts (starting with the first element in C, adding each item to the previous one);

Backfill the target array: Place each element I in item C(i) of the new array, and subtract C(i) by 1 for each element placed.

Count sorting uses the number to be sorted as the subscript of the count array (list), counts the number of each value, and then outputs it in turn.

The size of the count array depends on the range of values of the data to be arranged, so there are certain requirements for the data, otherwise the space overhead can not be sustained.

The counting sort only needs to traverse the data once, record in the counting array, and output the counting array has the subscript of the record. The time complexity is O(n+k).

The extra space overhead refers to the count array, which is actually classified into k categories according to the data value (size depends on the value of the data), and space complexity O(k).

def CountSort(lst): n=len(lst) num=max(lst) count=[0]*(num+1) for i in range(0,n): count[lst[i]]+=1 arr=[] for i in range(0,num+1): for j in range(0,count[i]): arr.append(i) return arr x=input("Please enter the column to be sorted:\n") y=x.split() arr=[] for i in y: arr.append(int(i)) arr=CountSort(arr) #print(arr) print("The sequence is as follows:") for i in arr: print(i,end=' ')

Counting sorting is a stable sorting algorithm. When the input elements are integers between N zeros and k, the time complexity is O(n+k), and the space complexity is O(n+k), and the sorting speed is faster than any comparative sorting algorithm. When K is not very large and the sequence is relatively centralized, counting sort is a very effective sort algorithm.

5.2 Bucket Sort

Set up a quantitative array as an empty bucket.

Traverse the input data, and put the data one by one into the corresponding bucket;

Sort each bucket that is not empty;

Segmentation of sequenced data from non-empty buckets.

Bucket sorting is actually a generalization of counting sorting, but its implementation is much more complicated.

Bucket sorting first divides the data into different ordered regions (buckets) with a certain functional relationship, then the sub-data are sorted in buckets, and then output sequentially.

When each different data is allocated a bucket, it is equivalent to counting sort.

Assuming that N data are divided into k buckets with fast sorting and time complexity of O(n)+O(k*n/k*log(n/k)=O(n)+O(n*(log(n)-log(k)),

Obviously, the larger k, the closer the time complexity is to O(n), of course, the greater the space complexity O(n+k), which is the balance between space and time.

def BucketSort(lst): ##############Quick sort in bucket def QuickSort(lst): def partition(arr,left,right): key=left #Dividing Reference Index,Default to the first number, which can be optimized while left<right: while left<right and arr[right]>=arr[key]: right-=1 while left<right and arr[left]<=arr[key]: left+=1 (arr[left],arr[right])=(arr[right],arr[left]) (arr[left],arr[key])=(arr[key],arr[left]) return left def quicksort(arr,left,right): #Recursive call if left>=right: return mid=partition(arr,left,right) quicksort(arr,left,mid-1) quicksort(arr,mid+1,right) #Principal function n=len(lst) if n<=1: return lst quicksort(lst,0,n-1) return lst ###################### n=len(lst) big=max(lst) num=big//10+1 bucket=[] buckets=[[] for i in range(0,num)] for i in lst: buckets[i//10].append(i) #Dividing barrel for i in buckets: #Bucket sorting bucket=QuickSort(i) arr=[] for i in buckets: if isinstance(i, list): for j in i: arr.append(j) else: arr.append(i) for i in range(0,n): lst[i]=arr[i] return lst x=input("Please enter the column to be sorted:\n") y=x.split() arr=[] for i in y: arr.append(int(i)) arr=BucketSort(arr) #print(arr) print("The sequence is as follows:") for i in arr: print(i,end=' ')

The linear time O(n) is best used for bucket sorting. The time complexity of bucket sorting depends on the time complexity of sorting data between buckets, because the time complexity of other parts is O(n). Obviously, the smaller the bucket partition, the less data between buckets, and the less time it takes to sort. But the corresponding space consumption will increase.

5.3 Radix Sort

Gets the maximum number in the array and the number of digits.

arr is the original array, and each bit is taken from the lowest bit to form a radix array.

The radix is sorted by counting (using the characteristics of counting sorting suitable for small range numbers);

import math def RadixSort(lst): def getbit(x,i): #Return x The first i Bit (from right to left, bit 0) value y=x//pow(10,i) z=y%10 return z def CountSort(lst): n=len(lst) num=max(lst) count=[0]*(num+1) for i in range(0,n): count[lst[i]]+=1 arr=[] for i in range(0,num+1): for j in range(0,count[i]): arr.append(i) return arr Max=max(lst) for k in range(0,int(math.log10(Max))+1): #Yes k Bit Arrangement k second,Arrange by one place at a time arr=[[] for i in range(0,10)] for i in lst: #take ls(Each number in the sequence to be arranged is classified by a bit (0).-9 Total 10 categories) arr[][]In a two-dimensional array (list) arr[getbit(i,k)].append(i) for i in range(0,10): #Yes arr[]Each of the categories (a list) is sorted by count. if len(arr[i])>0: arr[i]=CountSort(arr[i]) j=9 n=len(lst) for i in range(0,n): #Sequential output arr[][]Median to ls In Chinese, that is to say, according to k Place well while len(arr[j])==0: j-=1 else: ls[n-1-i]=arr[j].pop() return lst x=input("Please enter the column to be sorted:\n") y=x.split() arr=[] for i in y: arr.append(int(i)) arr=RadixSort(arr) #print(arr) print("The sequence is as follows:") for i in arr: print(i,end=' ')

The cardinal sorting is based on sorting separately and collecting separately, so it is stable. However, the performance of cardinal sorting is slightly worse than that of barrel sorting. Every barrel allocation of keywords requires O(n) time complexity, and after allocation, the time complexity of O(n) is needed to obtain a new keyword sequence. If the data to be arranged can be divided into D keywords, the time complexity of cardinal sorting will be O(d*2n), of course, D is much smaller than n, so it is basically linear.

The spatial complexity of cardinal sorting is O(n+k), where k is the number of buckets. Generally speaking, n > k, so the extra space needs about n.

done.

Reference article: https://blog.csdn.net/weixin_41571493/article/details/81875088