Implementation of fast sorting algorithms and optimization of related algorithms in Java programming

Author：Eve Cole Update Time：2025-06-07 03:16:01

Time complexity

Average: O(nlgn)
Worst case: O(n*n), occurs when the data is already in the sorting state.

1. Select a value a[i] from the data as a reference
2. Taking a[i] as a reference, divide the data into 2 parts: P1 and P2. All the data in P1≤a[i], all the data in P2 >a[i], and the data becomes {{P1}{a[i]}{P2}}
3. Repeat the above steps with P1 and P2 until only 1 data is left in each part.
4. Data is sorted in ascending order

Basic example:

Raw data:

 {3, 9, 8, 5, 2, 1, 6}

Step 1: Select the first data: 3
Step 2: Divide the data into 2 parts, the left side is ≤3, and the right side is greater than >3:

 {2,1} {3} {9,8,5,6}

Step 3: Repeat the above steps for each part until there is only 1 data left in each part:

 {2,1} => {1} {2} {9, 8, 5, 6} => {8, 5, 6} {9}=> {5, 6} {8} {9}=> {5} {6} {8} {9}

Step 4: The data is sorted in ascending order:

 {1} {2} {3} {5} {6} {8} {9}

The data in the program is usually stored in an array. Taking an array of type int as an example, you can write the above steps into a quickSort function prototype:

 quickSort(int begin, int end) { //begin is the index value of the first data of the array, end is the index value of the last data of the array +1 //If there is only 1 data or 0 data, the program returns if( begin == end || begin == (end-1) ) return; int p = in[begin];//p is the selected reference data, select the first data int a = begin +1; //a as the index value of the 2-part data dividing line int b = a;//b is the index value of the data to be compared for( ; b < end; b++) {//Compare the data in the array with the reference data in sequence if( in[b] < p) { //If the data < reference data, move it to the left if(a == b){a++; continue;} //If the data is already on the left, it will not move int temp = in[a];//Move the data to the left in[a] = in[b]; in[b] = temp; a++;//Move the dividing line right} } in[begin] = in[a-1];//Whike the reference value to the middle of 2 sets of data in[a-1] = p; if( a-1 > begin){ //If there is data on the left, repeat the above steps quickSort(begin, a); } if( end-1 > a ) {//If there is data on the right, repeat the above steps quickSort(a, end); } return; //If there is no data}

Algorithm optimization
The above quick sort algorithm can be said to be the most basic quick sort because it does not take into account any input data. However, it is easy to find the flaws of this algorithm: this is when our input data is basically ordered or even completely ordered, the algorithm degenerates into bubble sorting, no longer O(nn), but O(n^2).
The root cause is that in our code implementation, we only start from the first array at a time. If we use the median values of "three to get middle", that is, arr[low], arr[high], arr[(low+high)/2] as the pivot record, the performance of fast sorting in the worst case can be greatly improved. However, we still cannot improve its performance to O(n) in the array-ordered case. There are also ways to improve the time performance of fast sorting in the worst-case scenario to varying degrees.

In addition, quick sorting requires a recursive stack, which is usually not very deep and is at the log(n) level. However, if the length of the two arrays divided at a time is severely unbalanced, in the worst case, the depth of the stack will increase to O(n). At this time, the spatial complexity brought by the stack space cannot be ignored. If the overhead of additional variables is added, it may even reach a terrifying O(n^2) space complexity here. Therefore, the worst spatial complexity of quick sorting is not a fixed value, and may not even be at the same level.
To solve this problem, we can compare the lengths of the ends after each division and sort the short sequences first (the purpose is to end these stacks first to free up space), which can reduce the maximum depth back to the O(n) level.

Here are three optimization ideas for quick sorting:
For small arrays, insert sorting can be used to avoid recursive calls. For example, when (hi <= lo + M), you can go to insert sort.
Use the median of a subarray to slice the array. This results in better slicing, but at the cost of calculating the median.
If the array contains a large number of repeating elements, three-way slicing can be used. Divide the array into three parts, corresponding to array elements smaller than, equal to, and greater than slicing elements respectively. The code implementation is as follows:

 private static void sort1(int[] a, int lo, int hi) { if (hi <= lo) return; int lt = lo, i = lo + 1, gt = hi; int v = a[lo]; while (i <= gt) { if (a[i] < v) { swap(a, lt++, i++); } else if (a[i] > v) { swap(a, i, gt--); } else { i++; } sort(a, lo, lt - 1); sort(a, gt + 1, hi); }}