1. Basic knowledge
What we usually call a heap is a binary heap: a complete (or nearly complete) binary tree whose nodes satisfy the heap property. Binary heaps come in two kinds, the largest heap (maximum heap) and the smallest heap (minimum heap).
Heap sorting is a sorting algorithm designed around the heap data structure and is a kind of selection sort. Because a complete binary tree can be stored compactly in an array, any node can be located quickly by its index: with 0-based indexing, the children of the node at index i are at 2i+1 and 2i+2, and its parent is at (i-1)/2. An array returns the element at a given index in O(1) (constant) time, so reading values is extremely efficient.
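The characteristics of the maximum heap are as follows: the key of every parent node is greater than or equal to the keys of its children, so the heap top (root) holds the largest element.
The characteristics of the minimum heap are as follows: the key of every parent node is less than or equal to the keys of its children, so the heap top (root) holds the smallest element.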
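The sketch below checks these characteristics on an array-backed heap, taking the maximum heap to mean that every parent is >= its children and the minimum heap the reverse. The class HeapProperty and its method names are hypothetical, and the 0-based array layout (parent of index i at (i-1)/2) is an assumption that matches the sample code later in this article.

```java
/** Hypothetical helper for illustration; not part of the article's sample code. */
final class HeapProperty {

    /** True if every parent is >= each of its children (maximum heap). */
    static boolean isMaxHeap(int[] a) {
        for (int child = 1; child < a.length; child++) {
            int parent = (child - 1) / 2; // assumes a 0-based array layout
            if (a[parent] < a[child]) {
                return false;
            }
        }
        return true;
    }

    /** True if every parent is <= each of its children (minimum heap). */
    static boolean isMinHeap(int[] a) {
        for (int child = 1; child < a.length; child++) {
            int parent = (child - 1) / 2;
            if (a[parent] > a[child]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isMaxHeap(new int[]{9, 7, 8, 1, 3})); // true
        System.out.println(isMinHeap(new int[]{1, 3, 2, 7, 9})); // true
        System.out.println(isMaxHeap(new int[]{1, 3, 2, 7, 9})); // false
    }
}
```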
2. Algorithm Thought
1. The algorithm idea of the largest heap is:
First, the initial sequence R[0…n-1] is built into the largest heap. At this point the whole array is the unordered area. The heap top R[0] is the largest element, so it is exchanged with the last record R[n-1] of the unordered area. This yields the new unordered area R[0…n-2] and the ordered area R[n-1], which satisfy R[0…n-2].keys ≤ R[n-1].key
Since R[0…n-2] may no longer satisfy the properties of the maximum heap after the exchange, it is adjusted back into a maximum heap. The exchange and the adjustment are then repeated on the shrinking unordered area until only R[0] remains.
After the maximum heap sort is completed, the result is an ascending sequence. Each adjustment brings the largest remaining element to the heap top, which is then exchanged with the last element of the current heap, so the largest values accumulate at the end of the array in order.
2. The algorithm idea of the smallest heap is:
First, the initial sequence R[0…n-1] is built into the smallest heap. At this point the whole array is the unordered area. The heap top R[0] is the smallest element, so it is exchanged with the last record R[n-1] of the unordered area, giving the new unordered area R[0…n-2] and the ordered area R[n-1], which satisfy R[0…n-2].keys >= R[n-1].key
Since R[0…n-2] may no longer satisfy the properties of the minimum heap after the exchange, it is adjusted back into a minimum heap. The exchange and the adjustment are repeated until only R[0] remains, at which point the sort is finished. After the minimum heap sort is completed, the result is a descending sequence: each adjustment brings the smallest remaining element to the heap top, which is then exchanged with the last element of the current unordered area.
Tip: Heap sorting is the process of continuously expanding the ordered area while shrinking the unordered area, until only the ordered area remains.
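The smallest-heap variant described above can be sketched as follows. This is a minimal illustration rather than the article's own code: the class MinHeapSortSketch and its method names are hypothetical, and the input array is borrowed from the sample code in section 5.

```java
import java.util.Arrays;

/** Hypothetical sketch: descending sort using a minimum heap. */
final class MinHeapSortSketch {

    /** Restores the min-heap property for the subtree rooted at index within a[0..heapSize-1]. */
    static void minHeapify(int[] a, int heapSize, int index) {
        int left = 2 * index + 1;
        int right = 2 * index + 2;
        int smallest = index;
        if (left < heapSize && a[left] < a[smallest]) {
            smallest = left;
        }
        if (right < heapSize && a[right] < a[smallest]) {
            smallest = right;
        }
        if (smallest != index) {
            int tmp = a[index];
            a[index] = a[smallest];
            a[smallest] = tmp;
            minHeapify(a, heapSize, smallest); // the child subtree may now be out of order
        }
    }

    static void sortDescending(int[] a) {
        // Build the minimum heap, starting from the last parent node.
        for (int i = a.length / 2 - 1; i >= 0; i--) {
            minHeapify(a, a.length, i);
        }
        // Move the current minimum to the end of the unordered area, then re-adjust.
        for (int end = a.length - 1; end > 0; end--) {
            int tmp = a[0];
            a[0] = a[end];
            a[end] = tmp;
            minHeapify(a, end, 0);
        }
    }

    public static void main(String[] args) {
        int[] data = {1, 0, 10, 20, 3, 5, 6, 4, 9, 8, 12, 17, 34, 11};
        sortDescending(data);
        System.out.println(Arrays.toString(data));
        // [34, 20, 17, 12, 11, 10, 9, 8, 6, 5, 4, 3, 1, 0]
    }
}
```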
3. Sorting process analysis
Because the algorithm is relatively abstract, we illustrate the process of heap sorting with a small example. We sort the following unordered sequence with the largest heap, so the resulting sequence is ascending (ASC).
Unordered sequence: 89,-7,999,-89,7,0,-888,7,-7
Step 1: Build the initial maximum heap:
Step 2: Exchange the maximum element 999 on the top of the heap with the last element of the unordered area, -7, so that 999 becomes the ordered area. After the exchange, -7 is the heap top. Since -7 is not the largest element in the unordered area, the unordered area must be adjusted so that its maximum value 89 becomes the heap top, so -7 and 89 are exchanged. After this exchange, the right subtree below 89 (now rooted at -7) does not meet the properties of the largest heap, so it must be adjusted as well, and -7 is exchanged with 0, as shown in the figure below:
From the figure, after -7 and 89 are swapped, the heap top is the largest element, but the left child of -7 is 0 and the right child is -888. Since -7 < 0, the node -7 does not meet the properties of the heap, so it needs to be adjusted. So, 0 is exchanged with -7.
Then repeat Step 2 until the whole sequence has become the ordered area.
Finally: What is obtained is the ascending sequence -888, -89, -7, -7, 0, 7, 7, 89, 999
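To follow the example step by step, the sketch below prints the array after the initial build and after every repetition of Step 2. The class HeapSortTrace and its method names are hypothetical; the adjustment uses the same comparison order as the sample code in section 5 (left child first, strict less-than), which is an assumption about how the original figures were produced.

```java
import java.util.Arrays;

/** Hypothetical tracing sketch for the worked example. */
final class HeapSortTrace {

    /** Restores the max-heap property for the subtree rooted at index within a[0..heapSize-1]. */
    static void maxHeapify(int[] a, int heapSize, int index) {
        int left = 2 * index + 1;
        int right = 2 * index + 2;
        int largest = index;
        if (left < heapSize && a[index] < a[left]) {
            largest = left;
        }
        if (right < heapSize && a[largest] < a[right]) {
            largest = right;
        }
        if (largest != index) {
            int tmp = a[index];
            a[index] = a[largest];
            a[largest] = tmp;
            maxHeapify(a, heapSize, largest);
        }
    }

    /** Step 1: build the initial maximum heap. */
    static void buildMaxHeap(int[] a) {
        for (int i = a.length / 2 - 1; i >= 0; i--) {
            maxHeapify(a, a.length, i);
        }
    }

    /** Step 2: move the heap top to position end, then re-adjust a[0..end-1]. */
    static void extractMax(int[] a, int end) {
        int tmp = a[0];
        a[0] = a[end];
        a[end] = tmp;
        maxHeapify(a, end, 0);
    }

    public static void main(String[] args) {
        int[] a = {89, -7, 999, -89, 7, 0, -888, 7, -7};
        buildMaxHeap(a);
        System.out.println("heap: " + Arrays.toString(a));
        // heap: [999, 7, 89, -7, 7, 0, -888, -89, -7]
        for (int end = a.length - 1; end > 0; end--) {
            extractMax(a, end);
            System.out.println("unordered " + Arrays.toString(Arrays.copyOfRange(a, 0, end))
                    + " ordered " + Arrays.toString(Arrays.copyOfRange(a, end, a.length)));
        }
        // First line printed by the loop:
        // unordered [89, 7, 0, -7, 7, -7, -888, -89] ordered [999]
    }
}
```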
4. Time complexity
The running time of heap sorting consists mainly of building the initial heap and repeatedly adjusting the heap. Heap sorting is unstable (equal elements may change their relative order), but this does not affect its running time: the average and worst cases are of the same order.
The average time complexity is: O(N * log2(N))
The time-consuming operations of heap sorting are building the initial heap and repeatedly adjusting the heap. Their time complexity is as follows:
1. Initial heap building: Each parent node is compared (and possibly exchanged) with its left and right child nodes, which costs at most 2 comparisons for every level it sinks. There are about n/2 parent nodes, and the height of the heap is x = log2(n) (from 2^x <= n, where x is the number of times n elements can be folded in half), so a simple upper bound is O(n * log2(n)). Because most parent nodes sit near the bottom and can sink only a few levels, the tighter bound is O(n).
2. Repeated adjustment of the heap: The heap top element is extracted n-1 times. After each extraction the heap is rebuilt by sinking the new heap top, which costs at most O(log2(n)) and is cheaper than building the initial heap. The total is therefore at most (n-1) * O(log2(n)) = O(n * log2(n)). Heap sort is not very sensitive to the order of the original sequence, so the best case is close to the worst case.
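One way to get a feel for these two costs is to count comparisons directly. The sketch below is an illustration under assumptions: the class HeapSortComparisonCounter, its counter field, and the random test data with a fixed seed are all hypothetical, and counting key comparisons is only a rough proxy for running time.

```java
import java.util.Random;

/** Hypothetical sketch: counts key comparisons in the two phases of heap sort. */
final class HeapSortComparisonCounter {

    private static long comparisons;

    private static void maxHeapify(int[] a, int heapSize, int index) {
        int left = 2 * index + 1;
        int right = 2 * index + 2;
        int largest = index;
        if (left < heapSize) {
            comparisons++;
            if (a[largest] < a[left]) {
                largest = left;
            }
        }
        if (right < heapSize) {
            comparisons++;
            if (a[largest] < a[right]) {
                largest = right;
            }
        }
        if (largest != index) {
            int tmp = a[index];
            a[index] = a[largest];
            a[largest] = tmp;
            maxHeapify(a, heapSize, largest);
        }
    }

    /** Sorts a in place and returns {comparisons while building, comparisons while adjusting}. */
    static long[] sortAndCount(int[] a) {
        comparisons = 0;
        for (int i = a.length / 2 - 1; i >= 0; i--) {
            maxHeapify(a, a.length, i);
        }
        long build = comparisons;
        for (int end = a.length - 1; end > 0; end--) {
            int tmp = a[0];
            a[0] = a[end];
            a[end] = tmp;
            maxHeapify(a, end, 0);
        }
        return new long[]{build, comparisons - build};
    }

    public static void main(String[] args) {
        for (int n : new int[]{1_000, 10_000, 100_000}) {
            int[] data = new Random(1).ints(n).toArray(); // assumed test data
            long[] counts = sortAndCount(data);
            System.out.printf("n=%d build=%d adjust=%d n*log2(n)=%.0f%n",
                    n, counts[0], counts[1], n * Math.log(n) / Math.log(2));
        }
    }
}
```

On such input the build count should stay within a small multiple of n, while the adjustment count grows roughly in proportion to n * log2(n).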
Recommended usage:
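Because building the initial heap requires a fairly large number of comparisons, heap sorting pays off less on small inputs and is better suited to situations where the data volume is very large (a million records or more). Quick sort is usually implemented recursively, so a naive implementation may run into a stack overflow error when the data volume is very large or the input is unfavorably ordered, whereas the recursion depth of a heap adjustment is bounded by the heap height, log2(n).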
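The heap adjustment can also be written as a plain loop, so that no recursion is involved at all. The following is a minimal sketch under that assumption; the class IterativeHeapSort and its method names are hypothetical, and the million random integers only stand in for a large data set.

```java
import java.util.Random;

/** Hypothetical sketch: heap sort whose adjustment step is a loop instead of recursion. */
final class IterativeHeapSort {

    /** Sinks a[index] within a[0..heapSize-1] until the max-heap property holds. */
    static void siftDown(int[] a, int heapSize, int index) {
        int value = a[index];
        while (true) {
            int child = 2 * index + 1;
            if (child >= heapSize) {
                break; // no children left
            }
            if (child + 1 < heapSize && a[child] < a[child + 1]) {
                child++; // pick the larger child
            }
            if (value >= a[child]) {
                break; // correct position found
            }
            a[index] = a[child]; // move the larger child up
            index = child;
        }
        a[index] = value;
    }

    static void sort(int[] a) {
        for (int i = a.length / 2 - 1; i >= 0; i--) {
            siftDown(a, a.length, i);
        }
        for (int end = a.length - 1; end > 0; end--) {
            int tmp = a[0];
            a[0] = a[end];
            a[end] = tmp;
            siftDown(a, end, 0);
        }
    }

    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(1_000_000).toArray();
        sort(data);
        System.out.println(isSorted(data)); // true
    }
}
```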
5. Java sample code
public class HeapSort {

    private static int[] sort = new int[]{1, 0, 10, 20, 3, 5, 6, 4, 9, 8, 12, 17, 34, 11};

    public static void main(String[] args) {
        buildMaxHeapify(sort);
        heapSort(sort);
        print(sort);
    }

    private static void buildMaxHeapify(int[] data) {
        // Only nodes that have children need to be heapified, so start from the last parent node
        int startIndex = getParentIndex(data.length - 1);
        // Build the maximum heap from the end toward the root, so every subtree is already a correct heap
        for (int i = startIndex; i >= 0; i--) {
            maxHeapify(data, data.length, i);
        }
    }

    /**
     * Create the maximum heap
     *
     * @param data     the array that holds the heap
     * @param heapSize the size of the maximum heap; used during sorting, because the maximum values
     *                 placed at the end no longer belong to the heap
     * @param index    the position that currently needs to be adjusted
     */
    private static void maxHeapify(int[] data, int heapSize, int index) {
        // Compare the current node with its left and right child nodes
        int left = getChildLeftIndex(index);
        int right = getChildRightIndex(index);
        int largest = index;
        if (left < heapSize && data[index] < data[left]) {
            largest = left;
        }
        if (right < heapSize && data[largest] < data[right]) {
            largest = right;
        }
        // If the largest value is not the current node, exchange them. After the exchange the
        // child subtree may no longer be a maximum heap, so it must be readjusted
        if (largest != index) {
            int temp = data[index];
            data[index] = data[largest];
            data[largest] = temp;
            maxHeapify(data, heapSize, largest);
        }
    }

    /**
     * Sorting: the maximum value is placed at the end.
     * Although data is a maximum heap, it is in ascending order after sorting
     *
     * @param data the array to sort; it must already be a maximum heap
     */
    private static void heapSort(int[] data) {
        // Exchange the heap top with the last element, then adjust the maximum heap
        for (int i = data.length - 1; i > 0; i--) {
            int temp = data[0];
            data[0] = data[i];
            data[i] = temp;
            maxHeapify(data, i, 0);
        }
    }

    /**
     * Parent node position
     *
     * @param current
     * @return
     */
    private static int getParentIndex(int current) {
        return (current - 1) >> 1;
    }

    /**
     * Left child node position. Pay attention to the parentheses: addition has higher
     * priority than the shift operator
     *
     * @param current
     * @return
     */
    private static int getChildLeftIndex(int current) {
        return (current << 1) + 1;
    }

    /**
     * Right child node position
     *
     * @param current
     * @return
     */
    private static int getChildRightIndex(int current) {
        return (current << 1) + 2;
    }

    // Print the array level by level, one heap level per line
    private static void print(int[] data) {
        int pre = -2;
        for (int i = 0; i < data.length; i++) {
            if (pre < (int) getLog(i + 1)) {
                pre = (int) getLog(i + 1);
                System.out.println();
            }
            System.out.print(data[i] + "|");
        }
    }

    /**
     * Logarithm with base 2
     *
     * @param param
     * @return
     */
    private static double getLog(double param) {
        return Math.log(param) / Math.log(2);
    }
}
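As a cross-check for the sorted result, one option is the JDK class java.util.PriorityQueue, which is documented as a priority queue based on a priority heap whose head is the least element under natural ordering. The sketch below is only an illustration: the class PriorityQueueCheck and its method ascending are hypothetical, and unlike the in-place sort above it uses extra memory for the queue and the result list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical cross-check: polling a PriorityQueue yields the elements in ascending order. */
final class PriorityQueueCheck {

    static List<Integer> ascending(int[] data) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int value : data) {
            heap.add(value);
        }
        List<Integer> result = new ArrayList<>(data.length);
        while (!heap.isEmpty()) {
            result.add(heap.poll()); // removes the current smallest element
        }
        return result;
    }

    public static void main(String[] args) {
        int[] sample = {1, 0, 10, 20, 3, 5, 6, 4, 9, 8, 12, 17, 34, 11};
        System.out.println(ascending(sample));
        // [0, 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 17, 20, 34]
    }
}
```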