Preface: Java 8 added a lot of new things to HashMap. After seeing HashMap misused, I collected the relevant material and sorted out the knowledge for my own reference. Some pictures and content in this article come from: http://www.VeVB.COM/article/80446.htm
The storage structure of HashMap is shown in the figure: each bucket holds a singly linked list, and when a bucket's list grows beyond 8 nodes it is converted into a red-black tree (provided the table itself already has at least 64 slots; otherwise the table is resized instead). Below 8 nodes, a bucket stays a singly linked list.
1: Some properties of HashMap
```java
public class HashMap<K,V> extends AbstractMap<K,V>
        implements Map<K,V>, Cloneable, Serializable {

    private static final long serialVersionUID = 362498820763181265L;

    // The default initial capacity is 16
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;

    // The maximum capacity
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // The default load factor
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Threshold: when a bucket's linked list grows beyond this length, it is
    // converted into a red-black tree (used in putVal below)
    static final int TREEIFY_THRESHOLD = 8;

    // Threshold: when a bucket's tree shrinks below this size (during resize),
    // it is converted back into a linked list
    static final int UNTREEIFY_THRESHOLD = 6;

    // The smallest table capacity for which buckets may be treeified. The source
    // comment says it should be at least 4 * TREEIFY_THRESHOLD = 32; it is set
    // to 64 to avoid conflicts between the resizing and treeification thresholds
    static final int MIN_TREEIFY_CAPACITY = 64;

    // The bucket array; its length is always a power of two
    transient Node<K,V>[] table;

    transient Set<Map.Entry<K,V>> entrySet;

    // The number of stored key-value mappings; note this is NOT the array length
    transient int size;

    // Counts structural modifications of the map (used for fail-fast iteration)
    transient int modCount;

    // The resize threshold (capacity * load factor); when size exceeds this
    // value, the table is expanded
    int threshold;

    // The load factor
    final float loadFactor;
```

2: HashMap construction methods
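The core helper of the constructors is tableSizeFor, which rounds a requested capacity up to the next power of two. Before reading the constructor source, its behaviour can be checked with a standalone re-implementation (a sketch copied from the logic shown below, not the JDK class itself):

```java
public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Re-implementation of HashMap.tableSizeFor:
    // returns the smallest power of two >= cap
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        // Unsigned right shifts smear the highest set bit into every lower bit
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(999)); // 1024
        System.out.println(tableSizeFor(16));  // 16 (already a power of two)
        System.out.println(tableSizeFor(17));  // 32
    }
}
```

The initial `cap - 1` is what makes exact powers of two map to themselves instead of the next power up.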
```java
// Constructor specifying an initial capacity and a load factor
public HashMap(int initialCapacity, float loadFactor) {
    // The specified initial capacity must be non-negative
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    // If it exceeds the maximum capacity, clamp it to the maximum
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    // The load factor must be positive
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
    this.loadFactor = loadFactor;
    // tableSizeFor rounds the requested capacity up to a power of two. The
    // requested capacity itself is not stored anywhere; it is only used to
    // derive this value. If, when putting data, the size exceeds the
    // threshold, the table expands.
    this.threshold = tableSizeFor(initialCapacity);
}

// Returns the smallest power of two >= cap; e.g. passing 999 returns 1024
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    // Unsigned right shifts
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    // Nested ternary operators
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

// Constructor 2
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

// Constructor 3
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
```

3: Determining an element's position in the array on get and put
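To preview how HashMap locates a bucket, here is a small standalone sketch that reproduces the hash-spreading and index computation explained below (the String key and the table length of 16 are arbitrary choices for illustration):

```java
public class BucketIndexDemo {
    // Same spreading as HashMap.hash(): XOR the high 16 bits into the low 16
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int length = 16;          // table length, always a power of two
        String key = "example";
        // Equivalent to hash % length when length is a power of two, but faster
        int index = hash(key) & (length - 1);
        System.out.println(index >= 0 && index < length); // true: always in range
        System.out.println(hash(null)); // 0: null keys always land in bucket 0
    }
}
```

Note that a null key hashes to 0, which is why HashMap can support one null key: it always lives in bucket 0.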
```java
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
```

Locating an element takes two steps.
Step 1: compute the key's hash code, an int. The `h >>> 16` shift, according to the source comments, spreads the high bits of the hash down into the low bits to reduce collisions; it is a trade-off made after weighing speed, utility, and quality of bit-spreading.
Step 2: with h the spread hash code and length the length of the Node[] array above, compute h & (length - 1). Since length is a power of two, length - 1 is all ones in binary, so ANDing with it simply keeps the low bits of h; each of those bits may be 0 or 1, so keys spread uniformly across buckets. This uniformity is very important: it greatly affects the put and get performance of HashMap. Compare the following two pictures:
Figure 3.1: an uneven hash distribution
Figure 3.2: a balanced hash distribution
There is not much data in these two figures; the effect would be more obvious once a list longer than 8 nodes is converted into a red-black tree. (Before JDK 8, a bucket was always a linked list.) Linked-list lookup is O(n), while a red-black tree, thanks to its self-balancing property, offers O(log n) lookup. If the hash results are uneven, the complexity of every operation suffers badly. For background, see this <a href="http://blog.chinaunix.net/uid-26575352-id-3061918.html">red-black tree primer</a>. There is also an example circulating online that verifies this: use a custom object as the key and tweak its hashCode() method to see how long put takes.
```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

public class MutableKeyTest {
    public static void main(String[] args) {
        class MyKey {
            Integer i;

            public MyKey(Integer i) {
                this.i = i;
            }

            public void setI(Integer i) {
                this.i = i;
            }

            @Override
            public int hashCode() {
                // Change this to "return 1;" to force every key into one bucket
                return i;
            }

            // An object used as a map key must also implement equals
            @Override
            public boolean equals(Object obj) {
                if (obj instanceof MyKey) {
                    return i.equals(((MyKey) obj).i);
                } else {
                    return false;
                }
            }
        }

        // On a modest machine, 25,000 entries take around 27 ms; try a few
        // million. With hashCode() returning a constant 1, even this size
        // slows to a crawl.
        Map<MyKey, String> map = new HashMap<>(25000, 1);
        Date begin = new Date();
        for (int i = 0; i < 25000; i++) {
            map.put(new MyKey(i), "test " + i);
        }
        Date end = new Date();
        System.out.println("Time(ms) " + (end.getTime() - begin.getTime()));
    }
}
```

4: The get method
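Before the source, one subtlety of get's contract worth knowing: it returns null both for an absent key and for a key explicitly mapped to null, so containsKey is needed to tell the two apart. A quick runnable check:

```java
import java.util.HashMap;

public class GetDemo {
    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<>();
        map.put("a", null);                       // HashMap allows null values
        System.out.println(map.get("a"));         // null (key present, value null)
        System.out.println(map.get("b"));         // null (key absent)
        System.out.println(map.containsKey("a")); // true
        System.out.println(map.containsKey("b")); // false
    }
}
```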
```java
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    // hash & (length - 1) locates the bucket: either the root of a
    // red-black tree or the head of a linked list
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            // If the bucket is a tree, traverse the red-black tree;
            // fetching the node takes O(log n)
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // Otherwise the bucket is a linked list
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
```

5: The put method: locate the bucket via hash & (length - 1), then handle it as a red-black tree or a linked list in putVal
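One detail of putVal worth previewing: when a mapping for the key already exists, put replaces the value and returns the old one; otherwise it returns null. A small runnable check:

```java
import java.util.HashMap;

public class PutDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        System.out.println(map.put("k", 1)); // null: no previous mapping
        System.out.println(map.put("k", 2)); // 1: the old value is returned
        System.out.println(map.get("k"));    // 2: the value was replaced
    }
}
```

This corresponds to the `return oldValue` branch in the putVal source below.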
```java
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // If the table is null or empty, allocate it via resize()
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // (n - 1) & hash finds the bucket; if it is empty, insert directly
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        // The first node has the same hash and an equal key as the inserted key
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            // Tree insertion is more involved: after inserting, the tree may
            // need to be rebalanced to preserve the red-black invariants
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            // The bucket is a linked list
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // Reached the tail without finding an equal key:
                    // append a new node
                    p.next = newNode(hash, key, value, null);
                    // After appending, convert the list to a red-black tree
                    // if it has reached the threshold
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // Found an existing node with the same hash and an equal key
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        // A mapping for this key already exists: update its value
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
```

6: The resize method
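Before the source, the key trick in resize: because capacity doubles, each node either stays at its old index j or moves to j + oldCap, decided by a single bit of its hash, (e.hash & oldCap). A tiny arithmetic sketch (the hash values 5 and 21 are arbitrary examples):

```java
public class ResizeSplitDemo {
    // Where a node at old index (hash & (oldCap - 1)) lands after doubling:
    // it stays if the oldCap bit of its hash is 0, else moves up by oldCap
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // Hashes 5 and 21 share bucket 5 in a 16-slot table...
        System.out.println(5 & (oldCap - 1));     // 5
        System.out.println(21 & (oldCap - 1));    // 5
        // ...but split apart after resizing to 32 slots
        System.out.println(newIndex(5, oldCap));  // 5  (the "lo" list: stays at j)
        System.out.println(newIndex(21, oldCap)); // 21 (the "hi" list: j + oldCap)
        // Either way, the result equals a full recomputation hash & (newCap - 1)
        System.out.println(newIndex(21, oldCap) == (21 & (2 * oldCap - 1))); // true
    }
}
```

This is why resize never needs to rehash keys: one bit test splits each bucket into the "lo" and "hi" lists seen in the source below.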
```java
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // The important line: each resize doubles the capacity
        if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        // (e.hash & oldCap) == 0 means the node stays at
                        // index j; otherwise it moves to index j + oldCap
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
```

That concludes this analysis of the Java 8 HashMap implementation. I hope it is helpful to you!