Analysis of the principle of HashMap in Java

Author：Eve Cole Update Time：2025-05-03 20:48:02

Let's first look at an interview question:

A series of key-value pairs stored in a HashMap, where the key is a certain type we customize. After putting in HashMap, we change the attributes of a key externally, and then we use this key to remove elements from the HashMap. What will HashMap return at this time?

Sample code and answers have been given in the article, but no explanation is given about the principle of HashMap.

1. Features

We can use any class as the HashMap key, but what are the restrictions on these classes? Let's look at the following code:

 public class Person { private String name; public Person(String name) { this.name = name; }}Map<Person, String> testMap = new HashMap<>();testMap.put(new Person("hello"), "world");testMap.get(new Person("hello")); // ---> null

I originally wanted to get the value of the Person class with equal field values, but the result was null. Anyone who has a little knowledge of HashMap can see that the Person class does not have an override hashcode method, which leads to its inheritance of the Object hashcode (returning is its memory address). This is also the reason why invariant classes such as String (or Integer, etc.) are commonly used as the key of HashMap. So, how does HashMap use hashcode to quickly index the key?

2. Principle

First, let’s take a look at a simple HashMap implementation solution in "Thinking in Java":

 //: containers/SimpleHashMap.java// A demonstration hashed Map.import java.util.*;import net.mindview.util.*;public class SimpleHashMap<K,V> extends AbstractMap<K,V> { // Choose a prime number for the hash table size, to achieve a uniform distribution: static final int SIZE = 997; // You can't have a physical array of generics, but you can upcast to one: @SuppressWarnings("unchecked") LinkedList<MapEntry<K,V>>[] buckets = new LinkedList[SIZE]; public V put(K key, V value) { V oldValue = null; int index = Math.abs(key.hashCode()) % SIZE; if(buckets[index] == null) buckets[index] = new LinkedList<MapEntry<K,V>>(); LinkedList<MapEntry<K,V>> bucket = buckets[index]; MapEntry<K,V> pair = new MapEntry<K,V>(key, value); boolean found = false; ListIterator<MapEntry<K,V>> it = bucket.listIterator(); while(it.hasNext()) { MapEntry<K,V> iPair = it.next(); if(iPair.getKey().equals(key)) { oldValue = iPair.getValue(); it.set(pair); // Replace old with new found = true; break; } } if(!found) buckets[index].add(pair); return oldValue; } public V get(Object key) { int index = Math.abs(key.hashCode()) % SIZE; if(buckets[index] == null) return null; for(MapEntry<K,V> iPair : buckets[index]) if(iPair.getKey().equals(key)) return iPair.getValue(); return null; } public Set<Map.Entry<K,V>> entrySet() { Set<Map.Entry<K,V>> set= new HashSet<Map.Entry<K,V>>(); for(LinkedList<MapEntry<K,V>> bucket : buckets) { if(bucket == null) continue; for(MapEntry<K,V> mpair : bucket) set.add(mpair); } return set; } public static void main(String[] args) { SimpleHashMap<String,String> m = new SimpleHashMap<String,String>(); m.putAll(Countries.capitals(25)); System.out.println(m); System.out.println(m.get("ERITREA")); System.out.println(m.entrySet()); }}

SimpleHashMap constructs a hash table to store keys. The hash function is the modulus operation Math.abs(key.hashCode()) %SIZE, and the hash conflict is resolved by using the linked list method; each slot of buckets stores Map.Entry with the same (hash) index value, as shown in the figure below:

The implementation principle of JDK's HashMap is similar to that, which uses the hash table of the chain address to store Map.Entry:

 /** * The table, resized as necessary. Length MUST Always be a power of two. */transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;static class Entry<K,V> implements Map.Entry<K,V> { final K key; V value; Entry<K,V> next; int hash; …}

The index of Map.Entry is obtained by hashing the hashcode of the key. When you want to get the value corresponding to the key, calculate its index for the key, and then take out the Map.Entry in the table to get it. For details, see the code:

 public V get(Object key) { if (key == null) return getForNullKey(); Entry<K,V> entry = getEntry(key); return null == entry ? null : entry.getValue();}final Entry<K,V> getEntry(Object key) { if (size == 0) { return null; } int hash = (key == null) ? 0 : hash(key); for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) { Object k; if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) return e; } return null;}

It can be seen that hashcode directly affects the efficiency of HashMap's hash function - a good hashcode will greatly reduce hash conflicts and improve query performance. At the same time, this also explains the two questions raised at the beginning: if a custom class does HashMap key, the hashcode calculation should cover all fields of the constructor, otherwise it is possible to get null.