Java collection - Detailed explanation of equals and hashCode methods in Java

Author：Eve Cole Update Time：2025-06-01 04:48:02

The equals method and hashCode method in Java are in Object, so each object has these two methods. Sometimes we need to implement specific needs and may have to rewrite these two methods. Today, we will introduce the functions of these two methods.

The equals() and hashCode() methods are used to compare in the same class, especially when storing the same class object in the container such as set to store objects in the same class.

Here we first need to understand a problem:

Two objects with equals() equals, hashcode() must be equal, and two objects with equals() not equals, cannot prove that their hashcode() is not equal. In other words, for two objects whose equals() method is not equal, hashCode() may be equal. (My understanding is caused by the hash code conflicts when generated)

Here hashCode is like the index of each character in the dictionary, and equals() is like comparing different words under the same character in the dictionary. Just like in the dictionary, searching for the two words "self" and "spontaneous" under the word "self" in the dictionary, if equals() is used to determine the equality of the words query, it is the same word. For example, the two words compared by equals() are "self", then the values obtained by hashCode() method must be equal at this time; if equals() method compares the words "self" and "spontaneous", then the result is that you don't want to wait, but both of these words belong to the words "self" and so when searching for indexes, that is, hashCode() is the same. If equals() compares the words "self" and "they", then the results are also different, and the results obtained by hashCode() are also different at this time.

Conversely: hashcode() is different, and equals() can be introduced; hashcode() is equal, equals() may be equal or may not be equal. In the object class, the hashcode() method is a local method, which returns the address value of the object. The equals() method in the object class also compares the address values of the two objects. If equals() is equal, it means that the address values of the two objects are also equal, of course hashcode() is equal;

At the same time, the hash algorithm provides high efficiency for finding elements

If you want to find out whether an object is contained in a collection, how do you write the approximate program code?

You usually take out each element one by one to compare with the object you are looking for. When you find that the result of the equals method comparison between an element and the object you are looking for, stop searching and return positive information. Otherwise, return negative information. If there are many elements in a collection, such as 10,000 elements and do not contain the object you are looking for, it means that your program needs to take out 10,000 elements from the collection and compare one by one to get a conclusion.

Someone has invented a hash algorithm to improve the efficiency of finding elements from a set. This way, the set is divided into several storage areas. Each object can calculate a hash code, and the hash code can be grouped (calculated using different hash functions). Each group corresponds to a certain storage area. According to the hash of an object, it can determine which area the object should be stored in. HashSet uses a hash algorithm to access the set of objects. It internally uses the method of taking the remainder of a certain number n (this hash function is the easiest) to group and divide the hash codes. The Object class defines a hashCode() method to return the hash code of each Java object. When looking for an object from a HashSet collection, the Java system first calls the hashCode() method of the object to obtain the hash code table of the object. , then find the corresponding storage area based on the hashing, and finally obtain each element in the storage area and compare it with the object for equals method; in this way, you can get the conclusion without traversing all elements in the collection. It can be seen that the HashSet collection has good object retrieval performance, but the efficiency of storing objects in the HashSet collection is relatively low, because when adding an object to the HashSet collection, you must first calculate the hash code of the object and determine the storage location of the object in the collection based on this hash code. In order to ensure that the instance objects of a class can be stored normally in HashSet, the results of the two instance objects of this class must also be equal when the results compared by the equals() method are equal; that is, if the result of obj1.equals(obj2) is true, then the result of the following expression must also be true:
obj1.hashCode() == obj2.hashCode()

In other words: when we rewrite an object's equals method, we must rewrite its hashCode method. However, if we do not rewrite its hashCode method, the hashCode method in the Object object always returns the hash address of an object, and this address is never equal. So even if the equals method is rewritten at this time, there will be no specific effect, because if the hashCode method does not want to wait, it will not call the equals method for comparison, so it is meaningless.

If the hashCode() method of a class does not follow the above requirements, then when the results of the comparison between the two instance objects of this class are equal with the equals() method, they should not be stored in the set set at the same time. However, if they are stored in the HashSet set, since the return value of their hashCode() method is different (the return value of the hashCode method in the Object is always different), the second object may be put into a different area from the first object first according to the hash code calculation, so it cannot If you can compare the equals method with the first object, it may be stored in the HashSet collection. The hashCode() method in the Object class cannot meet the requirements of the object being stored in the HashSet, because its return value is calculated from the object's memory address. The hash value returned by the same object at any time during the program run is always unchanged. Therefore, as long as it is two different instance objects, even if the comparison results of their equals method are equal, the return value of their default hashCode method is different.

Let’s take a look at a specific example:

 RectObject object: package com.weijia.demo; public class RectObject { public int x; public int y; public RectObject(int x,int y){ this.x = x; this.y = y; } @Override public int hashCode(){ final int prime = 31; int result = 1; result = prime * result + x; result = prime * result + y; return result; } @Override public boolean equals(Object obj){ if(this == obj) return true; if(obj == null) return false; if(getClass() != obj.getClass()) return false; final RectObject other = (RectObject)obj; if(x != other.x){ return false; } if(y != other.y){ return false; } return true; } }

We overridden the hashCode and equals methods in the parent class Object and see that in the hashCode and equals methods, if the x and y values of the two RectObject objects are equal, their hashCode values are equal, and equals returns true;

Here is the test code:

 package com.weijia.demo; import java.util.HashSet; public class Demo { public static void main(String[] args){ HashSet<RectObject> set = new HashSet<RectObject>(); RectObject r1 = new RectObject(3,3); RectObject r2 = new RectObject(5,5); RectObject r3 = new RectObject(3,3); set.add(r1); set.add(r2); set.add(r3); set.add(r1); System.out.println("size:"+set.size()); } }

We stored four objects into the HashSet and printed the size of the set collection. What is the result?

Running result: size: 2

Why is it 2? This is very simple, because we rewrite the hashCode method of the RectObject class. As long as the x and y attribute values of the RectObject object are equal, then its hashCode values are also equal. So first compare the hashCode values. The x and y attribute values of r1 and r2 objects are not equal, so their hashCodes are different, so the r2 object can be put in, but the x and y attribute values of the r3 object are the same as the attribute values of the r1 object, so the hashCode is equal. At this time, we compare the equals method of r1 and r3, because the x and y values of the two are equal, so r1 and r3 objects are equal, so r3 cannot be put in. Also, adding a r1 in the end is not added, so there is only one two objects, r1 and r2 in the set set.

Next, we comment the hashCode method in the RectObject object, that is, we do not override the hashCode method in the Object object, and run the code:

Running result: size:3

This result is also very simple. First, judge the hashCode of r1 object and r2 object. Because the hashCode method in Object returns the conversion result of the object's local memory address, the hashCode of different instance objects is different. Similarly, because the hashCode of r3 and r1 is also unequal, but r1==r1, so in the final set, there are only three objects r1, r2, and r3, so the size is 3

Next, we comment the contents of the equals method in the RectObject object, directly return false, without commenting the hashCode method, and run the code:

Running result: size:3

This result is a bit unexpected, let's analyze it:

First, the objects of r1 and r2 compare hashCode, which are not equal, so r2 is put into set, and then look at the hashCode methods of r3 and compare r1 and r3, which are equal, and then compare their equals methods. Because the equals method always returns false, r1 and r3 are also unequal, and there is no need to mention r3 and r2. The hashCodes of their two are not equal, so r3 is put into set, and then look at r4, and compare r1 and r4 find that hashCodes are equal. When comparing the equals method, because equals returns false, r1 and r4 are not equal, the same r2 and r4 are also unequal, and r3 and r4 are also unequal, so r4 can be placed in the set set, so the result should be size:4, so why is it 3?

At this time, we need to check the source code of HashSet. Here is the source code of the add method in HashSet:

 /** * Adds the specified element to this set if it is not already present. * More formally, adds the specified element <tt>e</tt> to this set if * this set contains no element <tt>e2</tt> such that * <tt>(e==null ? e2==null : e.equals(e2))</tt>. * If this set already contains the element, the call leaves the set * unchanged and returns <tt>false</tt>. * * @param e element to be added to this set * @return <tt>true</tt> if this set did not already contain the specified * element */ public boolean add(E e) { return map.put(e, PRESENT)==null; }

Here we can see that HashSet is actually implemented based on HashMap. We are clicking on the put method of HashMap, the source code is as follows:

 /** * Associates the specified value with the specified key in this map. * If the map previously contained a mapping for the key, the old * value is replaced. * * @param key key with which the specified value is to be associated * @param value value to be associated with the specified key * @return the previous value associated with <tt>key</tt>, or * <tt>null</tt> if there was no mapping for <tt>key</tt>. * (A <tt>null</tt> return can also indicate that the map * previously associated <tt>null</tt> with <tt>key</tt>.) */ public V put(K key, V value) { if (key == null) return putForNullKey(value); int hash = hash(key); int i = indexFor(hash, table.length); for (Entry<K,V> e = table[i]; e != null; e = e.next) { Object k; if (e.hash == hash && ((k = e.key) == key || key.equals(k))) { V oldValue = e.value; e.value = value; e.recordAccess(this); return oldValue; } } modCount++; addEntry(hash, key, value, i); return null; }

Let's mainly look at the judgment conditions of if.

First, we determine whether the hashCode is equal. If it is not equal, skip it directly. If it is equal, then compare whether these two objects are equal or the equals method of these two objects. Because it is performed or operated, as long as one is true, we can explain it here. In fact, the size of the above set is 3, because the last r1 was not put in, and it was thought that r1==r1 returned true, so it was not put in. So the size of the set is 3. If we set the hashCode method to always return false, this set will be 4.

Finally, let’s take a look at the memory leak caused by hashCode: look at the code:

 package com.weijia.demo; import java.util.HashSet; public class Demo { public static void main(String[] args){ HashSet<RectObject> set = new HashSet<RectObject>(); RectObject r1 = new RectObject(3,3); RectObject r2 = new RectObject(5,5); RectObject r3 = new RectObject(3,3); set.add(r1); set.add(r2); set.add(r3); r3.y = 7; System.out.println("Size before deletion:"+set.size()); set.remove(r3); System.out.println("Size after deletion:"+set.size()); } }

Running results:

Size before deletion: 3
Deleted size: 3

Rush, I found a problem, and it was a big problem. We called remove to delete the r3 object, thinking that it had been deleted, but in fact it was not deleted. This is called memory leakage, which is an unused object, but it is still in memory. So after we operate this many times, the memory explodes. Take a look at the source code of remove:

 /** * Removes the specified element from this set if it is present. * More formally, removes an element <tt>e</tt> such that * <tt>(o==null ? e==null : o.equals(e))</tt>, * if this set contains such an element. Returns <tt>true</tt> if * this set contained the element (or equivalently, if this set * changed as a result of the call). (This set will not contain the * element once the call returns.) * * @param o object to be removed from this set, if present * @return <tt>true</tt> if the set contained the specified element */ public boolean remove(Object o) { return map.remove(o)==PRESENT; }

Then look at the source code of the remove method:

 /** * Removes the mapping for the specified key from this map if present. * * @param key key whose mapping is to be removed from the map * @return the previous value associated with <tt>key</tt>, or * <tt>null</tt> if there was no mapping for <tt>key</tt>. * (A <tt>null</tt> return can also indicate that the map * previously associated <tt>null</tt> with <tt>key</tt>.) */ public V remove(Object key) { Entry<K,V> e = removeEntryForKey(key); return (e == null ? null : e.value); }

Let's take a look at the removeEntryForKey method source code:

 /** * Removes and returns the entry associated with the specified key * in the HashMap. Returns null if the HashMap contains no mapping * for this key. */ final Entry<K,V> removeEntryForKey(Object key) { int hash = (key == null) ? 0 : hash(key); int i = indexFor(hash, table.length); Entry<K,V> prev = table[i]; Entry<K,V> e = prev; while (e != null) { Entry<K,V> next = e.next; Object k; if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) { modCount++; size--; if (prev == e) table[i] = next; else prev.next = next; e.recordRemoval(this); return e; } prev = e; e = next; } return e; }

We see that when calling the remove method, we will first use the hashCode value of the object to find the object and then delete it. This problem is because we have modified the value of the y attribute of the r3 object. And because the hashCode method of the RectObject object has y value participating in the operation, the hashCode of the r3 object has changed, so r3 is not found in the remove method, so the deletion failed. That is, the hashCode of r3 has changed, but the location it stores has not been updated and is still in its original location, so when we use its new hashCode to find it, we will definitely not find it.
In fact, the above method is very simple to implement: as shown in the figure:

A very simple linear hash table, the hash function used is mod, the source code is as follows:

 /** * Returns index for hash code h. */ static int indexFor(int h, int length) { return h & (length-1); }

This is actually a mod operation, but this kind of operation is more efficient than % operation.

1,2,3,4,5 means the result of mod, and each element corresponds to a linked list structure. Therefore, if you want to delete an Entry<K,V>, you will first get hashCode, so as to get the header node of the linked list, and then iterate through the linked list. If hashCode and equals are equal, delete this element.
The above memory leak tells me a message: If we participate in the hashCode operation of the object's attribute value, we cannot modify its attribute value when deleting it, otherwise serious problems will occur.

In fact, we can also look at the hashCode method and equals method of the object types corresponding to the 8 basic data types.

Among them, the hashCode of the basic type in 8 is very simple to directly return their numerical size. The String object is through a complex calculation method, but this calculation method can ensure that if the values of this string are equal, their hashCode will be equal. The 8 basic types of equals methods are to directly compare numeric values, and the String type equals method compares the values of strings.

The above is all the content of this article. I hope it will be helpful to everyone's learning and I hope everyone will support Wulin.com more.