Understanding the hashCode Method in Java in Depth

Author：Eve Cole Update Time：2025-08-20 12:16:01

Wikipedia describes hashCode as follows:

 In the Java programming language, every class implicitly or explicitly provides a hashCode() method, which digests the data stored in an instance of the class into a single hash value (a 32-bit signed integer).

hashCode condenses the data stored in an object instance into a 32-bit integer. The integer works as a compact fingerprint of the instance, somewhat like an MD5 digest computed from a file. It is not a unique identifier, though: there are only 2^32 possible values, so different objects can end up with the same hashCode.

Let's look at the Object class first. Object is the direct or indirect superclass of every class in a Java program and sits at the top of the class hierarchy. It defines several common methods, including the hashCode method discussed here:

    public final native Class<?> getClass();

    public native int hashCode();

    public boolean equals(Object obj) {
        return (this == obj);
    }

    public String toString() {
        return getClass().getName() + "@" + Integer.toHexString(hashCode());
    }

Note the native modifier on hashCode: the method is implemented outside Java, inside the JVM. It is often described as returning the object's memory address, but the Javadoc only says the value may or may not be derived from the address. What matters is that it is tied to object identity, so two distinct objects will normally return different values.
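As a small check of this default behavior, the sketch below compares the inherited hashCode() with System.identityHashCode() and rebuilds the default toString() text by hand. The class name IdentityHashDemo and its helper methods are made up for illustration, and the actual hash value differs from run to run.

```java
public class IdentityHashDemo {
    // Returns true when hashCode() matches the JVM's identity hash for the object.
    static boolean usesIdentityHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    // Rebuilds what Object.toString() prints: class name, '@', hex hash code.
    static String defaultToString(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Object o = new Object();
        System.out.println(usesIdentityHash(o));                      // true for a plain Object
        System.out.println(defaultToString(o).equals(o.toString()));  // true
    }
}
```

A class that overrides hashCode(), such as String, would generally make usesIdentityHash return false.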

Many Java classes override equals and hashCode. Why? Take String, the most common example: if I create two strings with the same characters, I want them to compare as equal. Without overriding equals and hashCode, two distinct String objects would never be equal, because the inherited equals compares references and they are two different objects. String overrides hashCode as follows (the version shown comes from an older JDK):

    public int hashCode() {
        int h = hash;
        if (h == 0) {
            int off = offset;
            char val[] = value;
            int len = count;
            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }

This code evaluates the following expression, where ^ denotes exponentiation:

 s[0]*31^(n-1) + s[1]*31^(n-2) + … + s[n-1]

Here s[i] is the i-th character of the string and n is its length. Why 31 rather than some other number? Effective Java explains it this way: 31 was chosen because it is an odd prime. If the multiplier were even and the multiplication overflowed, information would be lost, because multiplying by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31*i == (i<<5) - i. Modern VMs perform this optimization automatically.
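To make the formula concrete, here is a minimal sketch that recomputes the hash by hand, once with the multiplication and once with the shift-and-subtract form, and compares both with String.hashCode(). The class and method names (PolyHashDemo, polyHash, shiftHash) are illustrative only.

```java
public class PolyHashDemo {
    // Evaluates s[0]*31^(n-1) + ... + s[n-1] step by step; int overflow is intentional.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Same computation with 31*h written as (h << 5) - h.
    static int shiftHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = ((h << 5) - h) + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "hashCode";
        System.out.println(polyHash(s) == s.hashCode());   // true
        System.out.println(shiftHash(s) == s.hashCode());  // true
    }
}
```

Both loops agree even when the value overflows, because int arithmetic wraps around in the same way for the two forms.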

As you can see, String computes its hashCode from its character content, so two strings with the same content always have the same hashCode. This matches the general rule: if two objects are equal according to equals, their hashCodes must be equal. The converse does not hold: two objects with the same hashCode are not necessarily equal.
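A short sketch of that rule, using two separately allocated strings with the same characters. The class name EqualStringsDemo and the helper sameHash are made up for this example.

```java
public class EqualStringsDemo {
    // True when the two strings report the same hash code.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with the same characters.
        String a = new String("hello");
        String b = new String("hello");

        System.out.println(a == b);         // false: two different objects
        System.out.println(a.equals(b));    // true: same characters
        System.out.println(sameHash(a, b)); // true: equal objects, equal hash codes
    }
}
```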

A good hash function tends to produce unequal hashCodes for unequal objects.

Ideally, a hash function distributes unequal instances evenly across all possible hashCode values. That ideal is very hard to reach, and String.hashCode does not reach it: the value is not random but follows the fixed formula above, so it is easy to construct different strings with the same hashCode. For example, "Aa" and "BB" have the same hashCode.

The following code demonstrates this:

    public class Main {
        public static void main(String[] args) {
            Main m = new Main();
            System.out.println(m);
            System.out.println(Integer.toHexString(m.hashCode()));
            String a = "Aa";
            String b = "BB";
            System.out.println(a.hashCode());
            System.out.println(b.hashCode());
        }
    }

Output (the hash on the first two lines varies from run to run):

    Main@2a139a55
    2a139a55
    2112
    2112
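Collisions like this one can be multiplied. Because "Aa" and "BB" have the same length and the same hashCode, any string built by concatenating these two pieces in any order collides with every other string built from the same number of pieces. The sketch below generates such strings; the names CollisionDemo and collidingStrings are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

public class CollisionDemo {
    // Builds every string made of `blocks` pieces, each piece being "Aa" or "BB".
    static List<String> collidingStrings(int blocks) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>();
            for (String prefix : result) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        // AaAa, AaBB, BBAa and BBBB all print the same hash code.
        for (String s : collidingStrings(2)) {
            System.out.println(s + " -> " + s.hashCode());
        }
    }
}
```

With n pieces this yields 2^n different strings sharing one hashCode, which is why a hash-based collection can never rely on hashCode alone and always confirms a match with equals.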

Generally, when you override equals, you also need to override hashCode. Why?

Let's look at an example. We create a simple class, Employee:

    public class Employee {
        private Integer id;
        private String firstname;
        private String lastName;
        private String department;

        public Integer getId() {
            return id;
        }

        public void setId(Integer id) {
            this.id = id;
        }

        public String getFirstname() {
            return firstname;
        }

        public void setFirstname(String firstname) {
            this.firstname = firstname;
        }

        public String getLastName() {
            return lastName;
        }

        public void setLastName(String lastName) {
            this.lastName = lastName;
        }

        public String getDepartment() {
            return department;
        }

        public void setDepartment(String department) {
            this.department = department;
        }
    }

The Employee class above only has some very basic properties and getters and setters. Now consider a situation where you need to compare two employees.

    public class EqualsTest {
        public static void main(String[] args) {
            Employee e1 = new Employee();
            Employee e2 = new Employee();
            e1.setId(100);
            e2.setId(100);
            // Prints false in console
            System.out.println(e1.equals(e2));
        }
    }

There is no doubt that the program above prints false. Yet the two objects represent the same employee, and the business logic expects true.

To achieve this, we need to override the equals method.

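    // Requires: import java.util.Objects;
    @Override
    public boolean equals(Object o) {
        if (o == null) {
            return false;
        }
        if (o == this) {
            return true;
        }
        if (getClass() != o.getClass()) {
            return false;
        }
        Employee e = (Employee) o;
        // Compare the Integer ids by value; == would compare object references.
        return Objects.equals(this.getId(), e.getId());
    }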

Add this method to the Employee class above and EqualsTest will print true.
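One detail is worth checking in any equals method that compares Integer fields such as id: applying == to two Integer objects compares references, not values. Small values such as 100 come from a shared cache, so == happens to work for them, while larger values typically do not, assuming default JVM settings. The sketch below illustrates this; the class name BoxedCompareDemo and the helper sameValue are illustrative.

```java
public class BoxedCompareDemo {
    // Value comparison: correct for any pair of non-null Integer objects.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Integer small1 = 100, small2 = 100;
        Integer big1 = 1000, big2 = 1000;

        // == compares references; the result depends on whether the values are cached.
        System.out.println(small1 == small2);
        System.out.println(big1 == big2);

        System.out.println(sameValue(small1, small2)); // true
        System.out.println(sameValue(big1, big2));     // true
    }
}
```

Comparing by value, for example with equals or Objects.equals, gives the same answer for every id.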

So are we done? No, let's change the test method and take a look.

    import java.util.HashSet;
    import java.util.Set;

    public class EqualsTest {
        public static void main(String[] args) {
            Employee e1 = new Employee();
            Employee e2 = new Employee();
            e1.setId(100);
            e2.setId(100);
            // Prints 'true'
            System.out.println(e1.equals(e2));

            Set<Employee> employees = new HashSet<Employee>();
            employees.add(e1);
            employees.add(e2);
            // Prints two objects
            System.out.println(employees);
        }
    }

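The program above prints two objects. If equals returns true for the two Employee objects, the Set should store only one of them. What went wrong?

We forgot the second important method, hashCode(). As the Javadoc for Object.equals notes, whenever you override equals() you generally need to override hashCode() as well, so that equal objects have equal hash codes. HashSet uses the hash code to decide where to look for an existing element. With the inherited hashCode(), e1 and e2 almost always report different values, so the set treats them as different objects without ever consulting equals. Let's add the following method and the program will behave correctly.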

    @Override
    public int hashCode() {
        final int PRIME = 31;
        int result = 1;
        // Guard against a null id so hashCode never throws.
        result = PRIME * result + (getId() == null ? 0 : getId().hashCode());
        return result;
    }
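Putting the pieces together, here is a self-contained sketch with a trimmed-down Employee that keeps only the id field and checks how many elements a HashSet ends up holding. The wrapper class EmployeeSetDemo and the helper distinctCount are made up for illustration and are not part of the original example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class EmployeeSetDemo {
    // Trimmed-down Employee: only the id field, which drives equals and hashCode.
    static class Employee {
        private final Integer id;

        Employee(Integer id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            if (o == this) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return Objects.equals(id, ((Employee) o).id);
        }

        @Override
        public int hashCode() {
            final int PRIME = 31;
            return PRIME * 1 + (id == null ? 0 : id.hashCode());
        }
    }

    // Adds one Employee per id and reports how many the set kept.
    static int distinctCount(Integer... ids) {
        Set<Employee> employees = new HashSet<>();
        for (Integer id : ids) {
            employees.add(new Employee(id));
        }
        return employees.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount(100, 100)); // 1
        System.out.println(distinctCount(100, 200)); // 2
    }
}
```

Removing the hashCode() override from this sketch would normally bring the first count back to 2, which is the behavior seen in the earlier test.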

Things to remember

Use the same fields of the object in both hashCode() and equals(). In our case, that is the employee id.
The equals method must be consistent (as long as the objects are not modified, equals should keep returning the same value).
Whenever a.equals(b) is true, a.hashCode() must equal b.hashCode().
Override both methods together.
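The consistency point has a practical consequence for hash-based collections: if a field used by hashCode() changes while the object sits in a HashSet, the set can no longer find it. The sketch below shows this with a hypothetical mutable Key class; all names in it are made up for illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class MutableKeyDemo {
    // Hypothetical mutable key whose hashCode depends on a changeable id.
    static class Key {
        Integer id;

        Key(Integer id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && Objects.equals(id, ((Key) o).id);
        }

        @Override
        public int hashCode() {
            return 31 + Objects.hashCode(id);
        }
    }

    // Adds a key, mutates it, then asks the set whether it still contains it.
    static boolean foundAfterMutation() {
        Set<Key> keys = new HashSet<>();
        Key k = new Key(100);
        keys.add(k);
        k.id = 200; // the hash code changes while k is stored in the set
        return keys.contains(k);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false
    }
}
```

Keeping the fields used by equals() and hashCode() unchanged while an object is stored in a set or used as a map key avoids this problem.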

Summary

That concludes this in-depth look at the hashCode method in Java. I hope it is helpful. If you spot any shortcomings, please leave a message to point them out.