I believe most people are familiar with the hash table data structure, and in many places, hash tables are used to improve search efficiency. There is a method in Java's Object class:
public native int hashCode();
According to the declaration of this method, it can be seen that the method returns a numerical value of type int and is a local method, so no specific implementation is given in the Object class.
Why does the Object class need such a method? What is its function? Today we will discuss the hashCode method in detail.
1. The function of hashCode method
For programming languages that contain container types, hashCode is basically involved. The same is true in Java. The main function of the hashCode method is to work normally with hash-based sets, such hash sets include HashSet, HashMap and HashTable.
Why do you say so? Consider a situation where when an object is inserted into a collection, how to determine whether the object already exists in the collection? (Note: duplicate elements are not allowed in the collection)
Perhaps most people will think of calling the equals method to compare one by one, and this method is indeed feasible. However, if there are already ten thousand pieces of data or more data in the set, if the equals method is used to compare one by one, efficiency will inevitably be a problem. At this time, the function of the hashCode method is reflected. When a new object is to be added to the collection, the hashCode method of the object is called first to obtain the corresponding hashcode value. In fact, in the specific implementation of HashMap, a table will be used to save the hashcode value of the object that has been saved. If there is no hashcode value in the table, it can be stored directly without any comparison. If the hashcode value exists, its equals method is called to compare with the new element. If the same is true, other addresses will not be stored. Therefore, there is a problem of conflict resolution here. In this way, the number of times the equals method is actually called is greatly reduced. To put it simply: the hashCode method in Java maps the information related to the object (such as the storage address of the object, the field of the object, etc.) into a numeric value according to certain rules, and this value is called a hash value. The following code is the specific implementation of the put method in java.util.HashMap:
public V put(K key, V value) { if (key == null) return putForNullKey(value); int hash = hash(key.hashCode()); int i = indexFor(hash, table.length); for (Entry<K,V> e = table[i]; e != null; e = e.next) { Object k; if (e.hash == hash && ((k = e.key) == key || key.equals(k))) { V oldValue = e.value; e.value = value; e.recordAccess(this); return oldValue; } } modCount++; addEntry(hash, key, value, i); return null; } The put method is used to add a new element to the HashMap. From the specific implementation of the put method, we can know that the hashCode method will be called first to obtain the hashCode value of the element, and then check whether the hashCode value exists in the table. If it exists, call the equals method to redetermine whether the element exists. If it exists, update the value value, otherwise add the new element to the HashMap. From here, we can see that the hashCode method exists to reduce the number of calls to the equals method and thus improve program efficiency.
Some friends mistakenly think that by default, hashCode returns the storage address of the object. In fact, this view is incomplete. It is true that some JVMs directly return the storage address of the object when implemented, but this is not the case most of the time. It can only be said that the storage address may be related to it. The following is the implementation of generating hash hash values in HotSpot JVM:
static inline intptr_t get_next_hash(Thread * Self, oop obj) { intptr_t value = 0 ; if (hashCode == 0) { // This form uses an unguarded global Park-Miller RNG, // so it's possible for two threads to race and generate the same RNG. // On MP system we'll have lots of RW access to a global, so the // mechanism induces lots of coherency traffic. value = os::random() ; } else if (hashCode == 1) { // This variation has the property of being stable (idempotent) // between STW operations. This can be useful in some of the 1-0 // synchronization schemes. intptr_t addrBits = intptr_t(obj) >> 3 ; value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ; } else if (hashCode == 2) { value = 1 ; // for sensitivity testing } else if (hashCode == 3) { value = ++GVars.hcSequence ; } else if (hashCode == 4) { value = intptr_t(obj) ; } else { // Marsaglia's xor-shift scheme with thread-specific state // This is probably the best overall implementation -- we'll // likely make this the default in future releases. unsigned t = Self->_hashStateX ; t ^= (t << 11) ; Self->_hashStateX = Self->_hashStateY ; Self->_hashStateY = Self->_hashStateZ ; Self->_hashStateZ = Self->_hashStateW ; unsigned v = Self->_hashStateW ; v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ; Self->_hashStateW = v ; value = v ; } value &= markOopDesc::hash_mask; if (value == 0) value = 0xBAD ; assert (value != markOopDesc::no_hash, "invariant") ; TEVENT (hashCode: GENERATE) ; return value;}This implementation is located in the hotspot/src/share/vm/runtime/synchronizer.cpp file.
Therefore, some people may say that can we directly judge whether two objects are equal based on the hashcode value? It is certainly not possible, because different objects may generate the same hashcode value. Although it is not possible to judge whether two objects are equal based on the hashcode value, you can directly judge that the two objects are not equal based on the hashcode value. If the hashcode values of the two objects are not equal, they must be two different objects. If you want to determine whether two objects are truly equal, you must use the equals method.
That is to say, for two objects, if the result obtained by calling the equals method is true, the hashcode values of the two objects must be equal;
If the result obtained by the equals method is false, the hashcode values of the two objects may not be different;
If the hashcode values of the two objects are not equal, the result obtained by the equals method must be false;
If the hashcode values of the two objects are equal, the result obtained by the equals method is unknown.
2.equals method and hashCode method
In some cases, when designing a class, programmers need to rewrite the equals method, such as the String class, but be sure to note that while rewriting the equals method, they must rewrite the hashCode method. Why do you say so?
Let's see an example below:
package com.cxh.test1;import java.util.HashMap;import java.util.HashSet;import java.util.Set;class People{ private String name; private int age; public People(String name,int age) { this.name = name; this.age = age; } public void setAge(int age){ this.age = age; } @Override public boolean equals(Object obj) { // TODO Auto-generated method stub return this.name.equals(((People)obj).name) && this.age== ((People)obj).age; }}public class Main { public static void main(String[] args) { People p1 = new People("Jack", 12); System.out.println(p1.hashCode()); HashMap<People, Integer> hashMap = new HashMap<People, Integer>(); hashMap.put(p1, 1); System.out.println(hashMap.get(new People("Jack", 12)); }}Here I have only rewrite the equals method, which means that if two People objects have the same name and age, they are considered to be the same person.
The original intention of this code is to output the result of "1", but in fact it outputs "null". Why? The reason is that you forget to rewrite the hashCode method while rewriting the equals method.
Although two objects with the same name and age are logically determined to be equal (similar to the String class), it should be known that by default, the hashCode method maps the storage address of the object. Then it is not surprising that the output result of the above code is "null". The reason is very simple, the object pointed to by p1 and
System.out.println(hashMap.get(new People("Jack", 12));The new People("Jack", 12) in this sentence generates two objects, and their storage addresses must be different. The following is the specific implementation of HashMap's get method:
public V get(Object key) { if (key == null) return getForNullKey(); int hash = hash(key.hashCode()); for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) { Object k; if (e.hash == hash && ((k = e.key) == key || key.equals(k))) return e.value; } return null; }Therefore, when the hashmap is used to get, because the hashcdoe values obtained are different (note that the above code may get the same hashcode value in some cases, but the probability is relatively small, because although the storage addresses of the two objects are different, it is possible to get the same hashcode value), so the for loop will not be executed in the get method and will directly return null.
Therefore, if you want the above code to output the result "1", it is very simple. You just need to rewrite the hashCode method and make the equals method and hashCode method always logically consistent.
package com.cxh.test1;import java.util.HashMap;import java.util.HashSet;import java.util.Set; class People{ private String name; private int age; public People(String name,int age) { this.name = name; this.age = age; } public void setAge(int age){ this.age = age; } @Override public int hashCode() { // TODO Auto-generated method stub return name.hashCode()*37+age; } @Override public boolean equals(Object obj) { // TODO Auto-generated method stub return this.name.equals(((People)obj).name) && this.age== ((People)obj).age; }}public class Main { public static void main(String[] args) { People p1 = new People("Jack", 12); System.out.println(p1.hashCode()); HashMap<People, Integer> hashMap = new HashMap<People, Integer>(); hashMap.put(p1, 1); System.out.println(hashMap.get(new People("Jack", 12))); }}In this way, the output result will be "1".
The following passage is excerpted from the book Effective Java:
During program execution, as long as the information used in the comparison operation of the equals method has not been modified, the hashCode method must consistently return the same integer.
If the two objects are equal according to the equals method, then calling the hashCode method of the two objects must return the same integer result.
If the two objects are compared inequality according to the equals method, the hashCode method does not necessarily return different integers.
It is easy to understand the second and third articles, but the first article is often ignored. There is also a passage similar to the first one in page P495 in the book "Java Programming Thought":
"The most important factor when designing hashCode() is: whenever you call hashCode() on the same object, the same value should be generated. If an object is added to the HashMap with put(), and another hashCode value is generated when it is taken out with get(), then the object cannot be obtained. So if your hashCode method depends on the variable data in the object, users should be careful, because when this data changes, the hashCode() method will generate a different hash code."
Here is an example:
package com.cxh.test1; import java.util.HashMap;import java.util.HashSet;import java.util.Set;class People{ private String name; private int age; public People(String name,int age) { this.name = name; this.age = age; } public void setAge(int age){ this.age = age; } @Override public int hashCode() { // TODO Auto-generated method stub return name.hashCode()*37+age; } @Override public boolean equals(Object obj) { // TODO Auto-generated method stub return this.name.equals(((People)obj).name) && this.age== ((People)obj).age; }}public class Main { public static void main(String[] args) { People p1 = new People("Jack", 12); System.out.println(p1.hashCode()); HashMap<People, Integer> hashMap = new HashMap<People, Integer>(); hashMap.put(p1, 1); p1.setAge(13); System.out.println(hashMap.get(p1)); }}The result of this code output is "null", and everyone must be clear about the reasons.
Therefore, when designing hashCode methods and equals methods, if the data in the object is volatile, it is best not to rely on this field in the equals methods and hashCode methods.
The above is all the content of this article. I hope it will be helpful to everyone's learning and I hope everyone will support Wulin.com more.