Reasons for using 31 coefficients when defining hashcode

Author：Eve Cole Update Time：2025-08-08 18:32:02

Hash calculation is to calculate which element should be placed in the array. To be precise, it is put in which link list. According to Java rules, if you want to put an object into a HashMap, your object's class must provide a hashcode method and return an integer value. For example, the String class has the following methods:

 public int hashCode() { int h = hash; int len = count; if (h == 0 && len > 0) { int off = offset; char val[] = value; for (int i = 0; i < len; i++) { h = 31*h + val[off++]; } hash = h; } return h; }

Pay attention to the for loop above, isn't it a bit messed up? Let me give you an example so that it is easy to understand what it is making. For example, there is a string "abcde" that uses a 31-digit calculation method to calculate the sum of this string. You will write the following calculation formula:

a*31^4+b*31^3+c*31^2+d*31^1+e*31^0. Note that a, b, c, d or e here refer to their ASCII values. Very interesting loops, which can be used to calculate N-digit. This loop can be extracted separately as a good tool for calculating the partition:

 public static void main(String[] args) { int[] a={1,0}; System.out.println(calculate(2,a)); } private static int calculate(int radix,int[] a){ int sum = 0; for(int i=0;i<a.length;++i){ sum = sum*radix+a[i]; } return sum; }

The static method caculate accepts radix as the cardinality, and array a simulates the number of the calculating to be calculated, just pay attention to the consistent surface order. For example, the 01 binary string should be arranged in the array according to {0,1}. The output above is 1, which meets the true value of 01.

So why use 31 as the base? First, you need to understand why HashCode is needed. Each object calculates HashCode based on the value. Although this code size does not require uniqueness (because this will usually be very slow), it should be as much as possible and not repeated as possible, so the cardinality should be as large as possible. In addition, 31*N can be optimized by the compiler to shift left by 5 bits and then reduce by 1, which has higher performance. In fact, it is still controversial to choose 31, please refer to it here.

I think this thing will still lead to more repetitions and should use larger numbers. So, perhaps there may be some changes in Java implementation in the future. The following article introduces two conclusions:

1. Use prime numbers for cardinality

The characteristics of prime numbers (only 1 and itself are factors) can make the result obtained by multiplying it with other numbers easier to produce uniqueness than other methods, that is, the probability of conflict between hash code values is the smallest.

2. Choice 31 is a choice after observing the distribution results. The reason is not clear, but it is indeed beneficial.

In addition, the first calculated value will be cached internally, because this is a final (immutable) class, that is, the content of the String object will not change. This can improve performance when putting HashMap multiple times, but it seems to be of little use.

Summarize

The above is all about the reasons why 31 coefficients are used when defining hashcode. I hope it will be helpful to everyone. Interested friends can continue to refer to this site:

" Detailed introduction to rewriting hashCode() and equals() methods "

" Detailed explanation of the essential differences and connections between hashCode() and equals() "

" Java Source Code Analysis of HashMap Usage "

If there are any shortcomings, please leave a message to point it out. Thank you friends for your support for this site!