Author
Milan Vucic
Programming Tutor at Codementor.io

HashCode Method in Java: Best Practices

Published in the Java Objects group
Part 1: Equals Method in Java - Best Practices

Hi! Let's talk about the hashCode() method in Java. Why is it necessary? For exactly the same purpose: to compare objects. But we already have equals()! Why another method? The answer is simple: to improve performance.

A hash function, represented in Java by the hashCode() method, returns a fixed-length numerical value for any object. In Java, that value is a 32-bit number (int). Comparing two numbers is much faster than comparing two objects using the equals() method, especially if that method considers many fields. So if our program compares objects, it is much cheaper to compare their hash codes first. Only if the hash codes match does the comparison proceed to the equals() method. By the way, this is how hash-based data structures work, for example, the familiar HashMap!

The hashCode() method, like the equals() method, is meant to be overridden by the developer. And just like equals(), the hashCode() method has official requirements spelled out in the Oracle documentation:
  1. If two objects are equal (i.e. the equals() method returns true), then they must have the same hash code.

    Otherwise, our methods would be meaningless. As we mentioned above, a hashCode() check should go first to improve performance. If the hash codes were different, then the check would return false, even though the objects are actually equal according to how we've defined the equals() method.

  2. If the hashCode() method is called several times on the same object, it must return the same number each time.

  3. Rule 1 does not work in the opposite direction. Two different objects can have the same hash code.
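
The three rules above can be checked on the String class, which already overrides both methods. This is only a small sketch and not part of the original lesson: the class name HashRulesDemo is made up for illustration, and "Aa" and "BB" are a well-known pair of different strings whose hash codes happen to coincide.

```java
class HashRulesDemo {

    public static void main(String[] args) {
        // Two distinct String objects with the same contents.
        String first = new String("CodeGym");
        String second = new String("CodeGym");

        // Rule 1: objects that are equal must have equal hash codes.
        System.out.println(first.equals(second));                   // true
        System.out.println(first.hashCode() == second.hashCode());  // true

        // Rule 2: repeated calls on the same object return the same number.
        System.out.println(first.hashCode() == first.hashCode());   // true

        // Rule 3: different objects may still share a hash code.
        System.out.println("Aa".equals("BB"));                      // false
        System.out.println("Aa".hashCode() == "BB".hashCode());     // true (both are 2112)
    }
}
```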

The third rule is a bit confusing. How can this be? The explanation is quite simple. The hashCode() method returns an int. An int is a 32-bit number. It has a limited range of values: from -2,147,483,648 to +2,147,483,647. In other words, there are just over 4 billion possible values for an int (2^32, or about 4.3 billion).

Now imagine that you're creating a program to store data about all people living on Earth. Each person will correspond to its own Person object (similar to the Man class). There are ~7.5 billion people living on the planet. In other words, no matter how clever the algorithm we write for converting Person objects to an int, we simply don't have enough possible numbers. We have only about 4.3 billion possible int values, but there are a lot more people than that. This means that no matter how hard we try, some different people will have the same hash code.

When this happens (hash codes coincide for two different objects) we call it a collision. When overriding the hashCode() method, one of the programmer's objectives is to minimize the potential number of collisions. Accounting for all these rules, what will the hashCode() method look like in the Person class? Like this:

@Override
public int hashCode() {
   return dnaCode;
}
Surprised? :) If you look at the requirements, you will see that we comply with them all. Objects for which our equals() method returns true will also have the same hashCode() value. If two Person objects are equal according to equals() (that is, they have the same dnaCode), then our method returns the same number for both.

Let's consider a more difficult example. Suppose our program should select luxury cars for car collectors. Collecting can be a complex hobby with many peculiarities. A particular 1963 car can cost 100 times more than a 1964 car. A red 1970 car can cost 100 times more than a blue car of the same brand and year.

In our previous example, with the Person class, we discarded most of the fields (i.e. human characteristics) as insignificant and used only the dnaCode field in comparisons. We're now working in a very idiosyncratic realm, in which there are no insignificant details! Here is our LuxuryAuto class:

public class LuxuryAuto {

   private String model;
   private int manufactureYear;
   private int dollarPrice;

   public LuxuryAuto(String model, int manufactureYear, int dollarPrice) {
       this.model = model;
       this.manufactureYear = manufactureYear;
       this.dollarPrice = dollarPrice;
   }

   // ...getters, setters, etc.
}
Now we must consider all the fields in our comparisons. Any mistake could cost a client hundreds of thousands of dollars, so it would be better to be overly safe:

@Override
public boolean equals(Object o) {
   if (this == o) return true;
   if (o == null || getClass() != o.getClass()) return false;

   LuxuryAuto that = (LuxuryAuto) o;

   if (manufactureYear != that.manufactureYear) return false;
   if (dollarPrice != that.dollarPrice) return false;
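   // Null-safe comparison, consistent with the null check on model in hashCode()
   return model != null ? model.equals(that.model) : that.model == null;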
}
In our equals() method, we haven't forgotten any of the checks we talked about earlier. But now we compare each of the three fields of our objects. For this program, we need absolute equality, i.e. equality of each field. What about hashCode()?

@Override
public int hashCode() {
   int result = model == null ? 0 : model.hashCode();
   result = result + manufactureYear;
   result = result + dollarPrice;
   return result;
}
The model field in our class is a String. This is convenient, because the String class already overrides the hashCode() method. We compute the model field's hash code and then add the other two numeric fields to it.

Java developers have a simple trick that they use to reduce the number of collisions: when computing a hash code, multiply the intermediate result by an odd prime. The most commonly used number is 29 or 31. We will not delve into the mathematical subtleties right now, but in the future remember that multiplying intermediate results by a sufficiently large odd number helps to "spread out" the results of the hash function and, consequently, reduce the number of objects with the same hash code. For our hashCode() method in LuxuryAuto, it would look like this:

@Override
public int hashCode() {
   int result = model == null ? 0 : model.hashCode();
   result = 31 * result + manufactureYear;
   result = 31 * result + dollarPrice;
   return result;
}
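
To see why the multiplier helps, it is enough to compare the two versions on a pair of cars whose year and price are swapped. This is only an illustrative sketch: the class HashSpreadDemo, the helper methods additiveHash and multipliedHash, and the sample values are assumptions made for the demonstration. The two formulas themselves are copied from the hashCode() versions above.

```java
class HashSpreadDemo {

    // Same formula as the first hashCode() version: plain addition.
    static int additiveHash(String model, int manufactureYear, int dollarPrice) {
        int result = model == null ? 0 : model.hashCode();
        result = result + manufactureYear;
        result = result + dollarPrice;
        return result;
    }

    // Same formula as the second version: multiply by 31 at each step.
    static int multipliedHash(String model, int manufactureYear, int dollarPrice) {
        int result = model == null ? 0 : model.hashCode();
        result = 31 * result + manufactureYear;
        result = 31 * result + dollarPrice;
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical data: the two numeric fields are simply swapped.
        int a1 = additiveHash("Sample", 1963, 1964);
        int a2 = additiveHash("Sample", 1964, 1963);
        System.out.println(a1 == a2); // true: addition ignores field order, so this is a collision

        int m1 = multipliedHash("Sample", 1963, 1964);
        int m2 = multipliedHash("Sample", 1964, 1963);
        System.out.println(m1 == m2); // false: the two results differ by 30
    }
}
```

With plain addition, any two objects whose numeric fields add up to the same total collide. Multiplying by 31 makes the position of each field matter.
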
You can read more about all of the intricacies of this mechanism in this post on StackOverflow, as well as in the book Effective Java by Joshua Bloch.

Finally, one more important point that is worth mentioning. Each time we overrode the equals() and hashCode() methods, we selected certain instance fields that are taken into account in these methods. Both methods consider the same fields. But can we consider different fields in equals() and hashCode()? Technically, we can. But this is a bad idea, and here's why:

@Override
public boolean equals(Object o) {
   if (this == o) return true;
   if (o == null || getClass() != o.getClass()) return false;

   LuxuryAuto that = (LuxuryAuto) o;

   if (manufactureYear != that.manufactureYear) return false;
   return dollarPrice == that.dollarPrice;
}

@Override
public int hashCode() {
   int result = model == null ? 0 : model.hashCode();
   result = 31 * result + manufactureYear;
   result = 31 * result + dollarPrice;
   return result;
}
Here are our equals() and hashCode() methods for the LuxuryAuto class. The hashCode() method remained unchanged, but we removed the model field from the equals() method. The model is no longer a characteristic used when the equals() method compares two objects. But when calculating the hash code, that field is still taken into account. What do we get as a result? Let's create two cars and find out!

public class Main {

   public static void main(String[] args) {

       LuxuryAuto ferrariGTO = new LuxuryAuto("Ferrari 250 GTO", 1963, 70000000);
       LuxuryAuto ferrariSpider = new LuxuryAuto("Ferrari 335 S Spider Scaglietti", 1963, 70000000);

       System.out.println("Are these two objects equal to each other?");
       System.out.println(ferrariGTO.equals(ferrariSpider));

       System.out.println("What are their hash codes?");
       System.out.println(ferrariGTO.hashCode());
       System.out.println(ferrariSpider.hashCode());
   }
}

Are these two objects equal to each other? 
true 
What are their hash codes? 
-1372326051 
1668702472
Error! By using different fields in the equals() and hashCode() methods, we violated the contract established for them! Two objects that are equal according to the equals() method must have the same hash code. We received different values for them. Such errors can lead to absolutely unbelievable consequences, especially when working with collections that use a hash: for example, a HashSet or HashMap may fail to find an object that equals() says is already there. So when you override equals() and hashCode(), make them consider the same fields.

This lesson was rather long, but you learned a lot today! :) Now it's time to get back to solving tasks!
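
One postscript on those hash-based collections. The sketch below shows what the broken contract can do to a HashSet. It is an illustration under assumptions, not code from the lesson: BrokenAuto is a hypothetical stand-in for LuxuryAuto that repeats the mismatched equals() and hashCode() from above, and the behavior shown is what the standard OpenJDK HashSet does when the hash codes differ.

```java
import java.util.HashSet;
import java.util.Set;

class BrokenContractDemo {

    // Hypothetical copy of LuxuryAuto with the mismatched methods.
    static class BrokenAuto {
        private final String model;
        private final int manufactureYear;
        private final int dollarPrice;

        BrokenAuto(String model, int manufactureYear, int dollarPrice) {
            this.model = model;
            this.manufactureYear = manufactureYear;
            this.dollarPrice = dollarPrice;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            BrokenAuto that = (BrokenAuto) o;
            // The model field is ignored here...
            return manufactureYear == that.manufactureYear
                    && dollarPrice == that.dollarPrice;
        }

        @Override
        public int hashCode() {
            // ...but it still takes part in the hash code.
            int result = model == null ? 0 : model.hashCode();
            result = 31 * result + manufactureYear;
            result = 31 * result + dollarPrice;
            return result;
        }
    }

    public static void main(String[] args) {
        BrokenAuto gto = new BrokenAuto("Ferrari 250 GTO", 1963, 70000000);
        BrokenAuto spider = new BrokenAuto("Ferrari 335 S Spider Scaglietti", 1963, 70000000);

        Set<BrokenAuto> garage = new HashSet<>();
        garage.add(gto);

        System.out.println(gto.equals(spider));      // true
        System.out.println(garage.contains(spider)); // false: the set looks under a different hash

        garage.add(spider);
        System.out.println(garage.size());           // 2, although the two cars are "equal"
    }
}
```

The set ends up holding two objects that equals() treats as the same car, which is exactly the kind of surprise the contract is meant to prevent.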
