Hash Principle

Before we define the Java hashcode, we need to understand what hashing is and what it is for. Hashing is the process of applying a hash function to some data. A hash function is just a mathematical function. Don’t worry about this! “Mathematical” does not always mean “complicated”. Here it only means that we have some data and a rule that maps that data to a code of a fixed size. We take input data of any size and apply a hash function to it. At the output, we get fixed-size data, for example a string of 32 hexadecimal characters. In programming, such a function usually converts a big piece of data into a small integer value. The result of this function is called a hash code. Hash functions are widely used in cryptography and in some other areas too. Hash functions can be different, but they all have certain properties:
  • The same object always produces the same hash code, as long as its data does not change.
  • If two objects are equal, their hashcodes are the same. The reverse is not true.
  • If the hash codes are different, then the objects are definitely not equal.
  • Different objects may have the same hash code. This situation is called a collision. A good hash function makes it unlikely, but any code that relies on hash codes still has to handle it, because a structure that ignores collisions can lose or overwrite data.
A "proper" hash function minimizes the probability of collisions.
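The properties above can be checked with String, whose hashCode() is part of the standard library. This is only a small sketch: the class name HashPropertiesDemo and the helper isCollision are made up for illustration, and "Aa" and "BB" are a well-known pair of short strings that share a hash code.

```java
public class HashPropertiesDemo {
    // True when two different strings share one hash code (a collision).
    public static boolean isCollision(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String first = "hash me";
        String second = new String("hash me"); // a separate object with equal content

        // Equal inputs always give equal hash codes.
        System.out.println(first.equals(second));                  // true
        System.out.println(first.hashCode() == second.hashCode()); // true

        // Different inputs can still collide: both strings below hash to 2112.
        System.out.println("Aa".hashCode());         // 2112
        System.out.println("BB".hashCode());         // 2112
        System.out.println(isCollision("Aa", "BB")); // true
    }
}
```

The collision does not make the two strings equal. It only means that the hash code alone cannot tell them apart, so equals() still has the final word.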

Hashcode in Java

In Java, the hash function is usually associated with the hashCode() method. More precisely, the result of applying a hash function to an object is its hashcode. Every Java object has a hash code. In general, a hash code is a number calculated by the hashCode() method, which every class inherits from the Object class. Programmers usually override this method for their own classes, together with the related equals() method, so that equality and hashing reflect the object's data. The hashCode() method returns an int (4 bytes) value, which is a numeric representation of the object. This hashcode is used, for example, by collections for more efficient storage of data and, accordingly, faster access to it. By default, hashCode() returns an implementation-specific "identity" value for the object. It is often described as the object's memory address, but the JVM is not required to compute it that way. What is guaranteed is that, within one run of the program, the method keeps returning the same value for the same object as long as the data used by equals() does not change. The value may differ from one run of the program to the next. What is the hashcode used for in Java? First of all, Java hashcodes help programs run faster. For example, if we compare two objects o1 and o2 of some type, comparing two int values with o1.hashCode() == o2.hashCode() is usually much cheaper than a full o1.equals(o2) that compares every field. If the hash codes differ, the objects cannot be equal, and equals() does not have to be called at all.
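To see the default behavior in practice, here is a minimal sketch. The names DefaultHashCodeDemo, Plain and matchesIdentityHash are hypothetical and exist only for this illustration. It relies on System.identityHashCode(), which returns the value that the default hashCode() would return for an object.

```java
public class DefaultHashCodeDemo {
    // Hypothetical class that does not override hashCode() or equals().
    static class Plain { }

    // True when the object's hashCode() is the default identity hash code.
    public static boolean matchesIdentityHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        Plain p = new Plain();

        // Within one run, repeated calls return the same value.
        System.out.println(p.hashCode() == p.hashCode()); // true

        // Without an override, hashCode() is the identity hash code.
        System.out.println(matchesIdentityHash(p));       // true

        // The number itself is JVM-specific and may change between runs.
        System.out.println(p.hashCode());
    }
}
```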

Java equals()

In the parent class Object, along with the hashCode() method, there is also equals(), the method that is used to check the equality of two objects. The default implementation simply checks whether the two references point to the same object. equals() and hashCode() share a contract, so if you override one of them, you should override the other in order not to break this contract.
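A short sketch of the default equals() behavior follows. The class Box and the helper sameFieldsAreEqual are invented for this example. Box deliberately does not override equals(), so two boxes with the same value are still reported as different.

```java
public class DefaultEqualsDemo {
    // Hypothetical class that inherits equals() from Object unchanged.
    static class Box {
        final int value;

        Box(int value) {
            this.value = value;
        }
    }

    // Compares two separate Box objects that hold the same value.
    public static boolean sameFieldsAreEqual() {
        return new Box(1).equals(new Box(1));
    }

    public static void main(String[] args) {
        Box first = new Box(1);
        Box alias = first; // second reference to the same object

        System.out.println(first.equals(alias));  // true: same object
        System.out.println(sameFieldsAreEqual()); // false: different objects

        // String overrides equals(), so equal content is enough.
        System.out.println(new String("java").equals(new String("java"))); // true
    }
}
```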

Implementing the hashCode() method

Example

Let’s create a class Character with one field: name. (Inside this example the name Character refers to our own class, not to java.lang.Character.) After that, we create two objects of the Character class, character1 and character2, and give them the same name. If we use the default hashCode() and equals() of the Object class, the two objects will not be considered equal. They will almost certainly have different hashcodes, because they are two distinct objects, and the result of the equals() operation will be false.

public class Character {
    // Field names start with a lowercase letter by Java convention
    private String name;

    public Character(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public static void main(String[] args) {
        Character character1 = new Character("Arnold");
        System.out.println(character1.getName());
        System.out.println(character1.hashCode());
        Character character2 = new Character("Arnold");
        System.out.println(character2.getName());
        System.out.println(character2.hashCode());
        System.out.println(character2.equals(character1));
    }
}
The result of running the program:

Arnold
1595428806
Arnold
1072408673
false
Two 10-digit numbers in the console are hashcodes; the exact values will probably be different on your machine. What if we want objects with the same name to be equal? What should we do? The answer: we should override the hashCode() and equals() methods of the Object class in our Character class. We can generate them automatically in IntelliJ IDEA: press Alt + Insert and choose Generate -> equals() and hashCode(). In the case of our example, we get the following code:

public class Character {
    // Field names start with a lowercase letter by Java convention
    private String name;

    public Character(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Character)) return false;

        Character character = (Character) o;

        return getName() != null ? getName().equals(character.getName()) : character.getName() == null;
    }

    @Override
    public int hashCode() {
        return getName() != null ? getName().hashCode() : 0;
    }

    public static void main(String[] args) {
        Character character1 = new Character("Arnold");
        System.out.println(character1.getName());
        System.out.println(character1.hashCode());
        Character character2 = new Character("Arnold");
        System.out.println(character2.getName());
        System.out.println(character2.hashCode());
        System.out.println(character2.equals(character1));
    }
}
The result of running this code:

Arnold
1969563338
Arnold
1969563338
true
So now the program identifies our objects as equal and they have the same hashcodes.

Your own hashCode() and equals()

You may also write your own equals() and hashCode() implementations, but be careful and remember to keep hashcode collisions to a minimum. Here is an example of our own hashCode() and equals() methods in the Student class:

import java.util.Date;
import java.util.Objects;

public class Student {
   String surname;
   String name;
   String secondName;
   Long birthday; // Long instead of long is used by Gson/Jackson JSON parsers and various ORM frameworks

   public Student(String surname, String name, String secondName, Date birthday) {
       this.surname = surname;
       this.name = name;
       this.secondName = secondName;
       // A missing birthday is stored as 0, the epoch (1 January 1970, 00:00 UTC)
       this.birthday = birthday == null ? 0 : birthday.getTime();
   }

   @Override
   public int hashCode() {
       // Objects.hash() is null-safe and combines the fields one by one.
       // Hashing the concatenated string would make "ab" + "c" collide with "a" + "bc".
       return Objects.hash(surname, name, secondName, birthday);
   }

   @Override
   public boolean equals(Object other) {
       if (this == other) return true;
       // Also rejects null, so the cast below is safe
       if (!(other instanceof Student)) return false;
       Student that = (Student) other;
       // Objects.equals() treats two nulls as equal and never throws NullPointerException
       return Objects.equals(surname, that.surname)
               && Objects.equals(name, that.name)
               && Objects.equals(secondName, that.secondName)
               && Objects.equals(birthday, that.birthday);
   }
}
And the Main class to demonstrate their work:

import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.HashMap;

public class Main {
   static HashMap<Student, Integer> cache = new HashMap<>(); // <person, targetPriority>

   // Builds a Date from a calendar date. The Date(int, int, int) constructor
   // is deprecated and counts years from 1900, so it is not used here.
   static Date date(int year, int month, int day) {
       return new GregorianCalendar(year, month, day).getTime();
   }

   public static void main(String[] args) {
       Student sarah1 = new Student("Sarah", "Connor", "Jane", null);
       Student sarah2 = new Student("Sarah", "Connor", "Jane", date(1970, Calendar.JANUARY, 1));
       Student sarah3 = new Student("Sarah", "Connor", "Jane", date(1959, Calendar.FEBRUARY, 28));
       Student john = new Student("John", "Connor", "Kyle", date(1985, Calendar.FEBRUARY, 28));
       Student johnny = new Student("John", "Connor", "Kyle", date(1985, Calendar.FEBRUARY, 28));
       System.out.println(john.hashCode());
       System.out.println(johnny.hashCode()); // same value as john: all fields are equal
       System.out.println(sarah1.hashCode());
       System.out.println();
       cache.put(sarah1, 1);
       cache.put(sarah2, 2);
       cache.put(sarah3, 3);
       System.out.println(new Date(sarah1.birthday)); // the epoch, shown in the local time zone
       System.out.println();
       cache.put(john, 5);
       System.out.println(cache.get(john));   // 5
       System.out.println(cache.get(johnny)); // 5, because johnny equals john
       cache.put(johnny, 7);                  // replaces the value stored under the equal key
       System.out.println(cache.get(john));   // 7
       System.out.println(cache.get(johnny)); // 7
   }
}

What is hashcode used for?

First of all, hashcodes help programs run faster. For example, if we compare two objects o1 and o2 of some type, comparing two int values with o1.hashCode() == o2.hashCode() is usually much cheaper than a full o1.equals(o2), and different hash codes already prove that the objects are not equal. In Java, the hashing principle stands behind some popular collections, such as HashMap, HashSet and Hashtable.
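To make the link between hash codes and these collections more concrete, here is a simplified sketch of how a hash table could turn a hash code into a bucket number. The class BucketSketch and the method bucketIndex are hypothetical, and the table length is assumed to be a power of two. Real collections are free to do this differently, so treat it as an illustration of the idea rather than the library's code.

```java
public class BucketSketch {
    // Simplified bucket selection, assuming tableLength is a power of two.
    public static int bucketIndex(Object key, int tableLength) {
        int h = (key == null) ? 0 : key.hashCode();
        h ^= (h >>> 16);              // mix the high bits into the low bits
        return h & (tableLength - 1); // keep only the low bits as the index
    }

    public static void main(String[] args) {
        int tableLength = 16;
        String[] keys = {"Arnold", "Sarah", "John", "Kyle"};
        for (String key : keys) {
            System.out.println(key + " -> bucket " + bucketIndex(key, tableLength));
        }

        // Keys with the same hash code always land in the same bucket.
        System.out.println(bucketIndex("Aa", tableLength) == bucketIndex("BB", tableLength)); // true
    }
}
```

A lookup then only needs to call equals() on the few keys stored in that one bucket instead of on every key in the collection, which is where the speed comes from.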

Naive Implementations of hashCode()

Naive implementations of hashCode() are basic approaches to generate a hash code for an object. While these implementations can be functional, they often have significant limitations in terms of performance and collision handling.

Example 1: Returning a Constant Value

public class NaiveHashCodeExample1 {
    private String name;

    public NaiveHashCodeExample1(String name) {
        this.name = name;
    }

    @Override
    public int hashCode() {
        return 42; // Constant value
    }
}

This implementation, while valid, causes all objects to have the same hash code. As a result, hash-based collections like HashMap or HashSet lose their performance benefits because all elements end up in the same bucket.

Example 2: Using the length() of a String Attribute

public class NaiveHashCodeExample2 {
    private String name;

    public NaiveHashCodeExample2(String name) {
        this.name = name;
    }

    @Override
    public int hashCode() {
        return name.length();
    }
}

This implementation is slightly better but still problematic. It creates hash codes based solely on the length of the name attribute, leading to high collision rates for strings of the same length but different content.

Limitations of Naive Implementations

  • High Collision Rates: Naive implementations often result in multiple objects having the same hash code, reducing the efficiency of hash-based collections.
  • Poor Distribution: Hash codes generated by naive methods tend to cluster, undermining the performance of hash tables.
  • Lack of Scalability: Simple implementations may not scale well as the dataset grows in complexity or size.
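
The collision problem is easy to measure on a small sample. The sketch below counts how many distinct hash codes a list of names produces with the length-based approach and with String.hashCode(). The class CollisionCountDemo, its two helper methods and the sample names are all made up for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CollisionCountDemo {
    // Number of distinct hash codes when the hash is just the string length.
    public static int distinctLengthHashes(List<String> names) {
        Set<Integer> seen = new HashSet<>();
        for (String name : names) {
            seen.add(name.length());
        }
        return seen.size();
    }

    // Number of distinct hash codes produced by String.hashCode().
    public static int distinctStringHashes(List<String> names) {
        Set<Integer> seen = new HashSet<>();
        for (String name : names) {
            seen.add(name.hashCode());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Anna", "John", "Kyle", "Mary", "Sara", "Rick");

        System.out.println(distinctLengthHashes(names)); // 1: every name has length 4
        System.out.println(distinctStringHashes(names)); // 6: one hash code per name
    }
}
```

With the length-based hash, all six names would share a single bucket, while String.hashCode() gives each name its own hash code.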

The Importance of Overriding hashCode() with equals()

In Java, the contract between hashCode() and equals() is critical. If two objects are considered equal according to the equals() method, they must also have the same hash code. Failing to maintain this contract can lead to unpredictable behavior in hash-based collections.

Why This Contract Matters

  • Consistency: If hashCode() is not overridden when equals() is overridden, objects that are logically equal may not be treated as equal in collections like HashMap or HashSet.
  • Correct Functionality: Maintaining this contract ensures that hash-based collections work as intended, avoiding bugs like missing elements or failed lookups.

Example: Violating the Contract

import java.util.HashSet;

public class HashCodeEqualsExample {
    private String name;

    public HashCodeEqualsExample(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        HashCodeEqualsExample other = (HashCodeEqualsExample) obj;
        return name.equals(other.name);
    }

    // hashCode is not overridden
}

// Test is package-private so that both classes can share one source file
class Test {
    public static void main(String[] args) {
        HashSet<HashCodeEqualsExample> set = new HashSet<>();
        HashCodeEqualsExample obj1 = new HashCodeEqualsExample("John");
        HashCodeEqualsExample obj2 = new HashCodeEqualsExample("John");

        set.add(obj1);
        System.out.println(set.contains(obj2)); // Output: false
    }
}

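In this example, even though obj1 and obj2 are logically equal, set.contains(obj2) returns false. Because hashCode() is not overridden, the two objects get different identity hash codes, so the HashSet treats them as different elements, usually without even calling equals().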

Proper Implementation of hashCode() and equals()

import java.util.Objects;

public class ProperHashCodeEqualsExample {
    private String name;

    public ProperHashCodeEqualsExample(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        ProperHashCodeEqualsExample other = (ProperHashCodeEqualsExample) obj;
        // Objects.equals() is null-safe, so a null name does not cause an exception
        return Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
}

By overriding both hashCode() and equals(), the contract is maintained, ensuring correct behavior in hash-based collections.
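
As a quick check, the sketch below repeats the HashSet experiment with a class that overrides both methods. The names ContractDemo, Person and containsEqualCopy are hypothetical and only serve this example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class ContractDemo {
    // Hypothetical value class that overrides both methods consistently.
    static final class Person {
        private final String name;

        Person(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Person)) return false;
            return Objects.equals(name, ((Person) obj).name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    // Adds one Person and looks it up through a separate, equal object.
    public static boolean containsEqualCopy(String name) {
        Set<Person> set = new HashSet<>();
        set.add(new Person(name));
        return set.contains(new Person(name));
    }

    public static void main(String[] args) {
        System.out.println(containsEqualCopy("John")); // true

        // Adding an equal object twice keeps a single element.
        Set<Person> set = new HashSet<>();
        set.add(new Person("John"));
        set.add(new Person("John"));
        System.out.println(set.size()); // 1
    }
}
```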

Conclusion

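Every Java object has the hashCode() and equals() methods inherited from the Object class. To get a properly working equality mechanism, you should override both hashCode() and equals() in your own classes and keep them consistent with each other. Well-distributed hashcodes let hash-based collections such as HashMap and HashSet find elements quickly, which makes programs run faster.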