"Now I'll tell you about some methods that are just as helpful: equals(Object o) & hashCode()."
"You've probably already remembered that, in Java, when comparing reference variables the objects themselves aren't compared, but rather the references to the objects."
Code | Explanation |
---|---|
|
i is not equal to j The variables point to different objects. Even though the objects contain the same data. |
|
i equals j. The variables contain a reference to the same object. |
"Yes, I remember that."
The equals.
"The equals method is the standard solution here. The purpose of the equals method is to determine whether objects are internally identical by comparing what is stored inside them."
"And how does it do that?""It's all very similar to the toString() method."
The Object class has its own implementation of the equals method, which simply compares the references:
public boolean equals(Object obj)
{
return (this == obj);
}
"Great... back to that again, are we?"
"Keep your chin up! It's actually very tricky."
"This method was created to allow developers to overwrite it in their own classes. After all, only a class's developer knows what data is relevant and what is not when comparing."
"Can you provide an example?"
"Sure. Suppose we have a class that represents mathematical fractions. It would look like this:"
class Fraction
{
private int numerator;
private int denominator;
Fraction(int numerator, int denominator)
{
this.numerator = numerator;
this.denominator = denominator;
}public boolean equals(Object obj)
{
if (obj==null)
return false;
if (obj.getClass() != this.getClass() )
return false;
Fraction other = (Fraction) obj;
return this.numerator* other.denominator == this.denominator * other.numerator;
}
}
Example method call: |
---|
Fraction one = new Fraction(2,3); Fraction two = new Fraction(4,6); System.out.println(one.equals(two)); |
The method call will return true. The fraction 2/3 is equal to the fraction 4/6 |
"Now, let's dissect this example."
"We overrode the equals method, so Fraction objects will have their own implementation.
"There are several checks in the method:"
"1) If the object passed for comparison is null, then the objects are not equal. If you can call the equals method on an object, then it is definitely not null."
"2) A class comparison. If the objects are instances of different classes, then we won't try to compare them. Instead, we'll immediately use return false to indicate that these are different objects."
"3) Everyone remembers from the second grade that 2/3 is equal to 4/6. But how do you check that?"
2/3 == 4/6 |
---|
We multiply both sides by both divisors (6 and 3), and we get: |
6 * 2 == 4 * 3 |
12 == 12 |
General rule: |
If a / b == c / d Then a * d == c * b |
"Accordingly, in the third part of the equals method, we cast the passed object to a Fraction and compare the fractions."
"Got it. If we simply compare the numerator with the numerator and the denominator with the denominator, then 2/3 is not equal to 4/6."
"Now I understand what you meant when you said that only a class's developer knows how to compare it correctly."
"Yes, but that's only half the story. There's another method: hashCode()."
"Everything about the equals method makes sense now, but why do we need hashCode()?"
"The hashCode method is needed for quick comparisons."
"The equals method has a major downside: it works too slowly. Suppose you have a Set of millions of elements and need to check whether it contains a specific object. How do you do that?"
"I could cycle through all the elements using a loop and compare the object with each object in the set. Until I find a match."
"And if it's not there? We'd perform a million comparisons just to find out that the object isn't there? Doesn't that seem like a lot?"
"Yes, even I recognize that is too many comparisons. Is there another way?"
"Yes, you can use hashCode() for this.
The hashCode() method returns a specific number for each object. A class's developer decides what number is returned, just like he or she does for the equals method.
"Let's look at an example:"
"Imagine that you have a million 10-digit numbers. Then, you could make each number's hashCode be the remainder after dividing the number by 100."
Here's an example:
Number | Our hashCode |
---|---|
1234567890 | 90 |
9876554321 | 21 |
9876554221 | 21 |
9886554121 | 21 |
"Yeah, that makes sense. And what do we do with this hashCode?"
"Instead of comparing the numbers, we compare their hashCodes. It's faster that way."
"And we call equals only if their hashCodes are equal."
"Yeah, that's faster. But we still have to make a million comparisons. We're just comparing smaller numbers, and we still have to call equals for any numbers with matching hashCodes."
"No, you can get away with a much smaller number of comparisons."
"Imagine that our Set stores numbers grouped or sorted by hashCode (sorting them in this way is essentially grouping them, since numbers with the same hashCode will be next to each other). Then you can very quickly and easily discard irrelevant groups. It's enough to check once per group to see whether its hashCode matches the object's hashCode."
"Imagine you're a student looking for a friend you can recognize by sight and whom we know lives in Dorm 17. Then you just go to every dorm at the university and ask, 'Is this Dorm 17?' If it's not, then you ignore everyone in the dorm and move on to the next. If the answer is 'yes', then you start walking past each of the rooms, looking for your friend."
"In this example, the dorm number (17) is the hashCode."
"A developer who implements a hashCode function must know the following:"
A) two different objects can have the same hashCode (different people can live in the same dorm)
B) objects that are the same (according to the equals method) must have the same hashCode..
C) hash codes must be chosen so that there aren't a lot of different objects with the same hashCode. If there are, then hashcodes' potential advantages are lost (You get to Dorm 17 and find that half the university lives there. Bummer!).
"And now the most important thing. If you override the equals method, you absolutely must override the hashCode() method and comply with the three rules described above.
"The reason is this: in Java, objects in a collection are always compared/retrieved using hashCode() before they are compared/retrieved using equals. And if identical objects have different hashCodes, then the objects will be considered different and the equals method won't even be called.
"In our Fraction example, if we made the hashCode equal to the numerator, the fractions 2/3 and 4/6 would have different hashCodes. The fractions are the same and the equals method says they're the same, but their hashCodes say they are different. And if we compare using hashCode before we compare using equals, then we conclude that the objects are different and we never even make it to the equals method."
Here's an example:
HashSet<Fraction>set = new HashSet<Fraction>(); set.add(new Fraction(2,3)); System.out.println( set.contains(new Fraction(4,6)) );
|
If the hashCode() method returns the fractions' numerator, the result will be false. And the «new Fraction(4,6)» object will not be found in the collection. |
"So what's the right way to implement hashCode for fractions?"
"Here you need to remember that equivalent fractions must have the same hashCode."
"Version 1: the hashCode equals the result of integer division."
"For 7/5 and 6/5, this would be 1."
"For 4/5 and 3/5, this would be 0."
"But this option is poorly suited for comparing fractions that are deliberately less than 1. The hashCode (result of integer division) will always be 0."
"Version 2: the hashCode equals the result of integer division of the denominator by the numerator."
"This option is suitable for instances where the fraction is less than 1. If the fraction is less than 1, then its inverse is greater than 1. And if we invert all of the fractions, then the comparisons are in no way affected."
"Our final version combines both solutions:"
public int hashCode()
{
return numerator/denominator + denominator/numerator;
}
Let's test it using 2/3 and 4/6. They should have identical hashCodes:
Fraction 2/3 | Fraction 4/6 | |
---|---|---|
numerator / denominator | 2 / 3 == 0 | 4 / 6 == 0 |
denominator / numerator | 3 / 2 == 1 | 6 / 4 == 1 |
numerator / denominator + denominator / numerator |
0 + 1 == 1 | 0 + 1 == 1 |
"That's all for now."
"Thanks, Ellie. That was really interesting."
GO TO FULL VERSION