Harald Kirsch

genug Unfug.

2014-10-07

Java equals and canEqual

I stumbled upon canEqual in Scala code and immediately wondered why this is needed. Searching for Java, equals and canEqual brings up tons of hits on Google, but most of them will have some relation to Scala. When reading the first Google hit it becomes clear that the idea of canEqual has its place in Java too.

The article is well written and describes the problem along examples. I thought to take a more compact and formal approach to concisely describe what an equals method may do and what not, in particular taking inheritance into consideration.

The equals contract

The Javadoc specifies that equals must implement an equivalence relation, which is a relation that is reflexive ($a=a$), symmetric ($a=b\Rightarrow b=a$) and transitive ($a=b\wedge b=c \Rightarrow a=c$).

Ok, symmetric, hmmm? Calling an object's method, like in a.equals(b)is inherently non-symmetric, since it is a's equals that is called and b is "only" a parameter. Suppose

A a = new A(...)

and of course when you implemented A you took utmost care to make it symmetric.

Comes along your colleague, half a year later, and writes:

class B extends A {...}

You forgot to make equals a final method to make sure no derived class can ruin your well crafted equals method. And your colleague thinks that B deserves its own, specific equals method. What are the constraints?

Forced to call super.equals

Symmetry, oooohkeeey!? With

B b = new B(...)

he is forced to ensure that when he implements equals such that

b.equals(a) $\to$ true

which uses B.equals, he has to make sure that also

a.equals(b) $\to$ true

which uses the "old" A.equals of your class. Now consider some arbitrary

A a1 = new A(...) such that a.equals(a1) $\to \alpha$ and
B b1 = new B(...) such that a.equals(b1) $\to \beta$.

The transitivity requirement forces your colleague to make sure that comparing b to a1 and b1 returns the exact same results $\alpha$ and $\beta$ as when comparing a. Since the two were arbitrary objects of A and B, this is true for all elements of these two classes. The safest way to get this result is to make sure that b.equals(...) calls super.equals(...). To summarize

Conclusion 1: If an object B b = new B(...) of a subclass of A shall have b.equals(a)$\to$true for at least one a of A, then for this b the parent implementation super.equals() should be called to treat b as if it were genuinely of class A.

The small room for a new (in)equality

If every b has at least one a of A with which it shall be equal, then obviously super.equals would be always called. But that would mean we don't need to implement B.equals in the first place.

Consequently there is at least one B bx = new B(...) which has

bx.equals(a)$\to$false for all A a = new A(...)

This leads to

Conclusion 2: If a subclass B overrides equals of its parent class A, its objects belong to one of two disjoints sets:
  1. those which have at least one B.equals partner from A and
  2. those that are not equal to any element of the superclass.
The second set must not be empty, since otherwise the derived equals need not be implemented.

Enforcing the inequality for the superclass

Due to symmetry, a.equals(bx) must also return false for all objects a created with new A(...). But how can it be that the implementation of A.equals, which was written when no bx yet existed, can do just the right thing when some such new type of object comes along? What A.equals typically does is if (bx instanceof A), but this returns true for the objects of the derived class and makes no difference between as and bs.

The solution is the canEqual method featured in the title. But since the mentioned article describes it so well, I don't need to repeat this here.