Pages: Welcome | Projects

Comparators and Equality, +1

2015/7/9
Tags: [ Ideas ] [ java ]

The Comparable<T> interface of Java is somewhat redundant with the implicitly ubiquitous boolean equals(Object) method. Even though they address different goals (total order the first, semantic comparison the second), I personally feel happier when

foo.compareTo(bar) == 0 ⇔ foo.equals(bar)

Today I found myself implementing Comparable<Something> into a class Something, which was already supposed to implement the hashCode() and boolean equals(Object) methods. The problem arises from the fact I need to use a tree to store some data I also store in a hash table. And trees require total ordering.

I'm not really sure this is a best practice, but if I need something to be hashable, first thing I enforce it. So this was my starting point:

public static abstract class Something {
    public abstract int hashCode();
    public abstract boolean equals(Object o);
}

Now the question is: given this Something, how can we have the compareTo method playing nice with equals?

Given the fact I don't really care about the total ordering as long as lookups in a tree are correct, I came out with this:

public int compareTo(Something other) {
    if (other.equals(this)) return 0;

    final int my = hashCode();
    final int its = other.hashCode();
    return my - its;
}

This seems good enough: if the objects are equal the compareTo returns 0, otherwise it returns the difference of the two objects' hash values. Easy. But buggy.

This is the kind of error which doesn't show up often, but when it does you suddenly have objects mysteriously disappearing from your tree. Two unrelated objects A and B could have the same hash value. This might be unlikely if your Something.hashCode was chosen wisely, but it can happen. If it does, the comparison of two non-equals objects will return 0. If you inserted both A and B into a tree in sequence, the former gets replaced by the latter.

The solution is: just avoid 0 unless other.equals(this). You can do it by keeping negative values, and shifting of one every positive integer:

public int compareTo(Something other) {
    if (other.equals(this)) return 0;

    final int my = hashCode();
    final int its = other.hashCode();
    final int out = my - its;

    /* +1 is to avoid returning 0, which would be equals */
    return out < 0 ? out : out + 1;
}

Then I recalled that difference with arbitrary values can be sneaky because of overflows. There could be three objects A, B and C such that, because of an overflow:

A.compareTo(B) < 0
B.compareTo(C) < 0
C.compareTo(A) < 0

…and this is hardly a total ordering. Let's avoid the problem entirely and go with the plain old less-than comparison:

public int compareTo(Something other) {
    if (other.equals(this)) return 0;

    final int my = hashCode();
    final int its = other.hashCode();

    /* Simply skip the case `my == its` => 0, and let it
     * return 1 anyway.
     */
    return my < its ? -1 : 1;
}