The other day I was talking to my colleague about an elegant way to convert a Java array to an
ArrayList. For some reason, this immediately set off red flags in my head. As you’ll see in
the chat log below, I immediately responded with a barrage of questions.
Aloha, the machine learning and feature extraction library I wrote, has integrations with Microsoft’s Multiworld Testing (MWT) java library to help with the logging of exploration data. When my colleague mentioned Vowpal Wabbit’s ActionProbs, I realized that he was referring to the interactions with the MWT code.
After staring at the
interface for a second or two, I thought “why?”, echoing back to the chat I just had. I was
thinking “it’s weird how specific this API is”. There’s no reason that one could possibly
ArrayList at an API level. The OOP
programmer in me thought “no one should care that much about the implementation of a list.”
What the user really cares about is a random access
In OOP languages like Java, we often hear about programming to an interface. This is very common but it starts to break down when we want to program to multiple interfaces. Fortunately Java has some support for intersection types. By intersection types, I mean multiple inheritance. While Java (before Java 8) only allows single inheritance, there is limited support for describing classes that implement multiple interfaces. This is pretty seamless when passing an argument with an intersection type to a function, but this seems much more problematic when trying to return a value from a function. For instance the following works (in Java 6+):
Here the function accepts any random access
just like we wanted, and returns the size of the list. But how do we return an intersection type?
I don’t think it’s really possible to do in a generic way. This is a deficiency in the Java
language but makes sense since Java only allowed single inheritance. An astute programmer will
notice things in the Java standard library like
which returns a regular
List but states in the javadoc,
The returned list is serializable and implements RandomAccess.
This is not, however, codified at the type-level, which should be disconcerting to anyone who has ever noticed a disparity over time between API documentation and the code it professes to document. As a consequence, we cannot do something like:
The best partial workaround that I can find is to add a type parameter to the class or interface containing the method with an intersection type as a return type. For instance, we could change the Scorer interface from above to be:
but this introduces a new type parameter which may be undesirable and it falls short in at least two ways. It doesn’t alleviate the programmer’s burden of specifying a concrete implementation because he would still have to write something like:
ArrayList again appears in the type signatures. This is because while type bounds can be
encoded using intersection types, the concrete type cannot be specified as an intersection type.
The second way in which this strategy falls short is in static methods. This doesn’t help static
methods in any way. As a hackish workaround, one could move the type parameter encoding the
intersection type to the method. This seems like a faux pas because the return value must be cast
and the Java compiler will issue an “Unchecked cast” warning, indicating this may result in future
issues. And this is indeed possible here. Any API user could inadvertently subvert the benefits
of this strategy by supplying a valid type argument to the method, which adheres to the type bounds,
but that disagrees with the actual type of the value returned by the method. In which case, a
ClassCastException will be thrown. For instance, given a method:
the following results in the throwing of a
If the API user swears on his honor not to be a bad actor, this could potentially be useful because functions returning insection types could be interleaved with functions that accept them without ever having to provide type arguments. For instance, the following is valid:
BUT the whole point of this article, and for that matter type systems themselves, is that APIs and interactions with APIs shouldn’t involve an honor system, but rather, should be encoded in the type signatures provided by the API.
So while APIs like the
API are most likely too precise and APIs like
Arrays.asList are imprecise, providing the
proper amount of precision is difficult in Java when it involves the interaction of two or more
orthogonal interfaces. So what can be done?
Switch to Scala!
Scala, which runs on the JVM, has much better facilities for such types due to support for multiple inheritance and mixin composition. So while properly encoding these APIs in Java was hard, it’s rather easy in Scala.
And using the APIs is easy too (as shown in the Scala REPL).
Types that don’t adhere to the full type constraint are disallowed, like in the following since
LinkedList is not a
And ones that do adhere are successfully accepted like in the case of
is a random access list):
It’s important to think about the specificity of APIs and only require of the user what is necessary. Presumably this should involve programming to interfaces, but this is rather hard to guarantee at the type-level in Java. This has led to Java APIs that are either too specific or rely on the honor system rather than the contractual requirements of the APIs themselves. A richer type system like Scala’s type system can overcome these deficiencies, thereby allowing API developers to ask for no more than is necessary (even when requesting multiple orthogonal constraints). The ultimate consequence is that API developers will develop more general libraries and will avoid pigeonholing API users into specific implementations. This really seems like a win-win situation.