top | item 10892696

How we used Category Theory to solve a problem in Java

134 points| it | 10 years ago |techblog.realestate.com.au | reply

32 comments

order
[+] archgoon|10 years ago|reply
This post really needs to explain why plausible alternative solutions (ideally ones used in real world open source projects) which are not monoids rapidly lead to having to deal with increased complexity. Otherwise, it's just bringing in a lot of terminology waving their hands saying "We came up with a solution that let us have mutiple extensions!" which, on the surface, seems pretty boilerplate.
[+] theoh|10 years ago|reply
It's worse than that, I think. They had some code that expects input in a certain format, they wanted it to accept input in a variety of other formats, and category theory was apparently necessary to come up with the idea of a format converter from each new format to the original format. Extremely basic. Trying to pretend it's a CT application is just pretentious in the extreme.
[+] tome|10 years ago|reply
Imagine you've got ten pieces you need to stick end to end

    a b c d e f g h i j
How many ways can you do that with a binary operation? i.e. how many ways can you completely parenthesise that expression. e.g.

    ((a (((b (c d)) e) ((f g) h)) i) j)
I haven't worked it out exactly, but I think it's about a trillion.

On the other hand, suppose your binary operation is associative. How many different ways are there now? Effectively one.

That's a huge reduction in complexity. The notion of associativity is extremely useful in developing composable programs.

[+] bjacks|10 years ago|reply
I totally agree with this, it was like 'here's a solution using category theory', but what were the alternatives? How is this useful/applicable to me, and why should I learn category theory?
[+] agentgt|10 years ago|reply
I never really gave it that much thought but I guess we use extensive category theory as well in our code base via RxJava and our own functional abstractions. IMO modern Java both code and design really has been going in that direction for some time (Guava, RxJava, Java 8, reactive programming etc). I think it boils down to the fact that composition of async things without functor like behavior is a rough. That is I want to say how the process should be but not actually run the process and delegate that to something else.
[+] pklausler|10 years ago|reply
They lost me at the point, after demonstrating how a functor can be defined on a list, when they claimed that a functor can similarly be defined on a binary tree.

One of the ten or so "Aha!" moments that each Haskell novice must experience on one's journey to mastery is the realization that you can't make Set an instance of Functor, at least as Functor is commonly defined in the standard Prelude. And the reason why one cannot do so is echoed by their misleading diagram in which their functor replaces each node in a binary tree with its mapped value. What needs to be appreciated is that the resulting tree is no longer ordered (assuming that the original was), and cannot be used to locate an element in logarithmic time.

[+] platz|10 years ago|reply
They didn't say it was a binary search tree. The term Tree can simply be a root value and subtrees of children with a parent node, represented as a set of linked nodes.
[+] drostie|10 years ago|reply
So this is an interesting bit of thought but I'm trying to figure out what it corresponds to in Haskell. Clearly what they're doing with `S` is effectively a `Reader s` monad in `Control.Monad.Reader`, and clearly what they're doing with `a -> a` is the `Endo a` monoid in `Data.Monoid`.

We can definitely create a `newtype` which composes any applicative with a monoid in the appropriate way:

    newtype ApMonoid a m = ApMonoid (a m)
    instance (Applicative a, Monoid m) => Monoid (ApMonoid a m) where
        mempty = pure mempty
        mappend (ApMonoid x) (ApMonoid y) = Wrap $ mappend <$> x <*> y
Then they have two data types: (1) they pull in a bunch of `Extras` up front and then each transformer successively pares down the extras to some subset of data that it cares about, then (2) each function transforms the search `Results`. We could say that this is:

    data Extras = Extras {e1 :: {- type1 -}, e2 :: {- type2 -}, ... eN :: {- typeN -}}
and their resulting transformation looks like:

    transN (eN extra) . ... . trans2 (e2 extra) . trans1 (e1 extra) :: Results -> Results
so I guess I'm trying to store `transN . eN` as something like an `ApMonoid (Reader Extras) (Endo Results)` via the above. The logic looks pretty solid.

However, if it doesn't kill the basic logic, it might be more Haskell-y to compose these by lifting the monoid to an applicative using `Const`, so that in this case:

    newtype ApAp a b x = ApAp (a (b x)) deriving (Functor)
    instance (Applicative a, Applicative b) => Applicative (ApAp a b) where
        pure = pure . pure
        (<*>) = liftA2 (<*>)
We would instead say that this is `ApAp (Reader Extras) (Const (Endo Results))`. Can anyone with more experience with such things comment on whether those are the same? I'm a little shaky here.

Also, there's no generic way to "drop" an applicative to be simply a monoid, right? We need an existing Monoid to feed to `ApMonoid`, no?

[+] Kutta|10 years ago|reply
`type Extension s = s -> Endo Result` is already good enough IMO, since it's `Monoid` and we can also contramap thanks to `Profunctor (->)`. We don't have to bring in `ApAp`, which is anyway called `Compose` in `Data.Functor.Compose`.
[+] dmichulke|10 years ago|reply
Looks to me like a very long blog post about why you should use clojure.

I.e., use map and comp and avoid objects (to add 'extensions' the way you want).

The category theory + Java part certainly helps selling this to big enterprises though.

[+] chris_wot|10 years ago|reply
I'm slightly amazed that realestate.com.au has such an advanced tech blog...

Note: I should note that the reason for this was getting to be about 9 years ago, L.J. Hooker (who I believe owns realestate.com.au) were using Btrieve in it's original incarnation to manage all their listings. Total nightmare.

[+] mcbain|10 years ago|reply
Yeah, it is interesting, as the other big RE listings player in Australia also has a tech blog: http://tech.domain.com.au/

realestate.com.au is Murdoch, in fact, but was a Melbourne startup many years ago.

[+] jjviana|10 years ago|reply
"default <T> Extension<T> contraMap(Function<T, S> f) { return (T src, Results results) -> Extension.this.apply(f.apply(src), results); }"

Java 8 Lambdas provide such an elegant syntax for anonymous classes... When lambdas were first introduced in Java , a lot of people were underwhelmed by them. It is hard to evaluate at first the impact they can have on a well designed API.

[+] pka|10 years ago|reply
Is this

    foldMap :: Monoid m => (a -> m) -> t a -> m

?
[+] tel|10 years ago|reply
That just seems broken?

If `f` is Function<T, S> then given `T src`, like we have, then f.apply(src) has type S. But Extension.this.apply is applied to values of S even though T is the thing qualified as an Extension.

Actually, I'm just entirely not sure I know how to read Java type signatures.

[+] swehner|10 years ago|reply
What I understood is that the goal was to reduce the information contained within some search results.

Why would one need anything special to think this through?

[+] dkarapetyan|10 years ago|reply
You wouldn't. I think saying you used category theory to design something just sounds cool and is basically a shibboleth. In fact, any time you compose functions you're using category theory. I think it sounds a little pretentious to be honest. No one says they used boolean logic when they wrote some code so saying you used category theory or type theory sounds equally weird.

I work with a guy that has a PhD in some heavy duty categorical machinery and I haven't heard him once say he used category theory to design some piece of code.

[+] Kutta|10 years ago|reply
I'm skeptical that an ordinary Java person would ever think of this particular solution.

Composing a list of `A -> A` functions into a single one is already sort of a tall order. They would also have to realize that `(B, A) -> A` is equivalent to `B -> (A -> A)`, and that they can compose `A -> A` under the `B` environment. I just don't see this happening.