Amateur Topologist

Everything but topology.

Name Your Type Variables!

Haskell’s type system is commonly touted as one of the most powerful features of the language; it’s often said that if a program compiles, it’s pretty likely to be correct. And while there are always going to be errors your type system can’t catch (logic errors, off-by-one errors, etc.), I’ve found that the type system helps ensure that your programs are put together correctly. If it typechecks, you’re probably not making a structural error; you might be trying to fit a red peg into a blue hole, but you’re not trying to fit a square peg into a round hole, so to speak.

The problem is figuring out what the type variables in a polymorphic type (i.e., a type like length :: [a] -> Int) mean. Sure, in simple cases it’s easy; you look at a type signature like ifM :: Monad m => m Bool -> m a -> m a -> m a and you know that m is going to be some monad and a is the type of some arbitrary result. But take a look at validate :: Monad m => Form m i e v a -> Validator m e a -> Form m i e v a, from digestive-functors. What do m, i, e, v, and a stand for? It’s pretty clear that m is some monad, but what about the others? It turns out that:

  • i represents the input to the form, or its environment (i.e., the submitted parameters)
  • e represents the error type that can be produced
  • v represents the ‘view’ of the form (how it’s represented to the user)
  • a represents the actual evaluation of a form (for example, in a registration form, it might be data Registration = Registration Text ByteString)

So my argument is that it might be better to describe it as Form m input err view result -> Validator m err result -> Form m input err view result; this way, it’s clearer exactly what each type variable corresponds to.
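To make the renamed signature concrete, here is a toy sketch that typechecks on its own. The Form and Validator definitions below are simplified stand-ins I made up for illustration, not the real digestive-functors types, and ageForm and nonNegative are hypothetical examples; only the shape of the validate signature comes from the discussion above.

```haskell
import Data.Functor.Identity (Identity (..))

-- Hypothetical, simplified stand-ins (NOT the real digestive-functors types).
-- A form reads some input and yields a view plus either errors or a result.
newtype Form m input err view result = Form
  { runForm :: input -> m (view, Either [err] result) }

-- A validator inspects a result and returns any errors it finds.
newtype Validator m err result = Validator
  { runValidator :: result -> m [err] }

-- The signature now says what each position is for.
validate
  :: Monad m
  => Form m input err view result
  -> Validator m err result
  -> Form m input err view result
validate form validator = Form $ \input -> do
  (view, outcome) <- runForm form input
  case outcome of
    Left errs -> return (view, Left errs)
    Right result -> do
      errs <- runValidator validator result
      return (view, if null errs then Right result else Left errs)

-- Hypothetical example: input, errors, and view are Strings; the result is an Int.
ageForm :: Form Identity String String String Int
ageForm = Form $ \raw -> Identity
  ( "age: " ++ raw
  , case reads raw of
      [(age, "")] -> Right age
      _           -> Left ["not a number"]
  )

nonNegative :: Validator Identity String Int
nonNegative = Validator $ \age ->
  Identity (if age < 0 then ["age must be non-negative"] else [])
```

Even in a toy like this, the descriptive names carry into the record fields and the concrete instantiation: reading Form Identity String String String Int against input, err, view, result tells you which String is which.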

Again, I’m not saying that single-letter type variables should be completely banned; something like (>>=) :: (Monad m) => m a -> (a -> m b) -> m b is simple enough. The conventions that a and b stand for the two type variables in a generic function and that m means ‘some arbitrary monad’ are strong enough that it’s like using i in a for loop in Python or whatever.
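For comparison, here is one plausible definition of the ifM combinator mentioned earlier (an assumption on my part; libraries that export an ifM may define it differently). With only m and a in play, the short names cost the reader nothing, and the descriptive naming moves to the value-level arguments instead.

```haskell
-- One possible definition of ifM (a sketch; real libraries may differ).
-- Two type variables, both with conventional meanings: m is the monad,
-- a is the result type shared by both branches.
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM condition thenAction elseAction = do
  ok <- condition
  if ok then thenAction else elseAction
```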

(Full disclosure: yes, lots of my own Haskell code uses single-letter type variables in places where it shouldn’t. I do intend to fix this.)

But if you’re trying to learn some new framework or module, imagine how much nicer it would be if the types that the Haddock gives you actually told you something about what it’s doing (especially given that most Haskell modules seem to be lightly documented outside of their Haddock documentation, lacking example use cases). Programmers outgrew using single-letter variable names decades ago; we shouldn’t make the same mistake in the type system. This is part of the reason why I’m making such slow progress learning the Snaplet framework in the dev version of snap; it’s hard for me to get an intuition as to what the types are when I keep seeing Handler b v a (b is the base app/snaplet, v is what’s being ‘focused on’, and a is what the Handler ‘evaluates to’; Handler b v has a Monad (and Functor, and MonadPlus, etc.) instance).

By the way, the Snaplet framework, though slightly complicated, does wind up making a good deal of sense once you get used to it; I plan on writing up a post about what the commonly-used types/type constructors 'mean', as well as a post about how to make a very simple login Snap app. And I think, ultimately, documentation is one of the things Haskell sorely needs; not just more documentation on what individual functions do, but how those functions combine into a larger whole. Imagine trying to learn Django just by looking at the module documentation without having the benefit of the tutorial! But I think that's a subject for another post; I'm trying to get this blog active again, after all.

5 Responses

  1. Bertrand Russell

     /  December 2, 2011

    Great post and awesome idea. Hopefully everyone reads this and makes retroactive changes to their libraries :)

  2. anders_

     /  December 2, 2011

    I agree with this, but one minor quibble: I’d probably say that the `a`/`b` at the end of a type for the result is also pretty obvious, and can probably be left alone, just as `m` for monads. Otherwise, spot on.

    (Something that would also be nice in the language as a whole would be the ability to label individual arguments for more documentation. Haddock could just display the arg names from the definition in some cases, of course, but then that wouldn’t work if there were multiple equations. Or any non-single-variable patterns at all, for that matter…)

  3. Alternately, you could just use type families to reduce the need for random variables.

  4. Tyler Eaves

     /  December 5, 2011

    How about ‘x’ for container items? (As an extension of the x:xs pattern matching style)

  5. Lemming

     /  December 13, 2011

    In ‘Monad m => m a’, I would call ‘m a’ the monadic action and ‘a’ the monadic result.
