A Brutal Introduction to Arrows

Imagine being killed by a bow and arrow. That would suck, an arrow killed you? They would never solve the crime. "Look at that dead guy. Let’s go that way." — Mitch Hedberg

I seem to be one of the few people who absolutely adores arrows. I thought it might be helpful if I provided some insight to the advanced-level newbies regarding the practical use of arrows in Haskell. Plenty has already been written about what arrows are in category theory, how to implement the Arrow typeclass, and how to use the special arrow syntax. I want to talk about why I occasionally wake up in the morning thinking, "Maybe I could solve this problem using arrows!"

This article is just to share an extremely simple, intuitive and concrete example of an arrow. I’m not going to get into all the crazy amazing things arrows can do. I just want to show that they can do at least one cool useful thing.

Arrows are closely related to monads. For example, both arrows and monads can be used to capture side-effecting operations.

Monads have the kind * -> *, indicating a side-effecting operation with a single output type, and a binding operation m a -> (a -> m b) -> m b. The input to a monadic operation is provided via a pure function. Arrows, on the other hand, have a kind * -> * -> *, indicating a side-effecting operation with a single input type and a single output type, and a binding operation a b c -> a c d -> a b d.

When binding monads, it’s obvious from the type signature that the subsequent side-effecting operation does not even exist until after the output of the previous operation becomes available. It cannot, because it depends on the previous output and because we can inject any pure function to construct the subsequent operation based on completely arbitrary criteria.

When binding arrows, the subsequent side-effecting operation exists before execution begins.

All monads are arrows, but not all arrows are monads, and this observation is pertinent to their implementation in Haskell, but I’m going to restrict this article’s discussion to "interesting arrows," specifically, arrows that aren’t monads because they don’t implement ArrowApply. Interesting arrows are basically monads without flow control: you can’t generally choose what side-effecting actions to perform based on things you learn during the execution of the arrow.

This is why arrow-notation creates two scopes. Between the <- -< symbols, only values that were in scope before execution of the Arrow are in scope. Outside the <- -<, values that appear during the execution of the Arrow are also in scope.

For example, we (you and I) might have a monad that allows us to perform certain dangerous operations, like overwriting files. In a monadic context, we can not anticipate what any particular instance of the monad will choose to do. We might write a very complicated installation script that accesses many files. Do we have permission to write to all of those files? Do the files we want to read even exist? Do we access an infinite number of files?

Do we ever write to /dev/nuclear_missiles?

We would like to know the answers to these questions before running the installation script, otherwise we could be interrupted and leave the system in a chaotic state. Even if we could recover from an error, we would still be wasting the user’s time, which is the opposite of the thing that computers are for.

But if we implemented our IO environment using an arrow, we could anticipate all of the side-effecting operations, even have a list of files to be overwritten before the operation begins.

In our new file IO arrow, it will be impossible to read the name of a file from a file, and then write to that file dynamically, because all file names must be specified at the time the arrow is bound. That’s a pretty onerous restriction, but we can always add new operations later, if we need them.

Our arrow needs a list of accessed files and an IO action. The list of file paths is going to take the form of a monoid, while the sequence of IO actions will take the form of a Kleisli arrow.

data IORWA a b = IORWA [FilePath] (a -> IO b)

We need a category instance.

instance Category IORWA where

Implementing id is easy. Id accesses no files, so we give an empty file list. Return is the simplest monadic action that type checks.

id = IORWA [] return

The bind operation requires that we concatenate two lists of file paths, and bind the IO actions. (This is a little annoying, but note that the (.) operator specifies the preceding action second and the subsequent action first.)

(IORWA sa actionA) . (IORWA sb actionB) = IORWA (sa ++ sb) (\x -> actionB x >>= actionA)

At this point, we only have a category, which has only slightly more sophistication than a monoid. To get our tricky arrow syntax, we need to implement arr and first.

instance Arrow IORWA where

Arr, like id, doesn’t touch any files, so we provide an empty file list and inject the function into the arrow.

arr f = IORWA [] $ return . f

First is a little tricky. We’re accepting a side-effecting operation as a parameter, so we need to preserve that operation’s file list.

first (IORWA s action) = IORWA s $ \(x,k) -> do { x' <- action x; return (x',k) }

And we need read/write operations, in which we simply pack the file path parameter into the file list. Notice that we take the file path as a static parameter, but we take the data to write as an input to the arrow.

writeFileA :: FilePath -> IORWA String ()

writeFileA path = IORWA [path] $ \s -> writeFile path s

readFileA :: FilePath -> IORWA () String

readFileA path = IORWA [path] $ \_ -> readFile path

Using our arrow is as simple as exporting a function to the accessed file list and the IO action, as long as we refuse to export any way to corrupt the synchronization between the two fields.

Looking at Control.Arrow and the arrows library on hackage, there are a few things we could add to all this:

  • We could implement ArrowChoice. This would allow us to choose at runtime between accessing two different sets of files. Both possibilities would appear in our static accessible file list, but only one would actually be accessed.
  • We could use a modified ReaderArrow to capture rewriting rules for file paths, e.g., to specify a current working directory. We can’t use ReaderArrow directly, because it would route information through the monadic component of the computation.
  • We could use a WriterArrow to retain a log of all of the data we actually write.
  • We could use an ErrorArrow to recover from file system errors.
  • We could implement ArrowLoop based on the MonadFix instance of IO.
  • We could use the Automaton arrow to implement multi-phase read/write cycles. Perhaps the first phase would be read-only, then we could check the file list again before proceeding to the second phase.
  • We could re-implement what we just wrote in terms of the StaticArrow and Kleisli arrows,
    and get a metric ton of the above for free.

To play with this example, you could always git clone http://www.downstairspeople.org/git/IORWA.git.