Functional IO in Scala with Scalaz

Last time we posted on Scalaz we looked at it from 30,000 feet, exploring the generic concepts that make up the library’s foundation. This time, we will focus on a core concept that Scalaz gives us an abstraction for: Functional IO! Functional IO does not originate with Scalaz; it comes from the functional programming paradigm, and Scalaz gives us an implementation of it. We will be exploring both the concept and the implementation in this post.

When I was first introducing myself to Functional Programming I had a hard time wrapping my head around how one would limit side-effects yet still get stuff done in the real world. Working primarily in the web application and services space, I knew that every application I built would perform an abundant amount of IO (read: side-effects). Because of this, at first, I had a hard time understanding the practical applications & benefits of functional programming. Many introductory talks on functional programming gloss over the idea of IO, touching on the fact that it’s possible and done regularly, and not much else. My goal in this post is to address how we can perform IO in the functional paradigm and hopefully convince you to give it and other parts of Scalaz a try as well.

Going Pure

So how do we write functions that perform side-effects yet are still pure? That question is downright contradictory but, like I said above, it was one of the first questions I asked. The problem is that we are asking the wrong question. Let’s rephrase it a bit: “how do we write programs with pure functions yet still perform side-effects?” See the subtlety? We aren’t asking how our functions perform side-effects but how our programs, composed of pure functions, are able to perform side-effects. This subtle difference makes it very easy to see the simple trick behind how Functional IO works. We do not write functions that perform IO actions; we write functions that build up an action that, when explicitly executed, will perform the real IO. When we call a function that returns an IO action, no operations are actually performed. We must explicitly do something with the action to “run it”. This allows our functions to still be pure: given the same inputs, they will always output the same IO action. In my opinion this is a pretty intuitive concept: we are “configuring how to run a side-effecting action”. As we will see in this post, it is actually a pretty familiar concept as well.
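
To make the trick concrete, here is a minimal sketch of the idea in plain Scala. It does not use Scalaz at all: Action, delay, log and greet are hypothetical names invented for this illustration, standing in for IO and the code that builds it.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for IO[A]: it only stores a thunk describing the effect.
final class Action[A](thunk: () => A) {
  // Explicitly perform the described side effect and return its result.
  def run(): A = thunk()
}

// Builds an Action without evaluating `body` (the parameter is by-name).
def delay[A](body: => A): Action[A] = new Action(() => body)

// Records what has really happened, so execution can be observed.
val log = ListBuffer.empty[String]

// Pure: calling greet only builds a description of the effect.
def greet(name: String): Action[Unit] = delay { log += s"hello, $name"; () }

val action = greet("world") // nothing has been logged yet
val before = log.toList     // still empty
action.run()                // the side effect happens here
val after = log.toList      // now contains the greeting
```

Calling greet any number of times leaves log untouched; only run() changes it, which is the role unsafePerformIO plays for the real IO.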

In Scalaz, we represent this idea using the IO[A] trait, in scalaz.effects. You must import this package on top of scalaz._ and Scalaz._ to use IO. I’ve been playing a little fast and loose with the term IO and the Scalaz IO construct up until now, so let’s formalize that quickly. From now on I will be referring to the Scalaz construct for working with input-output actions in a functional manner as IO or IO[A]. If I mention the concept of performing input-output actions generally I will explicitly call them as such.

Our First IO

IO gives us everything we need to construct our input-output actions, but first, we must know how to take our normal functions that perform input-output and make them into an IO. We do this by lifting the function into the context of an IO action. The end result is an IO wrapping our function without yet executing it. In Scalaz we can lift a value into a context using the pure[T] method defined in the Identity pimp. For IO this looks something like:

// remember to import scalaz.effects._
scala> val hiWorld = println("hello, world").pure[IO]
hiWorld: scalaz.effects.IO[Unit] = scalaz.effects.IO$$anon$2@77925ae

Notice how nothing was printed to the console. This is because we have not executed the action yet. Instead, the entire expression returned a value — an IO[Unit] to be specific. The type parameter, A, in IO[A] represents the result type of the input-output action. In the case of println this is Unit but it could be anything. Maybe a record from a database or even the result of a shell command. Once we have an IO[A] we can treat it like any other value. We can pass it around or build new IO[A] from the original. Like any good functional data structure, IO is immutable. When we want to “modify it” we build new actions while the original is persisted.
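
The idea of building new actions while the original is persisted can be sketched with a home-grown stand-in type. This is not Scalaz's IO: Action, delay, readRaw and readNumber are hypothetical names, and the map shown here is only an assumption about the behaviour described above.

```scala
// Hypothetical stand-in for IO[A]; not the Scalaz implementation.
final class Action[A](thunk: () => A) {
  def run(): A = thunk()
  // Returns a new action; `this` is left untouched and nothing runs yet.
  def map[B](f: A => B): Action[B] = new Action(() => f(thunk()))
}

def delay[A](body: => A): Action[A] = new Action(() => body)

var reads = 0 // counts how often the "input" is really performed

val readRaw: Action[String] = delay { reads += 1; "42" }
val readNumber: Action[Int] = readRaw.map(_.toInt) // a new value built from the original

val readsBeforeRun = reads // no read has been performed so far
val n = readNumber.run()   // performs the read, then converts the result
val raw = readRaw.run()    // the original action still yields the raw String
```

Both actions can be passed around and run independently, because map never modifies the action it is called on.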

To execute our newly minted action we call unsafePerformIO. This well-named method does exactly what it says. It performs the input-output action and is totally unsafe. There could be exceptions thrown, or you could be calling “rm -rf /” or launching a nuclear missile.

scala> hiWorld.unsafePerformIO
hello, world

IO cannot stop buggy code from removing all the files on your machine or accidentally launching a nuclear missile; it just gives you the control to build up your entire workflow around an input-output operation including things like how to handle the result (maybe you want to transform it). Also, because the possibility of exceptions in side-effecting code is a fact of life, you can also “configure” what happens when you call unsafePerformIO on an action and an exception is thrown.

To see how it all works, let’s look at a real world example using IO. You may not be familiar with all the functions used in the examples at first, but we will address them as we go.

IO in the Wild

At StackMob, one database we make use of is Riak. The great guys at Basho provide an open-source Java Client that has a wonderful high-level model for working with data stored in Riak, but like using most Java libraries from Scala, it is always more pleasant if we have a nice wrapper. We decided to write such a wrapper, recently, that follows the same high-level model as its Java counterpart but with a bit of a functional & Scala twist. As a result, the code makes heavy use of Scalaz’s IO. You will not need to know anything about Riak really to follow along, except for maybe that the client lets us fetch, store and delete key-value pairs in Riak. If you aren’t familiar with this awesome database, it’s worth reading about. Additionally, the code shown here (excluding the previously mentioned open-source Java client) has not been open-sourced — it’s simply not in a state that is ready to be released — but hopefully will be in the future since all of us at StackMob are big believers in using and contributing to open-source software.

As I mentioned above, in my opinion, using IO is a fairly intuitive and familiar concept. This really hit home for me when I began using the Riak Java Client and researching how we might adapt it to Scala. One of the core pieces of the Riak Java Client codebase is the very simple interface RiakOperation<T>. It looks like this:

//https://github.com/basho/riak-java-client/blob/master/src/main/java/com/basho/riak/client/operations/RiakOperation.java
public interface RiakOperation<T> {
  T execute() throws RiakException;
}

If we look at part of the definition of IO[A] and unsafePerformIO the story is almost identical (I’ve roughly translated the above Java code to Scala for easier comparison):

sealed trait IO[A] {
  def unsafePerformIO: A = ...
}

trait RiakOperation[T] {
  @throws(classOf[RiakException])
  def execute: T
}

Those are only different, pretty much, in the choice of names. The fact that in the translated RiakOperation[T] the trait is not sealed is an irrelevant detail. The other subtle, but important, difference is that IO is a generic, unified interface we can use for any type of input-output operation we want while RiakOperation is a specialized interface for making requests to Riak. We see the benefit of these highly generic structures all over Scala. For example, we use functions as arguments so that we don’t need to build specialized interfaces into our code. You don’t have to implement some imaginary FoldFunction interface to call foldLeft on a list, you just pass foldLeft a function. We also use tuples to group related values instead of grouping them using a custom sub-class. With IO, we have a generic interface for passing around and working with code that will perform side-effects in the real world.

In the Java client RiakOperation<T> is implemented several times including StoreObject, FetchObject and DeleteObject. We obtain an instance of the operation, configure it and execute it. The operation uses the underlying HTTP or Protobufs client to submit the request. In our Scala version, we simply use IO to do this. The other thing to note is that RiakOperation<T> instances are often mutable where IO is not.

An example of constructing a RiakOperation<T> and executing it:

val deleteSomeKey: DeleteObject = bucket.delete("some-key")
deleteSomeKey.execute() // actually makes the request to delete the key

Looking back at our small console session with IO, that looks pretty familiar. The execution, as we saw from the definition, is just calling a method by a different name.

So how do we implement something like fetching data from Riak using IO? The Java implementation using RiakOperation is below (note: some implementation details in both the Java and Scala code have been left out or simplified since they are pertinent to the implementation of the client and not how to use IO).

// https://github.com/basho/riak-java-client/blob/master/src/main/java/com/basho/riak/client/operations/FetchObject.java
public T execute() throws UnresolvedConflictException, RiakRetryFailedException, ConversionException {
  // fetch, resolve
  Callable<RiakResponse> command = new Callable<RiakResponse>() { // 1
    public RiakResponse call() throws Exception {
      return client.fetch(bucket, key, builder.build()); // setup call via underlying client
    }
  };
  rawResponse = retrier.attempt(command); // 2
  final Collection<T> siblings = new ArrayList<T>(rawResponse.numberOfValues()); // 3

  for (IRiakObject o : rawResponse) { // 4
    siblings.add(converter.toDomain(o));
  }
      
  return resolver.resolve(siblings); // 5
}

The execute code follows a similar pattern to how we deal with IO. At #1 (// 1) an input-output action is being set up. At #2 the action is performed and in #3 through #5 the result is processed and finally returned.

Here is how we might do this using IO:

def rawFetch(key: String): IO[RiakResponse] = {
  val emptyFetchMeta = new FetchMeta.Builder().build() // how this is really built is unimportant
  rawClient.fetch(name, key, emptyFetchMeta).pure[IO] // 6, rawClient is equivalent to client above
}

def fetch[T](key: String): IO[Validation[Throwable, Option[T]]] = {
  (rawFetch(key) map {
     riakResponseToResult(_) // 7
  }) except { t => t.fail.pure[IO] } // 8
}

def riakResponseToResult[T](r: RiakResponse): Validation[Throwable, Option[T]] = ...

Both functions shown above return an IO[A]. We will get to why we have two later. The A in the rawFetch function is a RiakResponse, the raw result of our IO operation. In the fetch function, A is Validation[Throwable, Option[T]]. If you are not familiar with Validation from Scalaz, you can read the return type, for the purposes of this post, as IO[Either[Throwable, Option[T]]] instead.

The first IO action we construct on the way to our final one is the actual input-output, the request to Riak in this case. We see, at #6, the familiar pure[IO] like we used before, but this time to lift the function that performs the request. At #7, we process the results just like we did in the Java version, but there is a huge difference! We haven’t actually performed the input-output like we did in the Java client (see #2). Instead, using map, we are deferring the processing function (riakResponseToResult) until the point in the future when we have actually called unsafePerformIO and the request to Riak has returned (without throwing an exception). This is the function that takes a RiakResponse from the IO action returned by rawFetch and turns it into a Validation[Throwable, Option[T]], which is then returned.

In the final line of the implementation, #8, we see one very simple way to configure exception handling using IO: the except function. If we call except on an IO[A] and pass it a function Throwable => IO[A], as the handler at #8 does, then any exception thrown when calling unsafePerformIO is passed to that function, and the result of the IO it returns is used instead. Some other useful exception handling functions include catchLeft, catchSomeLeft and bracket. You can check out the comments and implementation of all of them in the IO source.
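
As a rough sketch of what this kind of handler does, here is a hand-rolled except on a stand-in type. It is an assumption based on the behaviour described above, not Scalaz's implementation: Action, delay, failingFetch and safeFetch are hypothetical names, and Either is used in place of Validation.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for IO[A]; not the Scalaz implementation.
final class Action[A](thunk: () => A) {
  def run(): A = thunk()
  def map[B](f: A => B): Action[B] = new Action(() => f(thunk()))
  // If running this action throws, run the action built by the handler instead.
  def except(handler: Throwable => Action[A]): Action[A] =
    new Action(() => {
      try thunk()
      catch { case NonFatal(t) => handler(t).run() }
    })
}

def delay[A](body: => A): Action[A] = new Action(() => body)

// Hypothetical failing request standing in for a call to Riak.
val failingFetch: Action[String] =
  delay[String] { throw new RuntimeException("timeout") }

// Wrap successes in Right and route any exception into Left.
val safeFetch: Action[Either[Throwable, String]] =
  failingFetch
    .map[Either[Throwable, String]](Right(_))
    .except(t => delay[Either[Throwable, String]](Left(t)))

val result = safeFetch.run() // the exception is captured as a value, not thrown
```

The handler is only configuration until run() is called; at that point the exception becomes an ordinary value the caller can inspect.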

Composing that ish

Writing a Riak Client is a great way to explore the idea of composition and IO. In Riak, before we write data for a given key we must always read its value first. This is a perfect opportunity for code reuse and composition. In the Java version the StoreObject, which encapsulates performing writes, uses a FetchObject (whose execute method we just saw) instead of duplicating the logic. We want to do the same thing with IO, except that, as you’ll see, it’s even better.

Because IO is warm and fuzzy we get the awesome map & flatMap methods that allow us to process the results of a single IO action or chain together multiple actions. Chaining IO actions, fetch then store, is exactly what we want to do. This is how we might define the store function using this power via a for-comprehension:

def store[T](obj: T): IO[Validation[Throwable, Option[T]]] = {
  val emptyStoreMeta = new StoreMeta.Builder().build() // not important how this is really built
  val key = // how we get the key from the object is also just an implementation detail
  (for {
    resp <- rawFetch(key)
    fetchRes <- riakResponseToResult(resp).pure[IO] // fetchRes is a Validation[Throwable, Option[T]]
  } yield {
     fetchRes flatMap {
       mbFetched => {
         val objToStore = // removing some implementation details here
         riakResponseToResult(rawClient.store(objToStore, emptyStoreMeta)) // 9
       }
     }
  }) except { t => t.fail.pure[IO] }
}


In this function we reuse the rawFetch method from above, which returns an IO, lift our processing function riakResponseToResult, which returns a Validation[Throwable, Option[T]], into an IO, and compose the two. Subsequently, we flatMap over fetchRes to add the input-output operation to store data in Riak to the mix (#9). Notice we don’t lift the call to rawClient.store using pure[IO]. Inside of the yield we are in the context of IO and so the side-effect won’t be executed until we call unsafePerformIO on the IO returned as a result of the entire for-comprehension (and subsequent call to except).
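
The fetch-then-store chaining can be sketched end to end with a stand-in type and an in-memory map in place of Riak. Everything here is hypothetical and for illustration only: Action, delay, bucket, rawStore and increment are invented names, and this rawFetch is a simplified stand-in for the one shown earlier.

```scala
import scala.collection.mutable

// Hypothetical stand-in for IO[A]; not the Scalaz implementation.
final class Action[A](thunk: () => A) {
  def run(): A = thunk()
  def map[B](f: A => B): Action[B] = new Action(() => f(thunk()))
  // Chains a second action that depends on the result of this one.
  def flatMap[B](f: A => Action[B]): Action[B] =
    new Action(() => f(thunk()).run())
}

def delay[A](body: => A): Action[A] = new Action(() => body)

// In-memory map standing in for a Riak bucket.
val bucket = mutable.Map("k" -> 1)

def rawFetch(key: String): Action[Option[Int]] = delay(bucket.get(key))
def rawStore(key: String, value: Int): Action[Unit] = delay { bucket(key) = value }

// Read-then-write: fetch the current value, then store an updated one.
def increment(key: String): Action[Int] =
  for {
    current <- rawFetch(key)
    updated <- delay(current.getOrElse(0) + 1) // lift a pure step, like pure[IO] above
    _       <- rawStore(key, updated)
  } yield updated

val program = increment("k") // only a description so far
val untouched = bucket("k")  // the bucket has not been read or written yet
val stored = program.run()   // performs the fetch, then the store
```

The for-comprehension is just nested flatMap and map calls, so building program touches nothing; the read and the write both happen inside run().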

Where Next?

So IO is actually pretty simple and familiar. I saw a comment in the source for RiakOperation that really drives this point home, for me.

Just like {@link Callable}, hey, wait, maybe it *should* just be replaced with {@link Callable}

The RiakOperation is even more general. It’s just a Java Callable and we just showed how we can generalize it in a pure way using IO! If you’re throwing your hands up in the air saying “but I can run Callables concurrently!” fear not: this is more than possible using IO and Scalaz Promises or even certain actor implementations.

If Functional IO has piqued your interest a bit you are only at the beginning of the journey. In the grand scheme of things, reads and writes to a key-value store are a simple task to model in IO. There are many more difficult actions we could attempt to implement and in some cases we need other constructs, like the Iteratee, which is useful for things like iterating through a database cursor or lines in a file. If you’d like to read more, I highly suggest reading these posts on the subject, which provide some good insight into the potential pitfalls of IO, as well as this Haskell wiki page on IO in Haskell (whether or not you speak Haskell) and this introduction to iteratees.


If you like working for a company that fosters this sort of learning and exploration, and enjoy working on challenging problems whose solutions will benefit app developers, we’re hiring!
