I have been using Haskell’s Servant library to specify a rather large API. There are over 100 endpoints, and some records have dozens of fields.

Certain aspects of Servant have been great for reducing the overhead of developing a large API. Servant does not solve everything, though, and I found that data representation became a nontrivial portion of the code.

After playing around with a few different ways of reducing boilerplate for certain operations I found one method that I like more than the others.

API PUT And PATCH

Let's say we have a blog post representation inside our API.

data Post = Post
  { title :: Text
  , body  :: Text
  }

HTTP provides two verbs for updating an existing resource: PUT and PATCH. A PATCH operation allows partial updates of the resource, for example just a post's title. A PUT operation requires a total replacement of the object. Clearly, the data type above, in which every field is required, cannot model both of these cases.

We want a way to easily deal with both cases without substantial code duplication. This includes validation, serialization, parsing, etc. These two endpoints are almost identical, and it would be nice if they could share the majority of their implementation.

A naive approach would involve just making a second type and writing every related function twice. While things like serialization can usually be conveniently generated automatically, complex form validation is more difficult. For trivial types the duplication may be inexpensive, but as records grow larger it becomes a burden.

Existing Template Haskell

The Haskell Wiki even provides an example of wrapping the fields of a record in a Maybe type with Template Haskell. In this case mkOptional is defined elsewhere.

mkOptional ''Post

Producing

data Post_opt = Post_opt
  { title_opt :: Maybe Text
  , body_opt  :: Maybe Text
  }

This gives us one totally separate type for each case. We either need more Template Haskell to generate the required instances and validation code, or we now have more manual work to do.

Template Haskell can be great for reducing boilerplate but it has its downsides. A similar result can be accomplished without TH code generation.

Higher Kinded Types

Both cases (and others) can be modeled with a single higher-kinded type.

data Post a = Post
  { title :: a Text
  , body  :: a Text
  }

When we want to use this type for an endpoint we just parameterize it with whatever specific case we are going for.

import Data.Functor.Identity

-- All fields required
put :: Post Identity -> IO ()
put post = undefined

-- All fields optional
patch :: Post Maybe -> IO ()
patch post = undefined

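The Identity here represents that a field is required. It is essentially just a newtype wrapper. Because every field is now wrapped in a type constructor (higher-kinded data, or HKD), we have to supply some wrapper even when a field is always present, and Identity is the trivial one.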
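To make the sharing concrete, here is a minimal sketch of code written once against the parameterized type. The names titleErrors and applyPatch, and the non-empty-title rule, are made up for illustration and are not part of any library. The validator only assumes the wrapper is Foldable, which holds for both Identity and Maybe.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Foldable (toList)
import Data.Functor.Identity (Identity (..))
import Data.Text (Text)
import qualified Data.Text as T

data Post f = Post
  { title :: f Text
  , body  :: f Text
  }

-- Hypothetical rule: a title that is present must not be empty.
-- Foldable covers both wrappers: Identity always holds one value,
-- Maybe holds zero or one, so absent fields are simply skipped.
titleErrors :: Foldable f => Post f -> [String]
titleErrors post =
  [ "title must not be empty" | t <- toList (title post), T.null t ]

-- Overlay a partial update on a complete post, keeping the old value
-- wherever the patch leaves a field as Nothing.
applyPatch :: Post Maybe -> Post Identity -> Post Identity
applyPatch p old = Post
  { title = pick (title p) (title old)
  , body  = pick (body  p) (body  old)
  }
  where
    pick new current = maybe current Identity new
```

With this shape a PATCH handler could validate the incoming Post Maybe, apply it to the stored Post Identity, and reuse the same validator on the result.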

Example Instance

Manual

Here is an example of a manual Aeson serialization instance.

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)  -- import only the type to avoid clashes with Prelude names
import Data.Aeson

instance (ToJSON1 a) => ToJSON (Post a) where
  toJSON post = object
    [ "title" .= toJSON1 (title post)
    , "body"  .= toJSON1 (body  post)
    ]

Now we have serialization for Post Maybe, Post Identity, and any other case we may end up needing. Of interest here is the ToJSON1 requirement.

This is part of the machinery required for working with higher kinded types in a generic way. Without this you would have to define a ToJSON instance for every single case you want to use.
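As a rough usage sketch, the same instance can be exercised with both wrappers. It assumes the aeson and text packages are available. The names fullPost and partialPost are invented for this example. Note that with this instance a Nothing field is encoded as JSON null rather than being left out.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson
import Data.Functor.Identity (Identity (..))
import Data.Text (Text)

data Post f = Post
  { title :: f Text
  , body  :: f Text
  }

-- One instance for every wrapper that knows how to lift encoding.
instance ToJSON1 f => ToJSON (Post f) where
  toJSON post = object
    [ "title" .= toJSON1 (title post)
    , "body"  .= toJSON1 (body  post)
    ]

-- Identity is transparent: each field encodes as a plain string.
fullPost :: Value
fullPost = toJSON (Post (Identity "Hello") (Identity "World"))

-- Maybe encodes Just as the inner value and Nothing as null.
partialPost :: Value
partialPost = toJSON (Post (Just "Hello") Nothing)
```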

Automatic Generic

If you want automatically generated generic instances for simple things like serialization, the most convenient way is to monomorphize the type parameter a, that is, fix it to a concrete wrapper, and use the standard derivation system.

{-# LANGUAGE DeriveGeneric #-}

import GHC.Generics

instance ToJSON (Post Maybe)
instance ToJSON (Post Identity)

Don’t forget to add deriving (Generic) to your type for the automatic GHC.Generics version.
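Putting the pieces together, a complete module might look like the sketch below. It assumes aeson's default generic options and adds FromJSON instances, which the text above does not cover. decodePatch and decodePut are hypothetical names used only to show how the two instantiations treat a missing key.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson
import Data.Functor.Identity (Identity (..))
import Data.Text (Text)
import GHC.Generics (Generic)

data Post f = Post
  { title :: f Text
  , body  :: f Text
  } deriving (Generic)

-- Empty instance bodies fall back to the Generic-based defaults.
instance ToJSON (Post Maybe)
instance ToJSON (Post Identity)
instance FromJSON (Post Maybe)
instance FromJSON (Post Identity)

-- A PATCH-style body may omit fields; they should come back as Nothing.
decodePatch :: Maybe (Post Maybe)
decodePatch = decode "{\"title\": \"New title\"}"

-- The same body is not a complete post, so decoding should fail here.
decodePut :: Maybe (Post Identity)
decodePut = decode "{\"title\": \"New title\"}"
```

FlexibleInstances is needed because the instance heads mention concrete type constructors (Post Maybe, Post Identity) rather than a type variable.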