My First Haskell Project - Part 1
2018-12-26
I have not always found it clear how to get started with Haskell, and I have a small library I want to make as an experiment, so I figured I'd document the process as a sort of Getting Started. The tiny library I wish to make is for reading environment variables, which can be used for database connections or similar. I use macOS but will try not to make anything too platform-specific, and will link to docs for other systems where possible.
First, let's assume we have nothing Haskell-based on the system whatsoever, and start by installing Stack.
On macOS this is as easy as running
brew install stack
but for other systems you will need to look at the Stack docs.
Assuming that's all gone swimmingly, let's start a new project, which we will call simple-env.
stack new simple-env
This will create a new folder called simple-env containing the following:
.gitignore
ChangeLog.md
LICENSE
README.md
Setup.hs
app
package.yaml
simple-env.cabal
src
stack.yaml
test
Looks great. There are two files here that describe our project: package.yaml, which is the one we edit, and simple-env.cabal, which Stack generates from it (via hpack) for Cabal to read. Don't make the mistake I made and change the .cabal file directly - instead use package.yaml as the source of truth and allow Stack to auto-generate sensible .cabal files.
Here is our default package.yaml file:
name: simple-env
version: 0.1.0.0
github: "githubuser/simple-env"
license: BSD3
author: "Author name here"
maintainer: "example@example.com"
copyright: "2018 Author name here"
extra-source-files:
  - README.md
  - ChangeLog.md
# Metadata used when publishing your package
# synopsis:            Short description of your package
# category:            Web
# To avoid duplicated efforts in documentation and dealing with the
# complications of embedding Haddock markup inside cabal files, it is
# common to point users to the README.md file.
description: Please see the README on GitHub at <https://github.com/githubuser/simple-env#readme>
dependencies:
  - base >= 4.7 && < 5
library:
  source-dirs: src
executables:
  simple-env-exe:
    main: Main.hs
    source-dirs: app
    ghc-options:
      - -threaded
      - -rtsopts
      - -with-rtsopts=-N
    dependencies:
      - simple-env
tests:
  simple-env-test:
    main: Spec.hs
    source-dirs: test
    ghc-options:
      - -threaded
      - -rtsopts
      - -with-rtsopts=-N
    dependencies:
      - simple-env
If you're familiar with the JavaScript world, this isn't a million miles away from a package.json file.
We aren't going to need any new libraries for our project, but it seems sensible to explain how that's done. Let's install the contravariant package, because why not.
Let's add it to here:
dependencies:
  - base >= 4.7 && < 5
so we have
dependencies:
  - base >= 4.7 && < 5
  - contravariant
We don't mind what version in this case - Stack will choose us a sensible one that fits with our other dependencies, that's what it's for.
Let's run
stack build
and watch the action.
The first time you run this on any given project, Stack will download the GHC compiler and all the libraries so you may wish to take a break and go and read War and Peace or something. After the initial wait subsequent builds will be very quick, but this one is a bit of a stinker.
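Once the build has finished, modules from the new dependency can be imported like any other. As a rough sketch of what that looks like, here is a tiny use of Data.Functor.Contravariant (the module the contravariant package provides on older GHCs; newer versions of base ship it directly). The names isEven and hasEvenLength are made up for illustration and are not part of our library.

```haskell
import Data.Functor.Contravariant (Predicate (..), contramap)

-- A predicate on Ints (hypothetical example name).
isEven :: Predicate Int
isEven = Predicate even

-- contramap adapts the *input* of the predicate: a String is first
-- turned into its length, and that length is then checked by isEven.
hasEvenLength :: Predicate String
hasEvenLength = contramap length isEven
```

Loading this file in stack ghci and evaluating getPredicate hasEvenLength "ab" is a quick way to confirm the dependency resolved.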
Ok. Great, we have filled our hard drive with nonsense and we are ready to Haskell.
What else have we got in this folder?
Firstly, we have the app folder which contains one file, Main.hs. This is the entry-point to our application, and looks like this:
module Main where
import Lib
main :: IO ()
main = someFunc
When a Haskell program is run the main function in a module called Main is run, and then it is responsible for everything else that happens. Therefore we can deduce that this program is importing someFunc from somewhere and running that. Seems plausible. Let's run it and see what happens.
If we look back in package.yaml we have a section that looks like this:
executables:
  simple-env-exe:
    main: Main.hs
    source-dirs: app
    ghc-options:
      - -threaded
      - -rtsopts
      - -with-rtsopts=-N
    dependencies:
      - simple-env
What is it telling us? Well, a bunch of things, but two that stick out.
- Firstly, yes, our assumptions were correct - our main source directory is app and the main file is Main.hs.
- Secondly, that our executable file is called simple-env-exe
Let's run it then!
stack exec simple-env-exe
It should just print "someFunc" to the screen and exit, which is admittedly quite underwhelming.
We can do better than this.
So it looks like someFunc is a function in src/Lib.hs. The whole file looks like this. Let's start work in here.
module Lib
    ( someFunc
    ) where
someFunc :: IO ()
someFunc = putStrLn "someFunc"
Our library is going to extract environment variables so they can be used in programs. This is helpful for stuff like database credentials that we don't want to save in version control.
Here is the MVP version:
module Lib
    ( someFunc
    ) where
import           System.Environment (lookupEnv)
someFunc :: IO ()
someFunc = do
    testValue <- lookupEnv "TEST_VALUE"
    putStrLn (showResult testValue)
showResult :: Maybe String -> String
showResult maybeValue = case maybeValue of
    Just value -> "TEST_VALUE" ++ ": " ++ value
    _          -> "TEST_VALUE could not be found!"
Let's take this apart a bit.
import           System.Environment (lookupEnv)
Firstly, we have an import. We are using the lookupEnv function from System.Environment. If we look in the docs for it in Hackage we can see it has the following type signature:
lookupEnv :: String -> IO (Maybe String)
This means we need to give it a String (the name of the environment variable we wish to check for) and it will return a Maybe String: Just the value if the variable is set, or Nothing if it isn't. So if it can't find anything it won't explode, which is handy. However, that Maybe String is wrapped in an IO. What does this mean?
It means that this function needs to interact with the outside world, therefore it must also be run from another IO function using bind. Think of IO like the electricity that allows access to side-effects and the outside world. It's where the problems are most likely to come from, so Haskell forces us to be very explicit about where it is and, more importantly, where it isn't. Our main functions have access to IO, and they must pass the IO-ness to wherever else needs it.
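To make the "where it is and where it isn't" point a bit more concrete, here is a minimal sketch. The names describe and report are hypothetical, not part of the project. describe has no IO in its type, so it cannot touch the environment; report does, so it can call lookupEnv, and its result stays wrapped in IO.

```haskell
import System.Environment (lookupEnv)

-- Pure: no IO in the type, so this cannot read the environment.
-- It only works with the values it is handed.
describe :: String -> Maybe String -> String
describe name (Just value) = name ++ " is set to " ++ value
describe name Nothing      = name ++ " is not set"

-- IO: allowed to call lookupEnv. fmap applies the pure function to the
-- result, but the answer is still an IO String, not a plain String.
report :: String -> IO String
report name = fmap (describe name) (lookupEnv name)
```

Anything that wants to use the String produced by report has to live in IO as well, which is the "passing the IO-ness along" described above.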
Opinionated note: IO is one of the more complicated parts of Haskell, and it's somewhat unfortunate for adoption of the language that we have to deal with it in the first lines of any program. I would wager that this is why so many tutorials and books start with hacking in the ghci repl instead, as it saves having to have The IO Chat. I am a firm believer in uncomfortable truths, more so if they are explained terribly like this, but if you are starting out, feel utterly free just to accept this part is a bit weird, learn to live with it, and come back to it in depth later. The IO concept really is quite a good thing, it just presents something of a Learning Kerb.
We also have a helper function called showResult.
showResult :: Maybe String -> String
showResult maybeValue = case maybeValue of
    Just value -> "TEST_VALUE" ++ ": " ++ value
    _          -> "TEST_VALUE could not be found!"
This function just takes our Maybe String and turns it into a String ready for us to print to the console. It uses pattern matching of the maybeValue to display either the result (value) that the environment variable was set to, or a fallback message. Therefore we should get:
showResult (Just "horses")
-- "TEST_VALUE: horses"
showResult Nothing
-- "TEST_VALUE could not be found!"
Pretty OK, huh?
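For comparison, here are two other ways the same helper could be written. This is only a sketch, and the names showResultClauses and showResultMaybe are invented here so they don't clash with the real showResult. The first matches on Nothing explicitly instead of using the _ wildcard; the second uses the maybe function from the Prelude, which takes a fallback and a function to apply to the value inside a Just.

```haskell
-- Pattern matching in the function clauses, naming Nothing explicitly.
showResultClauses :: Maybe String -> String
showResultClauses (Just value) = "TEST_VALUE: " ++ value
showResultClauses Nothing      = "TEST_VALUE could not be found!"

-- The same logic using maybe :: b -> (a -> b) -> Maybe a -> b.
-- First argument: what to return for Nothing.
-- Second argument: what to do with the value inside a Just.
showResultMaybe :: Maybe String -> String
showResultMaybe = maybe "TEST_VALUE could not be found!" ("TEST_VALUE: " ++)
```

All three versions behave the same; which one reads best is mostly a matter of taste.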
OK, lastly the glue function, someFunc.
someFunc :: IO ()
someFunc = do
    testValue <- lookupEnv "TEST_VALUE"
    putStrLn (showResult testValue)
What's the deal here? OK. So firstly, the do - it says that we're starting some do notation, which allows us to write in a slightly more imperative style. Like the IO concept in general it's a bit of a heavy concept to throw at beginners in the first few lines, so let's also make a mental note to just nod and come back to the concept later. Again, it is a pretty neat thing, but confusing to start with.
Moving on, this line is more interesting:
testValue <- lookupEnv "TEST_VALUE"
So as discussed earlier, our lookupEnv takes a String (in this case "TEST_VALUE") and returns a Maybe String wrapped in IO. The <- pulls it out of the IO so that, in effect, this line binds testValue to a value of type Maybe String. If lookupEnv finds an environment variable called "TEST_VALUE" then testValue will be Just "whatever_the_value_was", but if it doesn't, it'll be Nothing. Note that we were only able to use the lookupEnv function at all because someFunc is also an IO function, so in effect we have passed the IO power along to lookupEnv to let it do its magic.
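If it helps to see what the do and the <- stand for, here is a small sketch of the same shape written both ways. The names render, messageDo and messageBind are hypothetical, and the functions return the message rather than printing it, just to keep them easy to compare. The do version and the >>= (bind) version should mean the same thing.

```haskell
import System.Environment (lookupEnv)

-- Pure helper (hypothetical): turn a lookup result into a message.
render :: String -> Maybe String -> String
render name = maybe (name ++ " could not be found!") (\value -> name ++ ": " ++ value)

-- With do notation: <- names the Maybe String that lookupEnv produces.
messageDo :: String -> IO String
messageDo name = do
  found <- lookupEnv name
  pure (render name found)

-- The same thing with explicit bind: the lambda's argument plays the
-- role of the name on the left of <-.
messageBind :: String -> IO String
messageBind name =
  lookupEnv name >>= \found ->
    pure (render name found)
```

In both versions found is a plain Maybe String inside the block, but the function as a whole still has an IO type.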
Great stuff. We have done a thing. Now let's tell our wonderful user all about it.
putStrLn (showResult testValue)
OK. So this line just takes testValue (a Maybe String), uses showResult to turn it into a nice String that tells us what happened, and then uses putStrLn (put string line) to show it on the screen. putStrLn is another IO action, with this type signature:
putStrLn :: String -> IO ()
This means it takes a String, and then returns () (unit, a value that carries no information) inside an IO. Again, it must be run inside another IO context to get its "power", as such. Also, the fact it returns () is helpful - our someFunc is of type IO (), meaning it should also return nothing useful, and because putStrLn is the last line of the do block, its IO () becomes the result of the whole thing. Tidy.
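As a small illustration of that last point, here is a sketch with two made-up functions, printOnly and printAndCount. The type of a do block is the type of its final line, so ending on putStrLn gives IO (), while ending on something else gives a different result type.

```haskell
-- Ends with putStrLn, so the whole thing is IO ().
printOnly :: String -> IO ()
printOnly message = putStrLn message

-- Also prints, but the last line is pure (length message),
-- so the whole block is IO Int instead.
printAndCount :: String -> IO Int
printAndCount message = do
  putStrLn message
  pure (length message)
```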
Great stuff.
Let's try our program.
stack build
stack exec simple-env-exe
Assuming you've not got an environment variable called "TEST_VALUE" set you should see:
stack exec simple-env-exe
TEST_VALUE could not be found!
Let's set one (assuming you're in Bash or Zsh):
export TEST_VALUE="horses!"
...and run the program again.
stack exec simple-env-exe
TEST_VALUE: horses!
Great stuff. It's not much of a library, but it's a start. Next time we'll make a nicer API for fetching multiple variables at once so it's actually a bit more helpful for our users.
Make sense? If not, why not get in touch?