by Carl Burch, Hendrix College, August 2012
We'll begin our study of Haskell by learning to write functions. For now, we'll equate the terms function and program. In most other languages, the notion of a program is not the same as the notion of a function. But we want to avoid this distinction because Haskell, as a purely functional language, cannot have simple input and output statements. Haskell does support input and output, but it's a more complex concept that we'll reach only much later. For now, we don't want to worry about it: We just want to talk about writing functions that will take a value and produce a value in response. And in this document, in fact, we'll only worry about functions that process numeric values.
To experiment with Haskell programming, I recommend that you get a Haskell interpreter such as Hugs (available at www.haskell.org/hugs/). There are compilers for Haskell that produce reasonably efficient machine code, but using them requires understanding input and output in Haskell, which as I said is a concept we'll tackle much later. With an interpreter, you can write your own functions in a file, load them up into the interpreter, and call them however you like.
We'll begin, though, without defining our own functions. We'll start with the functions already defined in the Haskell standard libraries (called the Haskell Prelude), and we'll learn how we can use these functions to build up more complex expressions. In contrast to imperative languages, where statements are the basic computational construct, in functional language the expression is the basic way of talking about computation. (Of course, imperative languages have expressions, too, but they're subsidiary to the statement. In functional languages, that's all we have.)
First we'll see how to use functions. For example, we might want to use the Prelude's sqrt function, which computes the square root of a floating-point value. To compute √5, for instance, we can simply type the following into the interpreter, and it will print back the return value.
sqrt 5.0 
returns 2.23606797749979 
Note that when we call a function in Haskell, we do not place parentheses around the parameter. Of course, we could if we wanted, but they would be redundant. Since Haskell programs use lots of functions, it's convenient to leave them out.
We can use arithmetic operators, as in the following.
sqrt (3 * 3 + 4 * 4) 
returns 5.0 
sqrt (3 ** 2 + 4 ** 2) 
returns 5.0 
(The ** operator represents exponentiation.) The parentheses around the argument in each example are necessary, because function application has higher precedence than the arithmetic operators. Leaving them out would be the same as writing (sqrt 3) * 3 + 4 * 4. Notice that Haskell does observe the order of operations among arithmetic operators: Exponentiation (**) precedes multiplication and division (* and /), which themselves precede addition and subtraction (+ and -).
Many functions take multiple arguments. Two examples are the max function for computing the maximum of two numbers and the rem function for computing the remainder of dividing two integers.
max 3.0 3.5 
returns 3.5 
rem 17 5 
returns 2 
Notice that we're again not using parentheses to surround the arguments. They would be necessary, however, if we want to use a function to compute the argument to another function.
max 3.0 (sqrt 5.0) 
returns 3.0 
Without the parentheses, the interpreter would parse this as an attempt to pass three arguments into max: 3.0, sqrt, and 5.0. The interpreter would complain that the expression doesn't typecheck: The max function expects two numeric arguments, but it is being given three arguments, with the second (sqrt) a function rather than a number.
Now suppose we want to define a function — maybe a square function that squares its argument. That's easy enough to do.
square x = x * x
This defines a function named square that takes a single parameter x.
Now suppose we load this function into our interpreter. (With Hugs, that would involve writing the above into a file with a text editor and then instructing Hugs to load the file.) We would then be able to use the function just as we can use the functions defined in the Prelude.
sqrt (square 3.0 + square 4.0) 
returns 5.0 
You may look at this definition of square and object that the parameter x looks suspiciously like a variable, which is exactly what we're trying to avoid here. But it is not a variable in one crucial sense: Its value, determined when x is created on entering the function, cannot change. That is, there is no way for us to vary its value, and so it is not truly vary-able. And in fact, functional programmers would not call x a variable — they would call it an identifier. (One could argue, though, that functional languages have a stronger claim to the term variable because they faithfully follow the longstanding term as mathematicians have used it for centuries, before imperative languages redefined it.)
If we want to define a function taking multiple parameters, then we simply list multiple identifiers to identify the different arguments. For example, the following computes the value for x under the normal probability distribution (i.e., a bell curve) with a mean of 0 and a standard deviation of σ (sigma).
normal sigma x = exp (-0.5 * square (x / sigma)) / (sqrt (2 * pi) * sigma)
This happens to use the exp function (computing e^{x}) and the pi identifier (representing π ≈ 3.14) defined in the Prelude.
Haskell allows you to define additional identifiers to represent helper values using a where clause. For example, we might instead define normal as the following.
normal sigma x = exp (-0.5 * square devs) / (factor * sigma)
  where devs   = x / sigma
        factor = sqrt (2 * pi)
When we do this, and when we have multiple identifiers to define, it's important that the definitions each start in the same column. Otherwise, the Haskell interpreter will not be able to tell where one definition stops and another begins. (Generally, white space is not important to Haskell, but in this case it is.)
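As a small illustration of this alignment rule, here is a hypothetical function (not from the text) whose where clause defines two helpers; note that root and disc begin in the same column.

```haskell
-- The larger root of a quadratic a*x^2 + b*x + c, using two
-- helper definitions in a where clause. Both root and disc
-- must start in the same column.
largerRoot :: Double -> Double -> Double -> Double
largerRoot a b c = (-b + root) / (2 * a)
  where root = sqrt disc
        disc = b * b - 4 * a * c
```

For example, largerRoot 1 (-3) 2 evaluates to 2.0, the larger root of x² − 3x + 2.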
Sometimes, we may want the value of the function to be computed different ways in different circumstances. For example, suppose we want a distance function to compute the distance a vehicle has gone after t seconds, supposing that the vehicle starts at a standstill and accelerates at a rate of a m/s², up to a terminal velocity of v_{max} m/s. The computation of the distance will depend on whether enough time has elapsed for the vehicle to reach its terminal velocity, which will occur at time v_{max} / a. We can make this determination using an if–then–else expression.
distance a vmax t = if t < timeToMax
                      then 0.5 * a * square t
                      else vmax * (t - 0.5 * timeToMax)
  where timeToMax = vmax / a
(The line breaks in this definition are present only because otherwise the line would be too long. The Haskell parser ignores them.) In this case, if t is too small for the vehicle to reach the terminal velocity, then we compute the distance based purely on a constant acceleration; but if t is larger than that, then we deduct from t half of the time to terminal velocity (since the average speed during this time is half of v_{max}) and multiply that by v_{max}.
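To check our understanding, we can repeat the definition (with square included so the fragment stands alone) and try a couple of sample calls.

```haskell
square :: Double -> Double
square x = x * x

-- Distance traveled after t seconds, accelerating at a m/s^2
-- up to terminal velocity vmax m/s.
distance :: Double -> Double -> Double -> Double
distance a vmax t = if t < timeToMax
                      then 0.5 * a * square t
                      else vmax * (t - 0.5 * timeToMax)
  where timeToMax = vmax / a
```

With a = 2 and vmax = 10, terminal velocity is reached at t = 5; so distance 2 10 3 gives 9.0 (still accelerating), while distance 2 10 7 gives 45.0.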
The if–then–else construct looks similar to the if–then–else statements that commonly occur in imperative languages, but don't let the syntax fool you into thinking they are the same. Haskell's if–then–else expression is always for computing a value, and so the then and the else must both contain expressions whose values have the same type. One consequence is that in Haskell, the else expression is always required. In contrast, if statements in imperative languages typically have statements to perform in the then and else blocks; neither block has any type. Actually, if you want an analogue to Haskell's if–then–else in imperative languages, you should think of the ?…: ternary operator in C/C++/Java.
In addition to the if–then–else expression for testing conditions, Haskell also defines guard notation, where you can place conditions on the parameters before specifying the return value. This idea is inspired by mathematical notation for handling conditions:
½ a t²  if t < t_{max} 
v_{max} (t − ½ t_{max})  otherwise 
Haskell has the condition precede the return-value expression, but otherwise it looks quite similar to the above mathematical notation.
distance a vmax t | t < timeToMax = 0.5 * a * square t
                  | otherwise     = vmax * (t - 0.5 * timeToMax)
  where timeToMax = vmax / a
On a call to this distance function, the function would receive the parameter values, then step through the guards to see which is the first to be true for those parameters (with otherwise being a catch-all), and the return value would be the value of the expression following that guard.
Throughout the preceding discussion, we have used the boolean type without defining it. All conditions — whether within the if of an if–then–else expression or as a guard expression — must have a boolean type. Haskell calls this type Bool, and its two truth values are represented as True and False. The operators dealing with Booleans are the comparison operators: ==, /= (not-equals), <, <=, >, and >=. These fall below the arithmetic operators in the precedence hierarchy. Below them is logical AND (&&), and below that is logical OR (||). There is also logical NOT, implemented as the regular function not.
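As a small exercise with Bool, here is a hypothetical leap-year predicate (not from the text) that combines comparisons, &&, ||, and not.

```haskell
-- A year is a leap year when divisible by 4, except that century
-- years must also be divisible by 400.
isLeapYear :: Int -> Bool
isLeapYear y = divisibleBy 4 && (not (divisibleBy 100) || divisibleBy 400)
  where divisibleBy d = rem y d == 0
```

Here isLeapYear 2012 and isLeapYear 2000 are True, while isLeapYear 1900 and isLeapYear 2013 are False.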
To summarize, here are the operators we have seen so far and their precedence levels.
            operators               associativity
  highest   function application    left
            ** (exponentiation)     right
            *, /, `rem`             left
            +, -                    left
            ==, /=, <, <=, >, >=    none
            &&                      right
  lowest    ||                      right
Note that the comparison operators are not associative. If you write 0 <= x < 1, the parser will complain that the order of evaluation is ambiguous. (And of course, if you simply added parentheses to get (0 <= x) < 1, the typechecker would complain that you can't compare a boolean value (the result of 0 <= x) to an integer (1).) You would need to write this using logical AND: 0 <= x && x < 1.
In Haskell, binary operators such as + and * are functions, too, though they are used differently in that their names are placed between the two arguments (infix notation) rather than before them (prefix notation). Haskell has two categories of functions: functions whose names consist solely of symbols (like +), which are normally invoked using infix notation, and functions whose names consist solely of letters (like sqrt), which are normally invoked using prefix notation. (Negation (as in −x) is the sole exception to the rule that all symbolic names are infix operators. Haskell uses - for both negation and subtraction, and of course the subtraction function is an infix operator.)
For cases when this behavior is not desirable (and such cases occur with fair regularity), Haskell allows the programmer to talk about infix operators as prefix operators and vice versa. Functions with symbolic names are by default infix operators, but placing the name within parentheses changes it to a prefix operator.
(*) 3.0 4.0 
returns 12.0 
Likewise, a function with a textual name is a prefix operator, but placing the name in backticks changes it to an infix operator.
3.0 `max` 3.5 
returns 3.5 
17 `rem` 5 
returns 2 
Obviously, this is relevant only for functions that take two arguments. Incidentally, the Prelude defines the infix remainder function (`rem`) to have the same precedence as multiplication and division.
Haskell has another feature for infix operators called the section: You can specify one of the arguments within the parentheses, getting a function that takes the other argument and returns the result of applying the operator to both. The following, for example, defines two functions: half, which returns half its argument, and negate, which returns the negation of its argument.
half = (/2)
negate = (0-)
The following illustrates these being used.
(/2) 13.0 
returns 6.5 
(0-) 15 
returns −15 
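Sections work with either operand of any infix operator; a few more hypothetical examples follow.

```haskell
increment :: Double -> Double
increment = (+ 1)      -- adds 1 to its argument

reciprocal :: Double -> Double
reciprocal = (1 /)     -- fixes the left operand of /

isPositive :: Double -> Bool
isPositive = (> 0)     -- comparison operators form sections too
```

So increment 4.0 gives 5.0, reciprocal 4.0 gives 0.25, and isPositive (-3) gives False.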
One thing that is bound to frustrate imperative programmers is that functional languages don't have loops. They can't: The whole point of a loop is to repeat something until some condition no longer holds, but because functional programs don't have any variables, a condition's value can't change. Thus, whenever we have a repetitive process — and what's the point of programming if we can't mechanize repetitive processes? — the most important technique for iteration in imperative languages isn't available to us.
But there is that other technique — the technique that is denigrated in some circles as confusing, in others as inefficient: recursion. Luckily, it's not as confusing in functional programming, because side effects cause a large part of the confusion in imperative settings. And we'll see later that there isn't necessarily a substantive efficiency concern. At any rate, recursion is the most important technique for iteration in functional languages.
We can illustrate this with the factorial function, which is typically represented as n!, but which we'll represent as fact(n). Mathematicians traditionally define the function recursively, as they define all iterative processes.
1  if n = 0 
n ⋅ fact(n − 1)  otherwise 
In Haskell, this would translate to the following.
fact n | n == 0    = 1
       | otherwise = n * fact (n - 1)
Haskell provides a different way of defining functions called patterns. The idea here is that we write multiple definitions of the function, each using a different pattern for the arguments. For fact, we'll have one pattern that matches an argument of 0 only, while the second pattern matches any value, giving it the name n.
fact 0 = 1
fact n = n * fact (n - 1)
Given a function defined with patterns, the system will translate this to a function that receives the arguments, goes through the various patterns to find the first pattern that matches the arguments, and then computes the return value based on the return value for that pattern.
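Another classic example of pattern-matched definitions (not from the text) is the Fibonacci function, with one pattern per base case.

```haskell
-- fib 0 and fib 1 are base cases; the final pattern matches
-- every other argument.
fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)
```

For example, fib 10 returns 55.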
Pattern definitions become more interesting for multi-argument functions. The following simple function computes the area of a rectangle given its width and height.
area 0 _ = 0
area _ 0 = 0
area w h = w * h
This definition uses the special underscore pattern, called the wildcard, which will match anything (like an identifier) and throw the value away (unlike an identifier). Given this definition, the function would first check whether the width is 0, in which case it returns 0; and if not, then it would check whether the height is 0, in which case it also returns 0; and if neither of those patterns matched, then it would have to multiply the width and the height.
Of course, the above isn't recursive. However, if for some reason we wanted to avoid multiplication, we could make it recursive by shaving off one unit from the width and the height, computing the area of the remainder, and then adding back the area of what we shaved off.
area 0 _ = 0
area _ 0 = 0
area w h = area (w - 1) (h - 1) + (w + h - 1)
(This isn't exactly equivalent to the preceding function, though. One difference arises when given a negative width or height: the first definition would return a number, while the second would recurse infinitely.)
Based on the discussion so far in this document, you should now be able to make full sense of the following, relatively sophisticated Haskell program.
testFactorsFrom i n
  = (i * i > n) || (n `rem` i /= 0 && testFactorsFrom (i + 1) n)
isPrime n = testFactorsFrom 2 n
countPrimesTo n | n < 2     = 0
                | isPrime n = 1 + countPrimesTo (n - 1)
                | otherwise = countPrimesTo (n - 1)
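A couple of sample calls confirm that the program behaves as expected; the definitions are repeated here, with the guards written out, so the fragment stands alone.

```haskell
-- True when n has no factor in the range from i up to sqrt n.
testFactorsFrom :: Integer -> Integer -> Bool
testFactorsFrom i n
  = (i * i > n) || (n `rem` i /= 0 && testFactorsFrom (i + 1) n)

isPrime :: Integer -> Bool
isPrime n = testFactorsFrom 2 n

-- Counts the primes between 2 and n, inclusive.
countPrimesTo :: Integer -> Integer
countPrimesTo n | n < 2     = 0
                | isPrime n = 1 + countPrimesTo (n - 1)
                | otherwise = countPrimesTo (n - 1)
```

For instance, isPrime 91 is False (91 = 7 × 13), and countPrimesTo 100 returns 25, the number of primes up to 100.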
It's good practice when you write a function to declare the type of the function's identifier. If you leave the type declaration out, then the Haskell interpreter/compiler will determine it for you using a process called type inference. Given the square function above, for example, it would notice that you are multiplying something with x, and so x must be a number; and it would notice that the result is returned, and so square must return a number. But giving the types is good practice for two reasons:

It enhances the readability of your program, as a reader doesn't have to figure out what sort of thing each argument is.

It can have a dramatic impact on the efficiency of a program. In the square example, the inferred type will end up being able to take an integer value and produce an integer value, to take a floating-point value and produce a floating-point value, and to take a complex value and produce a complex value. (We'll see how this works later, when we get to the notion of type classes.) In many compilers, though, square will have to determine which multiplication to perform at runtime, which will slow the computation down considerably. By declaring the type of the parameter, square will know the type of its parameter, and it will be able to go directly into the appropriate multiplication routine without any runtime type accesses.
To illustrate, we can define square as a function that takes a floating-point value and returns a floating-point value. To do this, we use the Double type, which represents floating-point values. (We could use the Integer type instead if we wanted to be able to square integers.)
square :: Double -> Double
square x = x * x
The declaration of an identifier's type begins with the identifier name, followed by a double colon, followed by the type name. In this case, we have a function from Doubles to Doubles, and we use the arrow notation to represent this. Type names in Haskell always begin with a capital letter, while identifier names (including function identifiers) must always begin with a lowercase letter — unless, of course, the identifier happens to be an infix operator, in which case it consists entirely of symbols.
What do we do for functions like normal that take multiple parameters? The answer is a bit interesting:
normal :: Double -> (Double -> Double)
What we're saying here is that normal is a function that takes a Double and returns a function that itself takes a Double and produces a Double. This is bound to strike a newcomer to Haskell as peculiar: We were thinking of normal as a function that takes two Doubles and computes a Double value, but now we're defining it as a function that takes a single Double and produces another function. To make sense of this notion, look at how we would call normal.
normal 1.0 1.0 
returns 0.241970724519143 
Before, we thought of this as passing two Double values into normal at once. But that's not how Haskell actually interprets this. Instead, it interprets the expression as passing 1.0 into normal, and then passing 1.0 into the result: That is, Haskell interprets it the same as (normal 1.0) 1.0: First, it computes the function that normal produces given 1.0 as its parameter, and then it computes the value that new function produces given 1.0 as its parameter.
Fundamentally, Haskell understands function application as a nameless left-associative binary operator that happens to have the highest precedence.
The arrow operator in type declarations is right-associative. (By contrast, arithmetic operators in expressions are left-associative: That is, given 2 − 3 − 4, both traditional mathematics and Haskell process it left to right, computing (2 − 3) − 4 rather than 2 − (3 − 4).) Because of this, we can omit the parentheses in our type declaration for normal, and Haskell will assume we placed parentheses around the rightmost portion.
normal :: Double -> Double -> Double
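This view of normal pays off in practice: applying it to just one argument yields a new function. A sketch (repeating square and normal so the fragment stands alone, with the density written with its negative exponent):

```haskell
square :: Double -> Double
square x = x * x

normal :: Double -> Double -> Double
normal sigma x = exp (-0.5 * square (x / sigma)) / (sqrt (2 * pi) * sigma)

-- Applying normal to only its first argument gives the density
-- function of the standard normal distribution (sigma = 1).
standardNormal :: Double -> Double
standardNormal = normal 1.0
```

Here standardNormal 1.0 gives the 0.241970724519143 we saw above.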
This concept of functions that produce other functions can be used within Haskell. For example, suppose we want to define threshold, which takes a number x and returns x if x is positive and 0 otherwise. We can give a simple definition of such a function using the Prelude's max function.
threshold x = max 0 x  -- one definition
But an even simpler definition is:
threshold = max 0      -- alternative definition
Here, we are defining threshold as the function that max returns given 0. Of course, this function computed by max will take a number and return the maximum of 0 and that number, which is exactly what we want threshold to do.
Another example is the following double function that multiplies its argument by 2; this definition encloses the multiplication operator in parentheses so that we can treat it just like regular two-argument functions.
double = (*) 2
The transformation of a function from one that takes multiple arguments to produce a value into one that takes a single argument and produces another function to take the next argument is called currying. And a function, such as normal, that works this way is called a curried function. These terms derive from the name of Haskell Curry, who is also the namesake of the Haskell language. Curry didn't invent currying, but he used it intensively enough that the idea became named after him. Curry, who died in 1982, predates the invention of Haskell, introduced in 1990. But he was instrumental in defining many of the concepts that underlie functional languages in the early 20th century.
One of the interesting features of Haskell — as with most functional languages — is that it treats functions as regular data, just like integers, floating-point values, or other types. Language specialists say that functions are first-class data types, meaning that programs can work with them just as well as they can with other data.
Curried functions — where a function takes a parameter and produces another function — are one facet of this. Another facet is that functions can take other functions as parameters. For example, we might want to estimate the area under a function curve (i.e., the integral) using the trapezoid method, where we divide the range of interest (a, b) into equal-sized pieces.
We can easily write such a function, taking the function to be integrated as a parameter, along with the width of the trapezoids delta and the range of interest (a, b).
integrate :: Double -> (Double -> Double) -> Double -> Double -> Double
integrate delta f a b | a + delta > b = (b - a) * (f a + f b) / 2
                      | otherwise     = delta * (f a + f a') / 2
                                          + integrate delta f a' b
  where a' = a + delta
To find the areas below the function √x and the function e^{x} for x in the range (0, 1), we would use the expressions below.
integrate 0.01 sqrt 0 1 
returns 0.666462947103147 
integrate 0.01 exp 0 1 
returns 1.71829614745042 
For those whose integral calculus is rusty or nonexistent, let me add that the exact values are 2/3 and e ≈ 1.71828. The values we got were pretty close.
Incidentally, the above definition is somewhat poor because it ends up evaluating the function twice at each point, which slows it down. Fixing this is just a matter of adding a helper function that takes f(a) for the current a as a parameter.
integrate :: Double -> (Double -> Double) -> Double -> Double -> Double
integrate delta f a b = sub a (f a)
  where sub a fa | a + delta > b = (b - a) * (fa + f b) / 2
                 | otherwise     = delta * (fa + fa') / 2 + sub a' fa'
          where a'  = a + delta
                fa' = f a'
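As a sanity check, here is the improved definition again (with the argument order matching the earlier calls, delta first) together with the integrals we computed before.

```haskell
-- Trapezoid-method integration of f over (a, b) with step delta;
-- the helper sub carries f a along so f is evaluated once per point.
integrate :: Double -> (Double -> Double) -> Double -> Double -> Double
integrate delta f a b = sub a (f a)
  where sub a fa | a + delta > b = (b - a) * (fa + f b) / 2
                 | otherwise     = delta * (fa + fa') / 2 + sub a' fa'
          where a'  = a + delta
                fa' = f a'
```

With this version, integrate 0.01 sqrt 0 1 again comes out near 2/3, and integrate 0.01 exp 0 1 near e − 1 ≈ 1.71828, just as with the first version.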
We can also define functions that take a function and produce a function as a result. For example, we might want to define a function threshold that takes a value z and a function f and returns a new function g that is identical to f except that it is clipped off wherever f goes above z.
Essentially, we want the return function g to be one that takes a value x and returns the minimum of z and f(x). We can easily write exactly that in Haskell.
threshold :: Double -> (Double -> Double) -> (Double -> Double)
threshold z f = g where g x = min z (f x)
In writing the type for threshold, note that the final set of parentheses is unnecessary, since the arrow in type declarations is right-associative anyway. And, indeed, this fact reveals that we could equivalently define threshold as follows, though it's not quite as readable as our first try.
threshold :: Double -> (Double -> Double) -> Double -> Double
threshold z f x = min z (f x)
The Prelude defines a handful of functions that take functions as parameters and produce other functions. One of the most useful of these is the composition operator. Given two functions f and g, the composition of f and g is represented mathematically as f ∘ g and defined as the function h where h(x) = f(g(x)). In Haskell, the composition operator is written using the period (.), but otherwise it's the same.
f . g = h where h x = f (g x)  -- as defined in the Prelude
The composition operator is very high in the precedence hierarchy — just below function application, above the arithmetic operators.
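To get a feel for it, here are a couple of hypothetical compositions (with square repeated so the fragment stands alone).

```haskell
square :: Double -> Double
square x = x * x

-- abs runs first, then sqrt: rootOfMagnitude x = sqrt (abs x).
rootOfMagnitude :: Double -> Double
rootOfMagnitude = sqrt . abs

-- Compositions chain right to left: add 1, then square, then negate.
negSquareSucc :: Double -> Double
negSquareSucc = (0 -) . square . (+ 1)
```

So rootOfMagnitude (-16) gives 4.0, and negSquareSucc 2 gives -9.0.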
We can use the composition operator to shorten our definition of threshold yet further. First, we'll add parentheses to our second definition from above to make the currying clear.
threshold z f x = (min z) (f x)
Here we have something that looks very similar to the h returned by function composition: We have _ (_ x). This is the same as (_ . _) x.
threshold z f x = ((min z) . f) x
Finally, since the final parameter is simply passed into another function, we can take it off both sides to arrive at:
threshold z f = (min z) . f
And this makes sense: Given a value x, we'll first apply f to it to arrive at f(x), and then we'll pass that as the second argument to min to get the thresholded value.
It's an interesting diversion to extend the process further to remove the other parameters to threshold — though it is just a diversion, since it doesn't clarify the function's purpose.
First, we'll convert the usage of the composition operator to prefix notation.
threshold z f = (.) (min z) f
Now because (.) is curried, we can observe that this is the same as
threshold z f = ((.) (min z)) f
Now we're in the same situation as earlier — the final parameter is simply passed into a function — and so we can strip off the final parameter.
threshold z = (.) (min z)
To get rid of the z parameter, we simply observe that this definition itself looks like function composition: We apply min to z, and then we apply the function composition operator to the result. So we can rewrite the above as
threshold z = ((.) . min) z
And now the final parameter is simply passed into another function, so we can strip off that parameter too.
threshold = (.) . min
Writing threshold this way would get you no points for readability, but you could feel pride in being brief!
I hasten to add that there are some important limits on what you can do with functions: You can't test two functions to see if they're equal — i.e., if they work the same. In fact, determining this is mathematically impossible (in the same sense that the halting problem is uncomputable). Also, functions can't be converted into a reasonable string representation, and so any attempts to display them will fail. But Haskell regards these operations as library features, not a feature of the base language, and in terms of the base language, we can do anything with functions that we can with other data.
Many imperative languages, by contrast, do not give functions such first-class status. If you're very familiar with C, you might object that it has a function pointer feature, allowing a function to take a function pointer as a parameter, to call the function referred to by that pointer, and to return a function pointer. But C's facility, though nice, isn't as strong as Haskell's, for a function pointer can only refer to a preexisting function. In C, you can't write a function like threshold that takes one function as input and produces another function as output. (However, there are a reasonable number of exceptions to the rule: Smalltalk, JavaScript, and Python all have flexibility of function handling that nears the expressiveness of Haskell.)
In our function types thus far, we've worked with specific types. But Haskell also has a feature where functions can work with arbitrary types. Consider the following translate function, which takes a function f and a distance d and returns a new function g that is f translated d units to the right.
translate f d = g where g x = f (x - d)
One way to define the type of this function is to restrict f to be a mapping from Doubles to Doubles.
translate :: (Double -> Double) -> Double -> (Double -> Double)
But this is not as general as possible. It's true that f must take a number as a parameter — after all, we pass the result of a subtraction into f — but its return value need not be a number. We might also want to translate a function that maps numbers to Booleans — that essentially identifies a subset of the real number line, like the set of numbers x for which sin x < 0.5. But the type declaration as written above would disallow this.
Enter polymorphic functions, where we can have type variables (there's that nasty word again) that represent an abstract type based on how the function is actually used. In Haskell, each type variable is generally a lowercase letter.
translate :: (Double -> a) -> Double -> (Double -> a)
In this case, we're saying to let a be whatever the return type of the first parameter function is; the final return value would be a function from Doubles to a's. With this type declaration (or with the type that would be inferred were we to omit it), we could use translate to translate any function taking a Double as a parameter — whether the return value is a Double, a Bool, or even another function.
The composition function (.) is an interesting case to study with polymorphic types. Recall that we defined it as the following.
f . g = h where h x = f (g x)  -- as defined in the Prelude
Now what's the most general type we could assign to this function? We'll start with x, the parameter to h, and assign it a type variable a. Notice that x is passed to g also, so g must also take a parameter of type a. Let's suppose that g returns a value of type b (which may or may not be the same as a). To compute h, the return value of g is passed into f, so f's parameter type must be b. Whatever f returns — call it c — is likewise returned by h. This leads to the following type declaration.
(.) :: (b -> c) -> (a -> b) -> (a -> c)
This is a pretty general type!