2013-03-18
We will use the Glasgow Haskell Compiler (GHC).
There are a couple of Haskell implementations, but GHC is the go-to implementation for stability and support for language extensions.
Most of the Haskell research and development is centered around GHC.
The current stable version of GHC is version 7.6.2 (released in January 2013); The next major stable version 7.8.1 of GHC will most likely be released this Summer.
Stable major GHC versions are released once a year currently (usually around ICFP).
Information about other Haskell implementations can be found at http://www.haskell.org/haskellwiki/Implementations
GHC-based common platform for using and developing applications in Haskell.
Attempt to reproduce Python's "Batteries Included" principle.
Currently available for Linux 32/64-bit, OSX 32/64-bit, and Windows 32-bit.
Current release 2012.4 still based on old GHC 7.4.2; Release 2013.2 (due in May 2013) will be GHC 7.6.2 based.
Some Linux distributions as Debian/Ubuntu have it in their repositories: apt-get install haskell-platform
Get it at http://www.haskell.org/platform
Nice-to-have features: Integration with GHCi REPL, tab-completion, documentation lookup
The majority of Haskell developers seem to prefer a programmer-friendly text-editor such as Emacs or Vim.
ghc --make -O1 -Wall program.hs
will compile a Haskell program into an executable program
with additional compile-time warnings and a basic optimization level.runghc program.hs
will run the Haskell program by interpretation.Or you can load it up in the GHCi REPL:
$ ghci ./program.hs
GHCi, version 7.6.2: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
[1 of 1] Compiling Main ( program.hs, interpreted )
Ok, modules loaded: Main.
*Main> :main
Hello World!
*Main> _
Note: if you don't see the interpreted
keyword for a module, you might have a compiled module lying around.
Haskell is a language with non-strict semantics (however, lazy evaluation is a popular implementation method to meet the required semantics).
Stated as part of the primary goals in the Preface:
"It should be suitable for teaching, research, and applications, including building large systems."
Defines the basic Haskell language, the Haskell standard library, and the Foreign Function Interace (FFI).
Pragma directive for language extensions
Current version at http://www.haskell.org/onlinereport/haskell2010
Integrates a couple of widely used language extensions such as:
Data.List
)LANGUAGE
Pragmadata Void
)...but also support for some misfeatures is removed/deprecated:
The (n + k) pattern syntax
factorial 0 = 1
factorial (n+1) = (n+1) * factorial n
Datatype context, e.g.:
data Eq a => Set a = Set [a]
The following simple function already shows many language properties:
filter :: (a -> Bool) -> [a] -> [a]
filter p [] = []
filter p (x:xs)
| p x = x : filter p xs
| otherwise = filter p xs
Definable by infixl
, infix
and infixr
declarations
Predefined fixities for standard operators:
infixl 9 !! -- default fixity
infixr 9 .
infixr 8 ^, ^^, ⋆⋆
infixl 7 ⋆, /, `quot`, `rem`, `div`, `mod`
infixl 6 +, -
infixr 5 ++, :
infix 4 ==, /=, <, <=, >=, >, `elem`, `notElem`
infixr 3 &&
infixr 2 ||
infixl 1 >>, >>=
infixr 1 =<<
infixr 0 $, $!, `seq`
Function application binds stronger than level 9
User-defined fixities can be useful in EDSL-like situations, but careless abuse can lead to confusing code easily.
Every value/expression has a type!
()
- "unit" or "0-ary tuple"Bool
- boolean valueChar
- Unicode code-point[Char]
- aka String
- avoid like the plague!Int
- bounded integerInteger
- big-num integerDouble
- IEEE double-precision floating-point numberInt->Bool
- A function from Int
to Bool
The Bool
type definition is nothing special:
data Bool = False | True
deriving (Eq,Ord,Read,Show,Enum,Bounded)
But the Haskell report depends on the Bool
type for some of its definitions (e.g. the if/else
expression).
Some types can't be user defined, e.g. the list type has a special syntax:
-- not valid Haskell98
data [] a = [] | a : [a]
deriving (Eq,Ord,Read,Show)
infixr 5 :
Enumeration types such as Char
which provide literal syntax are special too:
-- pseudo code!
data Char = '\NUL' | ... | ' ' | '!' | '"' | ... | 'A' | ...
deriving (Eq,Ord,Read,Show,Enum,Bounded)
Expressions are said to be in Weak Head Normal Form (WHNF) if the outermost constructor of an expression is "known".
In the WHNF examples below, ⊥
may represent any unevaluated sub-expression aka thunk, including bottom:
(1,True)
True
(⊥,⊥)
[]
(⊥:⊥)
[⊥,1,2,3]
\x -> ⊥ -- the lambda-abstraction "constructor"
filter odd -- partially applied functions
Pattern-matching and seq
are the primitive operations that can "force" a value into WHNF (or a divergence ⊥)
In H98 every type is implicitly inhabited by the special value ⊥ ("bottom").
⊥ :: a
is the only value with an universal unconstrained type.
This is different from a Null-pointer: You can't test if a value is ⊥ in H98.
The value ⊥ represents error values which are technically not distinguishable from non-termination.
The standard Prelude defines the following two functions for creating bottom-values on purpose:
error :: String -> a
undefined :: a
When a ⊥ is forced to WHNF during execution the Haskell program terminates!
A total function may only return ⊥ if its input arguments contain ⊥.
A partial function is a function that is not a total function.
Some partial functions from the standard Prelude:
head :: [a] -> a
maximum :: Ord a => [a] -> a
(!!) :: [a] -> Int -> a
read :: Read a => String -> a
Non-exhaustive pattern matches are primitive source of partial functions, e.g.:
head (x:_) = x
Partial functions should be avoided whenever possible as they are hidden mines waiting to blow up. However, we can't avoid partial functions completely, as this would restrict us to programs which are provably terminating.
seq
The special seq
function is defined as
seq :: a -> b -> b
⊥ `seq` b = ⊥
a `seq` b = b
In practice, this causes a
to be reduced to WHNF when seq a b
is reduced to WHNF as a "side-effect".
Note: seq
breaks η-conversion, i.e. it makes \x → ⊥ x
distinguishable from ⊥
Prelude also defines $!
(c.f. $
) operator for simulating strict function application:
($!) :: (a -> b) -> a -> b
f $! x = x `seq` f x
infixr 0 $!
type
- for defining auch type synonymsdata
- for defining fully fledged algebraic types, i.e. multiple constructors and fieldsnewtype
- combination of type
and data
; similiar to single-constructor/field data
type but with less overheadTwo different namespaces: Type constructors & Data constructors
data
as well as newtype
allow for labelled fields:
data C = F { f1,f2 :: Int, f3 :: Bool }
f1 :: C -> Int
Labelled fields provide special record update syntax:
foo = F 1 2 3
doo = F { f2 == 33 } -- some fields are undefined!
bar = foo { f2 = 22 }
Warning: Labelled fields can lead to partial functions
Haskell 98 has support for making data fields strict by prepending !
to fields, e.g.
Data C = F Int !Int Bool
The strictness annotation has only an effect for value construction.
The semantics for the value constructors are roughly translated into
F a b c = b `seq` F' a b c -- pseudo-code!
where F'
represents the corresponding primitive value constructor (without strictness).
Kinds are to type expressions what types are to value expressions.
Haskell 98 defines the following kinds:
*
- the primitive kind of nullary type constructors, e.g. Int
or Maybe Char
* → *
- the kind of unsaturaed type constructors which need exactly one type, e.g. Maybe
or []
→
k2 - generally, if k1 and k2 are kinds, then k1 →
k2 is the kind of types that take a type of kind k1 and yield a type of kind k2Language extensions exist in GHC to allow to specify "kind signatures", introduce new primitive kinds beyond *
, or support kind polymorphism.
Which valid implementations are there for the following type-signature?
foo :: (a,b) -> a
foo = ...
Unconstrained type-variables provide the least possible information about a type, because type-variables are universally quantified over all concrete types.
Type-classes provide ad-hoc polymorphism and more information about what can be done with an instance of now constrained type-variable.
Type-classes define type-indexed dictionaries of associated functions/values, thus no value is required for dispatch:
foo :: Int
foo = maxBound
Each additional typeclass constraint gets translated into an implicit parameter for the typeclass dictionary, e.g. the following (notice the bug, btw?):
eqList :: (Eq a) => [a] -> [a] -> Bool
eqList x y = and $ zipWith (==) x y
gets translated into something like
data EqDict a = EqDict { isEq, isNEq :: a->a->Bool }
eqList :: EqDict a -> [a] -> [a] -> Bool
eqList eqDict x y = and $ zipWith (isEq eqDict) x y
Often the compiler is able to inline and optimize the type-class dictionary away.
Haskell 98 supports automatic derivation of instances for Eq
, Ord
, Enum
, Bounded
, Show
, and Read
.
NFData
seq
only provides WHNF reduction, i.e. "shallow" NF
Let's define an operation deepseq
in the style of seq
but which reduces its first argument to NF instead of just WHNF.
deepseq :: a -> b -> b
Alas, there's no way to make it work in H98 with that type-signature.
Typeclasses to the rescue:
class NFData a where
deepseq :: a -> b -> b
Note: This diverges from the actual definition of the real NFData
class
NFData
(cont.)Instances for primitive types can be defined in terms of seq
:
instance NFData Int where deepseq = seq
instance NFData Integer where deepseq = seq
instance NFData Char where deepseq = seq
instance NFData Bool where deepseq = seq
instance NFData Double where deepseq = seq
instance NFData () where deepseq = seq
...
And now for the instances where NF differs from WHNF:
instance NFData a => NFData (Maybe a) where
deepseq Nothing y = y
deepseq (Just x) y = x `deepseq` y
instance (NFData a, NFData b) => NFData (a,b) where
deepseq (x1,x2) y = x1 `deepseq` x2 `deepseq` y
instance NFData a => NFData [a] where
deepseq [] y = y
deepseq (x:xs) y = x `deepseq` xs `deepseq` y
...
Haskell provides a module system that allows to group definitions and declarations into distinct namespaces.
It is possible to restrict the visible names by explicit export lists, allowing e.g. for ADTs by exporting only the type but not the primitive value constructors. Notable exception: Typeclass instances can't be hidden (unless the class is hidden as well).
Other modules can be made visible in the current module scope by import statements, which support qualified/unqalified imports, and/or explicitly naming the names to be imported.
Modules can re-export other modules to help flatten a huge module hierarchy and reduce the number of needed import statements for users of the library.
A Haskell file missing the module
header gets prepended implicitly:
module Main(main) where
Simple example showing some features:
Module Foo.Bar (module Foo.Bar.Doo, head, f1, f2,
T1, T2(C1), T3(..)) where
import Foo.Bar.Doo -- re-exported as-is
import Foo.Bar.Zoo (f1) -- f1 is re-exported
import qualified Data.Bar.Zoo as Zoo
import Prelude hiding (head)
data T1 = HiddenConstr -- only T1 is exported
data T2 = C1 | C2 -- only T2 and C1 is exported
data T3 = D1 | D2 | D3 -- complete type exported
f2 x = Zoo.f2 $! x
head :: [a] -> Maybe a
head [] = Nothing
head (x:_) = x
So far, we've only talked about pure code.
"If a tree falls in a forest and no one is around to hear it, does it make a sound?"
We need observable effects, such as interacting with the world.
We need an observer: The main :: IO ()
function of the Main
module provide the entry point.
Smallest valid trivial H98 program (but no effects):
main = return ()
And here's the canonical hello-world program in H98:
main = putStrLn "Hello World!"
do
NotationThe do
notation is just syntax sugar that gets translated to applications of the typeclass methods >>
, >>=
, fail
, and some use of lambda abstractions. For instance,
main = do putStr "What is your first name? "
first <- getLine
putStr "And your last name? "
last <- getLine
let full = first++" "++last
putStrLn ("Pleased to meet you, "++full++"!")
can be translated into
main = putStr "What is your first name? " >>
getLine >>= \first ->
putStr "And your last name? " >>
getLine >>= \last ->
let full = first++" "++last in
putStrLn ("Pleased to meet you, "++full++"!")
Btw, x >> y
is just a shortcut for x >>= \_ -> y
IO
typeMonad
typeclass (of which IO
happens to be an instance), except for Haskell providing the do
syntax sugar.The really magic thing is the IO
type (with kind * → *
), which happens to be an instance of the Monad
typeclass.
The IO a
type can be imagined as representing:
type IO a = World -> (a, World)
That strange World
represents the state of the entire world!
IO
type (cont.)Consider the two basic I/O functions:
getChar :: IO Char
putChar :: Char -> IO ()
To combine them, we need a sequencing combinator
(>>=) :: IO a -> (a -> IO b) -> IO b
IO
type (cont.)Figuratively speaking, main
is provided with an initial World
-state and then the runtime-system forces the evaluation of the final World
-state.
A value of type IO a
is an action that when performed or run yields some result (of type a
) and maybe some side-effect. Forcing a value of type IO a
to WHNF (e.g. via seq
) just acts at the glueing-level (i.e. forcing the >>=
composition to get evaluated).
This a convenient abstraction to integrate side-effects into a purely functional language such as Haskell with lazy evaluation.
Important property: Impure computations such as I/O have to be declared explicitly at the type-level.
Into which language subset can Haskell 98 be desugared?
f x
\x → e
case x of { p1 → e1; ... }
data T = C1 | C2 T2
main = ...
The Haskell Language Report refers to a
For instance, the if/else
expression
if e1 then e2 else e3
is simplified into...
For instance, the if/else
expression
if e1 then e2 else e3
is simplified into
case e1 of { True -> e2 ; False -> e3 }
An infix function definition such as
True || _ = True
_ || True = True
_ || _ = False
can be reduced to
(||) = \x1 x2 -> case (x1,x2) of { (True,_) -> True
; (_,True) -> True
; (_,_ ) -> False
}
Later on, we'll discuss a similiar minimal language called "GHC Core".