2013-04-29
Prelude.head: empty list
exception.foo:11:5-17: Irrefutable pattern failed for pattern ...
Stack traces easy in strict/imperative languages, which maintain an explicit call stack.
Obstacles in Haskell:
Recent GHCs supports printing stack traces in their default exception handler via +RTS -xc
based on cost-centres.
Needs to compile with -prof
enabled and some appropriate -fprof-*
flags.
Use -fno-prof-count-entries
to avoid some of the profiling overhead.
Using different -O
-levels can make the resulting stack trace more or less confusing.
Not available in GHCi yet.
Stack trace can be emitted without exceptions (and w/o +RTS -xc
) by using Debug.Trace.traceStack
GHC 7.8.x will have the following helper defined in GHC.Stack
(code works with GHC 7.6.x as well):
-- | Like the function 'error', but appends a stack trace
-- to the error message if one is available.
errorWithStackTrace :: String -> a
errorWithStackTrace x = unsafeDupablePerformIO $ do
stack <- ccsToStrings =<< getCurrentCCS x
if null stack
then throwIO (ErrorCall x)
else throwIO (ErrorCall (x ++ '\n' : renderStack stack))
Consider the following simple example:
main = do
[x] <- fmap (map read) getArgs
print (head (f x))
f :: Int -> [Int]
f x = map g [ x .. x+10 ]
g :: Int -> Int
g x = 100 `div` x
compiled with --make -O -prof -fprof-auto
and called with 0 -RTS -xc
this causes a division-by-zero exception:
*** Exception (reporting due to +RTS -xc): (THUNK_2_0), stack trace:
GHC.Err.CAF
--> evaluated by: Main.g,
called from Main.f,
called from Main.main,
stack2: divide by zero
This provides a valuable starting point for further debugging.
Further (technical) information:
HIW 2012. Simon Marlow: Why can't I get a stack trace?
printf
-style Debuggingalso called trace debugging
"ancient" form of debugging
sometimes even useful when a proper debugger is available
basic idea: annotate code with statements which log/output information allowing to infer state in program flow.
printf
-style Debugging (cont.)Trivial in Haskell when the IO
monad is accessible.
But what about pure computations?
unsafePerformIO
loop-hole!unsafePerformIO
Recall that the type-systems forbids you to perform side-effects from pure computations.
However, there's a loophole provided: unsafePerformIO
(or unsafeLocalState
in H2010):
unsafePerformIO :: IO a -> a
unsafeDupablePerformIO :: IO a -> a
Primary use: For wrapping FFI functions which behave as if pure (but read/write to "local" memory).
Proof-obligation of purity is delegated to programmer when using unsafePerformIO
(hence the unsafe
).
unsafePerformIO
(cont.)Failing to ensure purity of unsafePerformIO
-wrapped actions leads to very hard to debug bugs.
Don't use unsafePerformIO
unless really necessary (e.g. for FFI) in production code.
However, when debugging unsafePerformIO
may provide a quick'n'dirty tool for triggering side-effects from pure computations.
trace :: String -> a -> a
trace msg val = unsafePerformIO $ do
putStrLn msg
return val
printf
-style Debugging (cont.)GHC provides some helpers in the Debug.Trace
module:
trace :: String -> a -> a
traceShow :: Show a => a -> b -> b
traceStack :: String -> a -> a
traceIO :: String -> IO ()
trace
only triggers side-effect if its result is forced to WHNF.
Useful usage pattern:
someFunc x y z | traceShow (x,y,z) False = undefined
someFunc x y z = ...original definition...
GHCi allows to set breakpoints for interpreted modules:
:break [<mod>] <l> [<col>] set a breakpoint at the specified location
:break <name> set a breakpoint on the specified function
:step single-step after stopping at a breakpoint
:step <expr> single-step into <expr>
:steplocal single-step within the current top-level binding
:stepmodule single-step restricted to the current module
Example session:
qsort [] = []
qsort (a:as) = qsort left ++ [a] ++ qsort right
where (left,right) = (filter (<=a) as, filter (>a) as)
main = print (qsort [8, 4, 0, 3, 1, 23, 11, 18])
Start up GHCi with ghci qsort.hs
:
*Main> :break 2
Breakpoint 0 activated at qsort.hs:2:16-47
*Main> main
Stopped at qsort.hs:2:16-47
_result :: [Integer] = _
a :: Integer = 8
left :: [Integer] = _
right :: [Integer] = _
[qsort.hs:2:16-47] *Main> :list
1 qsort [] = []
2 qsort (a:as) = qsort left ++ [a] ++ qsort right
3 where (left,right) = (filter (<=a) as, filter (>a) as)
More information: The GHCi Debugger chapter in the GHC User's Guide.
cabal install criterion
import Data.Word
import Criterion.Main
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
main = defaultMain [
bgroup "fib" [ bench "10::Int" $ whnf fib (10::Int)
, bench "10::Integer" $ whnf fib (10::Integer)
, bench "10::Word" $ whnf fib (10::Word)
]
]
In order to re-evaluate pure expressions multiple times, the following helpers exist:
whnf
: takes a function and an argument and force result to WHNFnf
: takes a function and an argument and force result to NFwhnfIO
: perform action and force result to WHNFnfIO
: perform action and force result to NFbgroup
and bench
exist for structuring the benchmarks.
Using defaultMain
adds a CLI (try --help
); default output:
estimating clock resolution...
mean is 1.011553 us (640001 iterations)
found 1138453 outliers among 639999 samples (177.9%)
509291 (79.6%) low severe
629162 (98.3%) high severe
estimating cost of a clock call...
mean is 25.23033 ns (7 iterations)
benchmarking fib/10::Int
mean: 916.7958 ns, lb 916.1599 ns, ub 917.9469 ns, ci 0.950
std dev: 4.270614 ns, lb 2.751673 ns, ub 6.970843 ns, ci 0.950
benchmarking fib/10::Integer
mean: 2.841818 us, lb 2.840226 us, ub 2.844350 us, ci 0.950
std dev: 10.10157 ns, lb 7.248909 ns, ub 17.30419 ns, ci 0.950
…
Criterion measures clock resolution and cost in order to improve accuracy of benchmark measurements.
Use -o FILENAME
to generate a HTML report with plots.
For finding out where a program spends of most of its time we need time profiling.
C.f. heap profiling for finding out what memory was allocated for.
-prof
--enable-{library,executable}-profiling
Accounting based on cost-centres.
A cost is simply the time or space required to evaluate an expression.
Stack of enclosing cost centres is maintained for any given expression at run-time (→ call tree).
Expressions can be manually annotated as cost-centres via {-# SCC … #-}
pragmas.
All costs incurred by the annotated expression are assigned to the enclosing cost centre (stack).
Tools like prof2pretty
can annotate hot-spots in source code.
Instead of manually defining all cost-centres via {-# SCC … #-}
one can use -fprof-*
flags as well:
-fprof-auto
: Auto-add SCCs to all bindings not marked INLINE-fprof-auto-top
: Auto-add SCCs to all top-level bindings not marked INLINE-fprof-auto-exported
: Auto-add SCCs to all exported bindings not marked INLINE-fprof-cafs
: Auto-add SCCs to all CAFsFull description in GHC Manual section about cost-centres
How to use:
Compile with profiling and cost-centres:
ghc --make -O -rtsopts -prof -fprof-auto fib.hs
The RTS provides the following profiling +RTS
-options:
-p Time/allocation profile (output file <program>.prof)
-P More detailed Time/Allocation profile
-Pa Give information about *all* cost centres
Run compiled program with e.g.
./fib +RTS -p
...and inspect resulting fib.prof
file.
Example showing usefulness of cost-centre stacks:
main = print (f 25 + g 25)
f n = fib n
g n = fib (n `div` 2)
fib n = if n < 2 then 1 else fib (n-1) + fib (n-2)
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 102 0 0.0 0.0 100.0 100.0
CAF GHC.IO.Handle.FD 128 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding.Iconv 120 0 0.0 0.0 0.0 0.0
CAF GHC.Conc.Signal 110 0 0.0 0.0 0.0 0.0
CAF Main 108 0 0.0 0.0 100.0 100.0
main Main 204 1 0.0 0.0 100.0 100.0
main.g Main 207 1 0.0 0.0 0.0 0.1
fib Main 208 1973 0.0 0.1 0.0 0.1
main.f Main 205 1 0.0 0.0 100.0 99.9
fib Main 206 2692537 100.0 99.9 100.0 99.9
Previously, we used the non-profiling-mode -h(T)
heap profile generation, which classifies by closure-type.
When compiling with -prof
more heap profile variants become available, which break down…
-h(c)
: …by cost-centre stack-hm
: …by module-hd
: …by closure description-hy
: …by type description-hr
: …by retainer (see user's guide)-hb
: …by biography (see user's guide)Sometimes we want/need to interface with non-Haskell code, e.g. for
Currently, only a FFI for C is properly implemented.
Simple example for a pure function:
import Foreign.C -- for CDouble
foreign import ccall unsafe "math.h sin" c_sin :: CDouble -> CDouble
sin :: Double -> Double
sin = realToFrac . c_sin . realToFrac
ccall
defines c calling convention (there's stdcall
for Win32).
unsafe
specifies that the called function won't call back into Haskell.
CDouble
is a foreign type which corresponds exactly to C's double
type.
c_
" prefix."math.h &sin"
) are conventionally prefixed with "p_
".math.h
is optional (and currently ignored by GHC for the ccall
convention)What's wrong with
foreign import ccall unsafe "stdlib.h rand" c_rand :: CUInt
rand()
is not a real functionHere it's done right:
foreign import ccall unsafe "stdlib.h rand"
c_rand :: IO CUInt
foreign import ccall unsafe "stdlib.h srand"
c_srand :: CUInt -> IO ()
Imported C functions which inspect pointers or have internal state have to be marked as being impure by putting them in the IO
monad:
foreign import ccall unsafe "string.h memchr"
c_memchr :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)
However, if you can "contain" the side effect, you can use unsafePerformIO
to provide a pure Haskell wrapper.
Derived from Data.ByteString
:
foreign import ccall unsafe "string.h memchr"
c_memchr :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)
elemIndex :: Word8 -> ByteString -> Maybe Int
elemIndex c (PS x s l) = unsafePerformIO $ withForeignPtr x $ \p -> do
let p' = p `plusPtr` s
q <- c_memchr p' c (fromIntegral l)
return $! if q == nullPtr then Nothing else Just $! q `minusPtr` p'
Sometimes C functions allocate new memory which the caller owns and has to deallocate lateron.
But in Haskell computations might be cancelled at any time which would lead to memory leaks.
The ForeignPtr
facility provides support for integrating with Haskell's GC:
type FinalizerPtr a = FunPtr (Ptr a -> IO ())
newForeignPtr :: FinalizerPtr a -> Ptr a -> IO (ForeignPtr a)
Assume the following stupid C API:
typedef struct foo_s foo_t;
foo_t *foo_new(int);
void foo_free(foo_t *);
where foo_new()
constructs (allocates) a new foo_t
object, and foo_free()
destructs (deallocates) it again.
On the Haskell side, we'd wrap it as follows:
data Foo -- phantom type
foreign import ccall unsafe "foo_new"
c_foo_new :: CInt -> IO (Ptr Foo)
foreign import ccall unsafe "&foo_free"
p_foo_free :: FunPtr (Ptr Foo -> IO ())
mkFoo :: Int -> ForeignPtr Foo
mkFoo i = unsafePerformIO $
newForeignPtr p_foo_free =<< foo_new (fromIntegral i)
{-# NOINLINE mkFoo #-}
Note that mkFoo
provides a pure function API to an impure C API.