Commit

Start on week 3 notes.
athas committed Aug 14, 2024
1 parent d7f0244 commit 0cff51b
Showing 3 changed files with 202 additions and 0 deletions.
91 changes: 91 additions & 0 deletions haskell/QuasiParsec.hs
@@ -0,0 +1,91 @@
-- A simple parsing library that imitates the naming of the Megaparsec
-- API, but is implemented differently. In particular, this is a
-- backtracking-based parser, because this is easier to explain.
module QuasiParsec
  ( -- Parser interface
    Parser,
    parse,
    -- Primitive parsers
    satisfy,
    chunk,
    notFollowedBy,
    choice,
    eof,
    -- Parser combinators
    many,
    some,
  )
where

import Control.Monad (ap)

-- A parser of things
-- is a function from strings
-- to a list of pairs
-- of things and strings
newtype Parser a = Parser {runParser :: String -> [(a, String)]}

parse :: Parser a -> String -> [(a, String)]
parse (Parser f) s = f s

instance Functor Parser where
  fmap f x = do
    x' <- x
    pure $ f x'

instance Applicative Parser where
  (<*>) = ap
  pure x = Parser $ \s -> [(x, s)]

instance Monad Parser where
  Parser f >>= g = Parser $ \s ->
    concatMap
      ( \(x, s') ->
          let Parser g' = g x
           in g' s'
      )
      (f s)

instance MonadFail Parser where
  fail _ = Parser $ \_ -> []

-- Primitive parsers

satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser f
  where
    f [] = []
    f (c : cs) =
      if p c
        then [(c, cs)]
        else []

notFollowedBy :: Parser a -> Parser ()
notFollowedBy (Parser f) = Parser $ \s ->
  case f s of
    [] -> [((), s)]
    _ -> []

choice :: [Parser a] -> Parser a
choice ps = Parser $ \s -> concatMap (\p -> runParser p s) ps

chunk :: String -> Parser String
chunk = sequence . map (satisfy . (==))

eof :: Parser ()
eof = Parser $ \s ->
  case s of
    "" -> [((), "")]
    _ -> []

-- Parser combinators

many :: Parser a -> Parser [a]
many p =
  choice
    [ (:) <$> p <*> many p,
      pure []
    ]

some :: Parser a -> Parser [a]
some p = (:) <$> p <*> many p
23 changes: 23 additions & 0 deletions haskell/Week3.hs
@@ -0,0 +1,23 @@
-- Random code related to Week 3

import Data.Char (isDigit, ord)

charInteger :: Char -> Integer
charInteger c = toInteger $ ord c - ord '0'

readInteger :: String -> Integer
readInteger s = loop 1 $ reverse s
  where
    loop _ [] = 0
    loop w (c : cs) = charInteger c * w + loop (w * 10) cs

readIntegerMaybe :: String -> Maybe Integer
readIntegerMaybe s = loop 1 $ reverse s
  where
    loop _ [] = Just 0
    loop w (c : cs)
      | isDigit c = do
          x <- loop (w * 10) cs
          pure $ charInteger c * w + x
      | otherwise =
          Nothing
88 changes: 88 additions & 0 deletions src/chapter_3.md
@@ -1 +1,89 @@
# Monadic and Applicative Parsing

It is sometimes the case that we have to operate on data that is not
already in the form of a richly typed Haskell value, but is instead
stored in a file or transmitted across the network in some serialised
format, usually in the form of a sequence of bytes or characters.
Although such situations are always regrettable, in this chapter we
shall see a flexible technique for making sense out of unstructured
data: *parser combinators*.

## Parsing Integers Robustly

To start with, consider turning a `String` of digit characters into
the corresponding `Integer`. That is, we wish to construct the
following function:

```Haskell
readInteger :: String -> Integer
```

The function `ord :: Char -> Int` from `Data.Char` can convert a
character into its corresponding numeric code. Exploiting the fact
that the digit characters have consecutive codes, we can write a
function for converting a digit character into its corresponding
`Integer`. Note that we have to convert the `Int` produced by `ord`
into an `Integer`:

```Haskell
import Data.Char (ord)

charInteger :: Char -> Integer
charInteger c = toInteger $ ord c - ord '0'
```
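As a quick sanity check of the digit arithmetic, the following sketch applies `charInteger` to all ten digit characters. The name `digitValues` is introduced here purely for illustration and is not part of the notes.

```Haskell
import Data.Char (ord)

charInteger :: Char -> Integer
charInteger c = toInteger $ ord c - ord '0'

-- Hypothetical helper: the numeric value of every digit character.
-- Because '0' .. '9' have consecutive codes, this is the list 0 .. 9.
digitValues :: [Integer]
digitValues = map charInteger ['0' .. '9']
```

Note that `charInteger` does not check its argument: any character is accepted, and a non-digit simply yields the distance between its code and that of `'0'`.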

With `charInteger` available, we can implement `readInteger` with a
straightforward recursive loop over the characters of the string, from
right to left, where the weight `w` of each digit grows by a factor of
ten at every step:

```Haskell
readInteger :: String -> Integer
readInteger s = loop 1 $ reverse s
  where
    loop _ [] = 0
    loop w (c : cs) = charInteger c * w + loop (w * 10) cs
```

Example use:

```
> readInteger "123"
123
```

However, see what happens if we pass in invalid input:

```
> readInteger "xyz"
8004
```
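To see where `8004` comes from, we can spell out the contribution of each character. This is only an illustrative sketch; the name `xyzTerms` is made up for this example. The characters `'x'`, `'y'`, and `'z'` lie 72, 73, and 74 codes after `'0'`, and they are weighted by 100, 10, and 1 respectively.

```Haskell
import Data.Char (ord)

charInteger :: Char -> Integer
charInteger c = toInteger $ ord c - ord '0'

-- Hypothetical name: the weighted contribution of each character of "xyz",
-- from the most significant position to the least significant one.
xyzTerms :: [Integer]
xyzTerms = zipWith (*) [100, 10, 1] (map charInteger "xyz")
-- xyzTerms == [7200, 730, 74], and sum xyzTerms == 8004
```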

Silently producing garbage on invalid input is usually considered poor
engineering. Instead, our function should return a `Maybe Integer`,
indicating invalid input by returning `Nothing`. This can be done by
using `isDigit` from `Data.Char` to check whether each character is a
proper digit:

```Haskell
readIntegerMaybe :: String -> Maybe Integer
readIntegerMaybe s = loop 1 $ reverse s
  where
    loop _ [] = Just 0
    loop w (c : cs)
      | isDigit c = do
          x <- loop (w * 10) cs
          pure $ charInteger c * w + x
      | otherwise =
          Nothing
```

Note how we are using the fact that `Maybe` is a monad to avoid
explicitly checking whether the recursive call to `loop` fails.
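For comparison, here is a sketch of what the same function might look like with the check written out by hand. The name `readIntegerMaybeExplicit` is hypothetical and exists only to contrast with the monadic version; the behaviour is intended to be identical.

```Haskell
import Data.Char (isDigit, ord)

charInteger :: Char -> Integer
charInteger c = toInteger $ ord c - ord '0'

-- Hypothetical variant, for comparison only: every recursive call is
-- inspected with an explicit case expression instead of do-notation.
readIntegerMaybeExplicit :: String -> Maybe Integer
readIntegerMaybeExplicit s = loop 1 $ reverse s
  where
    loop _ [] = Just 0
    loop w (c : cs)
      | isDigit c =
          case loop (w * 10) cs of
            Nothing -> Nothing -- propagate the failure manually
            Just x -> Just $ charInteger c * w + x
      | otherwise = Nothing
```

The `case` expression is exactly the plumbing that `>>=` for `Maybe` performs for us behind the `do` notation.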

We now obtain the results we would expect:

```
> readIntegerMaybe "123"
Just 123
> readIntegerMaybe "xyz"
Nothing
```
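A few further inputs are worth trying. The sketch below repeats the definitions so that it is self-contained; the name `edgeCases` is made up for illustration.

```Haskell
import Data.Char (isDigit, ord)

charInteger :: Char -> Integer
charInteger c = toInteger $ ord c - ord '0'

readIntegerMaybe :: String -> Maybe Integer
readIntegerMaybe s = loop 1 $ reverse s
  where
    loop _ [] = Just 0
    loop w (c : cs)
      | isDigit c = do
          x <- loop (w * 10) cs
          pure $ charInteger c * w + x
      | otherwise =
          Nothing

-- Hypothetical name: some extra inputs paired with their results.
edgeCases :: [(String, Maybe Integer)]
edgeCases = [(s, readIntegerMaybe s) | s <- ["", "007", "12a", "-5"]]
-- edgeCases == [("",Just 0),("007",Just 7),("12a",Nothing),("-5",Nothing)]
```

As written, the empty string is accepted and yields `Just 0`, while a leading minus sign is rejected because `'-'` is not a digit. Whether either behaviour is desirable depends on the application; the definition above does not take a position on it.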
