HackerRank - Expressions V2

May 3, 2015

I’m writing this post to talk about my Haskell solution to a HackerRank problem called Expressions V2. The problem asks you to write an interpreter for a simple calculator language. Furthermore, you are required to do the math modulo $10^9+7$ .

Essentially, there are two parts to this problem. The first is parsing the expression(s). The second is evaluating the parsed expression. I’m going to talk about each part independently. I’ll provide a link to my complete solution at the end.

Parsing

The grammar for the “language” was given in the problem as

Expression ::= Term [+-] Expression
            | Term

Term       ::= Factor [*/] Term
            | Factor

Factor     ::= Number
            | [+-] Factor
            | '(' Expression ')'

The Parsec Library

To parse this grammar, I decided to learn and use Parsec, a parser combinator library. Of course, it shouldn’t have been too hard to hand roll a parser for such a simple grammar, but I wanted to learn how to do it The Right Way™. Not to mention, I had looked into a paper about parser combinators and heard about Parsec already and it sounded pretty cool.

Being a parser combinator library, you make more complex parsers by combining simpler ones. Because the parsers are both applicatives and monads, you can combine them using either style. The monad style can be done using do-expressions which makes the code very readable. However, I decided after reading about it a bit to use the applicative style both for practice and because some people feel it’s more clear. The applicative style looks like this:

number = Num . read <$> many1 digit <* many (char ' ')

Parser Types

My first look was a bit discouraging since the types for the parsers were complex and I was not particularly knowledgeable about the various types such as monads, the state monad, and monad transformers. (I’m no expert now, but they’re not quite as scary anymore.) However, after some experiments in GHCi to sort out the types and how to run a parser, I felt a bit better.

The general type for the Parsec monad is:

data ParsecT s u m a

The ‘T’ at the end indicates that it is a monad transformer. The s is the stream type, which in my case was String; however, it could be a stream of tokens if I had tokenized the input beforehand. The u represents the ‘user data’ type, however, I chose to ignore this part. The m is the monad being wrapped by the ParsecT instance, and finally a is the return type of the ParsecT monad.

While this more general type provides a lot of power, it’s also a lot more complicated than I needed. The library also defines a Parsec type that sets the underlying monad to Identity.

type Parsec s u = ParsecT s u Identity

However, since I was working with Strings, I used the types in Text.Parsec.String, which defines:

type GenParser tok st = Parsec [tok] st
type Parser = Parsec String ()

GenParser reifies the stream to be a list of some tok type. Parser further reifies the stream to be a String ([Char]) and sets the state to () since we don’t need it. Because I wasn’t really clear on the types at the time, I went with GenParser and specified the stream token type as Char and the state as () myself.

Expression Data

Here you can see from the type of my top level parsing function:

    parseExpr :: GenParser Char st Expr

I wrote the parser to return data of type Expr, which I’ve defined below:

type UnaryOp = Integer -> Integer
type BinOp = Integer -> Integer -> Integer

data Expr = Binary BinOp Expr Expr
          | Unary UnaryOp Expr
          | Num Integer

I built up these Exprs as I parsed the input. When I encountered an operator, I would call either Binary or Unary on the Haskell function that the operator represented. Except for /, it was the operator you would expect; / was special because modular division is special. You can see the code for this below:

convert :: Char -> (Expr -> Expr -> Expr)
convert '+' = Binary (+)
convert '-' = Binary (-)
convert '*' = Binary (*)
convert '/' = Binary (\ a b -> a * modpow b (p-2) p)

and

convert :: Char -> (Expr -> Expr)
convert '+' = Unary id
convert '-' = Unary negate

First, note that convert’s type is different between Unary and Binary; because each is defined in different where blocks, this is okay. Also note that Binary and Unary are only partially applied. This means that convert returns a function that still expects 1 or 2 more Exprs (which makes sense given that a unary operation needs 1 operand and a binary operation needs 2).

Similarly, Num was called on numbers as they were encountered:

number = Num . read <$> many1 digit <* many (char ' ')

The remaining Exprs were filled in by either using the chainr1 function, such as below:

parseExpr = parseTerm `chainr1` parseAddSub

or with the code below:

unary = convert <$> oneOf "+-" <*> parseFactor
parenthetical = between (symbol "(") (symbol ")") parseExpr

Evaluation

Once I had my Exprs, the second part of the problem was evaluating them. This turned out to be pretty simple. The evaluation was done in the eval function:

eval :: Expr -> Integer
eval (Num n) = n
eval (Binary op e1 e2) = eval e1 `op` eval e2 `mod` (1000000000+7)
eval (Unary op e1) = op (eval e1)

Notice that each evaluation does the operation mod $10^9+7$ .

There is only one extra aspect of note. Doing modular division is trickier than the other operations. The problem states this about modular division:

$10^9 + 7 \text{ is a prime}$ .
$1/b \equiv b^{−1} \equiv b^{p−2} (mod\ p), \text{ where p is prime and } b < p$ .

Thus, my division operation looks like this:

convert '/' = Binary (\ a b -> a * modpow b (p-2) p)
p = ((10^9 :: Integer) + 7) :: Integer

where modpow is modular exponentiation. To keep things speedy, I used fast modular exponentiation. Below is the definition of modpow:

modpow :: Integer -> Integer -> Integer -> Integer
modpow _ 0 _ = 1
modpow b e m = modpow (b*b `mod` m) (e `div` 2) m * (if e `mod` 2 == 1 then b else 1) `mod` m

Putting It All Together

My main function was actually pretty simple too. It simply read in the expression to be parsed, parsed it and evaluated it. You can see the code below:

main :: IO ()
main = do
    input <- getContents
    case parse parseExpr "" input of
      Left err -> print err
      Right expr -> print (eval expr)

Note that because the parser may fail, it returns an Either which I have to match against.

Pretty simple, right? =)

Now, finally, here is the entire solution including some extra debugging code:

Thomas Manning

Making Connections