Implementing Co, a Small Interpreted Language With Coroutines #1: The Parser

Many major programming languages these days support some lightweight concurrency primitives. The most recent popular ones are Goroutines in Go, Coroutines in Kotlin and Async in Rust. Let’s explore some of these concepts in detail by implementing a programming language with support for coroutines and Go-style channels.

Lightweight Concurrency

Lightweight concurrency has been a popular topic among programmers and programming language designers alike in recent times. Many languages created in the last decade have support for them either natively or using libraries. Some example are:

These examples differ in their implementation details but all of them enable programmers to run millions of tasks concurrently. This capability of being able to do multiple tasks at the same time is called Multitasking.

Multitasking can be of two types:

Coroutines@2 are computations that support cooperative multitasking1. Unlike ordinary Subroutines that execute from start to end and do not hold any state between invocations, coroutines can exit in the middle by calling other coroutines and may later resume at the same point. They also hold state between invocations. They do so by yielding the control of the current running thread so that some other coroutine can be run on the same thread2.

Coroutine implementations often come with support for Channels for inter-coroutine communication. One coroutine can send a message over a channel and another coroutine can receive the message from the same channel. Coroutines and channels together are an implementation of Communicating Sequential Processes (CSP)@5 a formal language for describing patterns of interaction in concurrent systems.

In this series of posts, we implement Co, a small interpreted dynamically-typed imperative programming language with support for coroutines and channels. Haskell is our choice of language to implement the Co interpreter.

Introducing Co

Co has these basic features that are found in many programming languages:

  • Dynamic and strong typing.
  • Null, boolean, string and integer literals and values.
  • Addition and subtraction arithmetic operations.
  • String concatenation operation.
  • Equality and inequality checks on booleans, strings and numbers.
  • Less-than and greater-than comparison operations on numbers.
  • Function declarations and calls.
  • First class functions.
  • Variable declarations, usage and assignments.
  • if and while statements.

It also has these special features:

  • yield statement to yield the current thread of computation (ToC).
  • spawn statement to start a new ToC.
  • First class channels with operators to send and receive values over them.

Let’s see some example code in Co for illustration:

// Fibonacci numbers 
// using a while loop
var a = 0;
var b = 1;
var j = 0;
while (j < 6) {
  print(a);
  var temp = a;
  a = b;
  b = temp + b;
  j = j + 1;
}

// Fibonacci numbers 
// using recursive function call
function fib(n, f) {
  if (n < 2) {
    return n;
  }
  return f(n - 2, f) 
    + f(n - 1, f);
}

var i = 0;
while (i < 6) {
  print(fib(i, fib));
  i = i + 1;
}

As you may have noticed, Co’s syntax is heavily inspired by JavaScript. The code example above computes and prints the first six Fibonacci numbers in two different ways and demonstrates a number of features of Co, including variable declarations and assignments, while loops, if conditions, and function declarations and calls3.

We can save the code in a file and run it with the Co interpreter4 to print this (correct) output:

0
1
1
2
3
5
0
1
1
2
3
5

The next example show the usage of coroutines in Co:

function printNums(start, end) {
  var i = start;
  while (i < end + 1) {
    print(i);
    yield;
    i = i + 1;
  }
}

spawn printNums(1, 4);
printNums(11, 15);

Running this code with the interpreter prints this output:

1
11
2
12
3
13
4
14
15

The printNum function prints numbers between the start and end arguments, but yields the ToC after each print. Notice how the prints are interleaved. This is because the two calls to the function printNums run concurrently in two separate coroutines.

The next example show the usage of channels in Co:

var chan = newChannel();

function player(name) {
  while (true) {
    var n = <- chan;
    if (n == "done") {
      print(name + " done");
      return;
    }
    
    print(name + " " + n);
    if (n > 0) {
      n - 1 -> chan;
    }
    if (n == 0) {
      print(name + " done");
      "done" -> chan;
      return;
    }
  }
}

spawn player("ping");
spawn player("pong");
10 -> chan;

This is the popular Ping-pong benchmark for communication between ToCs. Here we use channels and coroutines for the same. Running this code prints this output:

ping 10
pong 9
ping 8
pong 7
ping 6
pong 5
ping 4
pong 3
ping 2
pong 1
ping 0
ping done
pong done

We’ll revisit these examples later during the course of building our interpreter, and understand them as well as the code that implements them.

The Co Interpreter

The Co interpreter works in two stages:

  1. Parsing: a parser converts Co source code to Abstract Syntax Tree (AST).
  2. Interpretation: a tree-walking interpreter walks the AST, executes the instructions and produces the output.

Stages of the Co interpreter

In this post, we implement the parser for Co. In the second part, we create a first cut of the interpreter that supports the basic features of Co. In the third part, we extend the interpreter to add support for coroutines and channels.

The complete code for the parser is here. You can load it in GHCi using stack (by running stack co-interpreter.hs), and follow along while reading this article.

Let’s start with listing the extensions and imports needed5.

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE RecordWildCards #-}

import Control.Monad (forM_, unless, void, when)
import Control.Monad.Base (MonadBase)
import Control.Monad.Combinators.Expr (Operator (..), makeExprParser)
import Control.Monad.Cont (ContT, MonadCont, callCC, runContT)
import Control.Monad.Except (ExceptT, MonadError (..), runExceptT)
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.State.Strict (MonadState, StateT, evalStateT)
import qualified Control.Monad.State.Strict as State
import Data.IORef.Lifted
import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)
import Data.Sequence (ViewL ((:<)), (|>))
import qualified Data.Sequence as Seq
import Data.Void (Void)
import Text.Megaparsec hiding (runParser)
import Text.Megaparsec.Char
import qualified Text.Megaparsec.Char.Lexer as L
import Text.Pretty.Simple
  ( CheckColorTty (..),
    OutputOptions (..),
    defaultOutputOptionsNoColor,
    pPrintOpt,
  )

Next, let’s take a look at the Co AST.

The Co AST

Since Co is an imperative programming language, it is naturally Statement oriented. A statement describes how an action is to be executed. A Co program is a list of top-level statements.

Statements have internal components called Expressions. Expressions evaluate to values at program run time. Let’s look at them first.

Expressions

We represent Co expressions as a Haskell Algebraic data type (ADT) with one constructor per expression type:

data Expr
  = LNull
  | LBool Bool
  | LStr String
  | LNum Integer
  | Variable Identifier
  | Binary BinOp Expr Expr
  | Call Identifier [Expr]
  | Receive Expr
  deriving (Show, Eq)

type Identifier = String

data BinOp = Plus | Minus | Equals | NotEquals | LessThan | GreaterThan
  deriving (Show, Eq)
LNull

The literal null. Evaluates to the null value.

LBool Bool

The boolean literals, true and false. Evaluate to their counterpart boolean values.

LStr String

A string literal like "towel". Evaluates to a string.

LNum Integer

An integer literal like 1 or -5. Evaluates to an integer.

Variable Identifier

A variable named by an identifier like a1 or sender. Evaluates to the variable’s value at the point in the execution of code. An Identifier is a string of alphanumeric characters, starting with an alpha character.

Binary Op Expr Expr

A binary operation on two expressions. Example: 1 + 41 or 2 == "2". Supported binary operations are defined by the BinOp enum: addition/concatenation, subtraction, equality and inequality checks, and less-than and greater-than comparisons.

Call Identifier [Expr]

Calls a function named by an identifier, with the given argument expressions. Example: calcDistance(start, end).

Receive Expr

Receives a value from a channel. Examples:

// receive a value from the channel and prints it
print(<- someChannel);
// receive a value from the channel and assigns it
var x = <- someChannel;

Next, we see how statements are represented in the AST.

Statements

We represent the Co statements as a Haskell ADT with one constructor per statement type:

data Stmt
  = ExprStmt Expr
  | VarStmt Identifier Expr
  | AssignStmt Identifier Expr
  | IfStmt Expr [Stmt]
  | WhileStmt Expr [Stmt]
  | FunctionStmt Identifier [Identifier] [Stmt]
  | ReturnStmt (Maybe Expr)
  | YieldStmt
  | SpawnStmt Expr
  | SendStmt Expr Identifier
  deriving (Show, Eq)

type Program = [Stmt]
ExprStmt Expr

A statement with just an expression. The expression’s value is thrown away after executing the statement. Example: 1 + 2;

VarStmt Identifier Expr

Defines a new variable named by an identifier and sets it to an expression’s value. In Co, variables must be initialized when being defined. Example: var a = 5;

AssignStmt Identifier Expr

Sets an already defined variable named by an identifier to an expression’s value. Example: a = 55 + "hello";

IfStmt Expr [Stmt]

Executes a list of statements if the condition expression evaluates to a truthy value. In Co, null and false values are non-truthy, and every other value is truthy. Also note that, there are no else branches for if statements in Co. Example:

if (a == 1) {
  print("hello");
}
WhileStmt Expr [Stmt]

Executes a list of statements repeatedly until the condition expression evaluates to a truthy value. Example:

var n = 0;
while (n < 5) {
  print(n);
  n = n + 1;
}
FunctionStmt Identifier [Identifier] [Stmt]

Defines a function with a name, a list of parameter names, and a list of body statements. Example:

function greet(greeting) {
  print(greeting + " world");
}
ReturnStmt (Maybe Expr)

Returns from a function, optionally returning an expression’s value. Example:

function square(x) {
  return x * x;
}
YieldStmt

Suspends the currently executing ToC so that some other ToC may run. The current ToC resumes later from statement next to the yield statement. Example:

function printNums(start, end) {
  var i = start;
  while (i < end + 1) {
    print(i);
    yield;
    i = i + 1;
  }
}
SpawnStmt Expr

Starts a new ToC in which the given expression is evaluated, which runs concurrently with all other running ToCs. Example:

spawn printNums(1, 4);
printNums(11, 15);
SendStmt Expr Identifier

Sends an expressions’s value to a channel named by an identifier. Example: 1 + square(3) -> someChannel;

The Co AST is minimal but it serves our purpose. Next, let’s figure out how to actually parse source code to AST.

Parsing

Parsing is the process of taking textual input data and converting it to a data structure—often a hierarchal structure like AST—while checking the input for correct syntax. There are many ways of writing parsers6, Parser Combinators@10 being one of them. Parser combinators libraries are popular in Haskell because of their ease of use and succinctness. We are going to use one such library, Megaparsec, to create a parser for Co.

Let’s start with writing some basic parsers for the Co syntax using the Megaparsec parsers.

type Parser = Parsec Void String

sc :: Parser ()
sc = L.space (void spaceChar) lineCmnt blockCmnt
  where
    lineCmnt = L.skipLineComment "//"
    blockCmnt = L.skipBlockComment "/*" "*/"

lexeme :: Parser a -> Parser a
lexeme = L.lexeme sc

symbol :: String -> Parser String
symbol = L.symbol sc

parens, braces :: Parser a -> Parser a
parens = between (symbol "(") (symbol ")")
braces = between (symbol "{") (symbol "}")

semi, identifier, stringLiteral :: Parser String
semi = symbol ";"
identifier = lexeme ((:) <$> letterChar <*> many alphaNumChar)
stringLiteral = char '"' >> manyTill L.charLiteral (char '"') <* sc

integer :: Parser Integer
integer = lexeme (L.signed sc L.decimal)

In the above code, Parser is a concise type-alias for our parser type7.

The sc parser is the space consumer parser. It dictates what’s ignored while parsing the sources code. For Co, we consider the Unicode space character and the control characters—tab, newline, carriage return, form feed, and vertical tab—as whitespaces. We use the spaceChar parser to configure that. sc also lets us configure how to ignore line comments and block comments while parsing.

The lexeme combinator is for parsing Lexemes while ignoring whitespaces and comments. It is implemented using the lexeme combinator from Megaparsec.

The symbol combinator is for parsing symbols, that is, verbatim strings like if and ;. It is implemented using the symbol combinator.

The parens and braces combinators are for parsing code surrounded by parentheses ( and ) and braces { and } respectively.

The semi parser matches a semicolon ;. The identifier parser is for parsing identifiers in Co. The stringLiteral parser matches a string literal like "polyphonic ringtone". integer is the parser for Co integers.

Let’s also write some functions to run the parsers and pretty-print the output in GHCi:

runParser :: Parser a -> String -> Either String a
runParser parser code = do
  case parse parser "" code of
    Left err -> Left $ errorBundlePretty err
    Right prog -> Right prog

pPrint :: (MonadIO m, Show a) => a -> m ()
pPrint =
  pPrintOpt CheckColorTty $
    defaultOutputOptionsNoColor
      { outputOptionsIndentAmount = 2,
        outputOptionsCompact = True,
        outputOptionsCompactParens = True
      }

That completes our basic setup for parsing. Let’s try them out in GHCi now:

> runParser identifier "num1 "
Right "num1"
> runParser stringLiteral "\"val\"  "
Right "val"
> runParser integer "1  "
Right 1
> runParser integer "-12  "
Right (-12)

They work as expected. Next, off to parsing Co expressions.

Parsing Expressions

Parsing Co expressions to AST requires us to handle the Associativity and Precedence of the operators. Fortunately, Megaparsec makes it easy with the makeExprParser combinator. makeExprParser takes a parser to parse the terms and a table of operators, and creates the expression parser for us.

  • Terms are parts of expressions which cannot be broken down further into sub-expressions. Examples in Co are literals, variables, groupings and function calls.
  • The table of operators is a list of Operator Parser Expr lists ordered in descending precedence. All operators in one list have the same precedence but may have different associativity.

This is a lot to take in but looking at the code makes it clear. First, the operator table:

operators :: [[Operator Parser Expr]]
operators =
  [ [Prefix $ Receive <$ symbol "<-"],
    [ binary Plus $ symbol "+",
      binary Minus $ try (symbol "-" <* notFollowedBy (char '>'))
    ],
    [ binary LessThan $ symbol "<",
      binary GreaterThan $ symbol ">"
    ],
    [ binary Equals $ symbol "==",
      binary NotEquals $ symbol "!="
    ]
  ]
  where
    binary op symP = InfixL $ Binary op <$ symP

The prefix operator <-, for receiving values from channels, is of highest precedence and hence the first in the table. Next are the binary + and - operators which are of same precedence. After that come the comparison operators < and >. Finally, we have the lowest precedence operators, the equality and inequality checks = and !=.

Each operator also contains the parser to parse the operator symbol in the source code. All of them are self-explanatory except the parser for the - operator. The - parser is slightly complicated because we want it to not parse ->, the channel send operator, the first character of which is same as the symbol for the - operator.

Moving on to the term parser next:

term :: Parser Expr
term =
  LNull <$ symbol "null"
    <|> LBool True <$ symbol "true"
    <|> LBool False <$ symbol "false"
    <|> LStr <$> stringLiteral
    <|> LNum <$> integer
    <|> try (Call <$> identifier <*> parens (sepBy expr (char ',' *> sc)))
    <|> Variable <$> identifier
    <|> parens expr

The term parser is a combination of smaller parsers—one for each type of terms in Co—combined together using the Alternative instance of the parsers. It matches each parser one by one from the top until the match succeeds8. First, it tries to match for literals null, true, and false, failing which it matches for string and integer literals. Then it matches for function calls, variables, and expressions grouped in parentheses—in that order.

That’s it! We finally write the expression parser:

expr :: Parser Expr
expr = makeExprParser term operators

Let’s play with it in GHCi:

> pPrint $ runParser expr "1 + a < 9 - <- chan"
Right
  ( Binary LessThan
    ( Binary Plus ( LNum 1 ) ( Variable "a" ) )
    ( Binary Minus ( LNum 9 ) ( Receive ( Variable "chan" ) ) ) )
> pPrint $ runParser expr "funFun(null == \"ss\" + 12, true)"
Right
  ( Call "funFun"
    [ Binary Equals LNull
      ( Binary Plus ( LStr "ss" ) ( LNum 12 ) )
    , LBool True ] )
> pPrint $ runParser expr "-99 - <- chan + funkyFun(a, false, \"hey\")"
Right
  ( Binary Plus
    ( Binary Minus ( LNum ( - 99 ) ) ( Receive ( Variable "chan" ) ) )
    ( Call "funkyFun" [ Variable "a", LBool False, LStr "hey" ] ) )

Done! Onward, to parsing Co statements.

Parsing Statements

For parsing statements, we reuse the same trick we used for parsing expressions: combine smaller parsers for each statement type using Alternative.

stmt :: Parser Stmt
stmt =
  IfStmt <$> (symbol "if" *> parens expr) <*> braces (many stmt)
    <|> WhileStmt <$> (symbol "while" *> parens expr) <*> braces (many stmt)
    <|> VarStmt <$> (symbol "var" *> identifier) <*> (symbol "=" *> expr <* semi)
    <|> YieldStmt <$ symbol "yield" <* semi
    <|> SpawnStmt <$> (symbol "spawn" *> expr <* semi)
    <|> ReturnStmt <$> (symbol "return" *> optional expr <* semi)
    <|> FunctionStmt
      <$> (symbol "function" *> identifier)
      <*> parens (sepBy identifier (char ',' *> sc))
      <*> braces (many stmt)
    <|> try (AssignStmt <$> identifier <*> (symbol "=" *> expr <* semi))
    <|> try (SendStmt <$> expr <*> (symbol "->" *> identifier <* semi))
    <|> ExprStmt <$> expr <* semi

Most of the statements start with keywords like if, var, function, etc, so our parsers look for the keywords first using the symbol parser. If none of the keyword-starting statements match then we try9 matching for assignments, channel send, and expression statements, in that order. The stmt parser uses the expr parser to parse expressions within statements. It also uses the combinators many, optional, try and sepBy from Megaparsec.

Finally, we put everything together to create the parser for Co:

program :: Parser Program
program = sc *> many stmt <* eof

The Co parser matches multiple top-level statements, starting with optional whitespace and ending with end-of-file.

It’s demo time. In GHCi, we parse a file and pretty-print the AST:

> readFile "coroutines.co" >>= pPrint . runParser program
Right
  [ FunctionStmt "printNums"
    [ "start", "end" ]
    [ VarStmt "i"
      ( Variable "start" )
    , WhileStmt
      ( Binary LessThan
        ( Variable "i" )
        ( Binary Plus ( Variable "end" ) ( LNum 1 ) ) )
      [ ExprStmt
        ( Call "print" [ Variable "i" ] )
      , YieldStmt
      , AssignStmt "i"
        ( Binary Plus ( Variable "i" ) ( LNum 1 ) ) ] ]
  , SpawnStmt
    ( Call "printNums" [ LNum 1, LNum 4 ] )
  , ExprStmt
    ( Call "printNums" [ LNum 11, LNum 15 ] ) ]

I’ve gone ahead and created a side-by-side view of the source code and corresponding AST for all three input files, aligned neatly for your pleasure of comparison. If you so wish, feast your eyes on them and check for yourself that everything works correctly.

Source code and AST for fib.co
// Fibonacci numbers 
// using a while loop
var a = 0;
var b = 1;
var j = 0;
while (j < 6) {
  print(a);
  var temp = a;
  a = b;
  b = temp + b;
  j = j + 1;
}

// Fibonacci numbers 
// using recursive function call
function fib(n, f) {
  if (n < 2) {
    return n;
  }
  return f(n - 2, f) 
    + f(n - 1, f);
}

var i = 0;
while (i < 6) {
  print(fib(i, fib));
  i = i + 1;
}


[ VarStmt "a" ( LNum 0 )
, VarStmt "b" ( LNum 1 )
, VarStmt "j" ( LNum 0 )
, WhileStmt ( Binary LessThan ( Variable "j" ) ( LNum 6 ) )
  [ ExprStmt ( Call "print" [ Variable "a" ] )
  , VarStmt "temp" ( Variable "a" )
  , AssignStmt "a" ( Variable "b" )
  , AssignStmt "b" ( Binary Plus ( Variable "temp" ) ( Variable "b" ) )
  , AssignStmt "j" ( Binary Plus ( Variable "j" ) ( LNum 1 ) )
  ]



, FunctionStmt "fib" [ "n", "f" ]
  [ IfStmt ( Binary LessThan ( Variable "n" ) ( LNum 2 ) )
    [ ReturnStmt ( Just ( Variable "n" ) )
    ]
  , ReturnStmt ( Just ( Binary Plus ( Call "f" [ Binary Minus ( Variable "n" ) ( LNum 2 ), Variable "f" ] )
                                    ( Call "f" [ Binary Minus ( Variable "n" ) ( LNum 1 ), Variable "f" ] ) ) )
  ]

, VarStmt "i" ( LNum 0 )
, WhileStmt ( Binary LessThan ( Variable "i" ) ( LNum 6 ) )
  [ ExprStmt ( Call "print" [ Call "fib" [ Variable "i", Variable "fib" ] ] )
  , AssignStmt "i" ( Binary Plus ( Variable "i" ) ( LNum 1 ) )
  ] ]
Source code and AST for coroutines.co
function printNums(start, end) {
  var i = start;
  while (i < end + 1) {
    print(i);
    yield;
    i = i + 1;
  }
}

spawn printNums(1, 4);
printNums(11, 15);
[ FunctionStmt "printNums" [ "start", "end" ]
  [ VarStmt "i" ( Variable "start" )
  , WhileStmt ( Binary LessThan ( Variable "i" ) ( Binary Plus ( Variable "end" ) ( LNum 1 ) )
    [ ExprStmt ( Call "print" [ Variable "i" ] )
    , YieldStmt
    , AssignStmt "i" ( Binary Plus ( Variable "i" ) ( LNum 1 ) )
    ]
  ]

, SpawnStmt ( Call "printNums" [ LNum 1, LNum 4 ] )
, ExprStmt ( Call "printNums" [ LNum 11, LNum 15 ] ) ]
Source code and AST for ping.co
var chan = newChannel();

function player(name) {
  while (true) {
    var n = <- chan;
    if (n == "done") {
      print(name + " done");
      return;
    }
    
    print(name + " " + n);
    if (n > 0) {
      n - 1 -> chan;
    }
    if (n == 0) {
      print(name + " done");
      "done" -> chan;
      return;
    }
  }
}

spawn player("ping");
spawn player("pong");
10 -> chan;
[ VarStmt "chan" ( Call "newChannel" [] )

, FunctionStmt "player" [ "name" ]
  [ WhileStmt ( LBool True )
    [ VarStmt "n" ( Receive ( Variable "chan" ) )
    , IfStmt ( Binary Equals ( Variable "n" ) ( LStr "done" ) )
      [ ExprStmt ( Call "print" [ Binary Plus ( Variable "name" ) ( LStr " done" ) ] )
      , ReturnStmt Nothing
      ]

    , ExprStmt ( Call "print" [ Binary Plus ( Binary Plus ( Variable "name" ) ( LStr " " ) ) ( Variable "n" ) ] )
    , IfStmt ( Binary GreaterThan ( Variable "n" ) ( LNum 0 ) )
      [ SendStmt ( Binary Minus ( Variable "n" ) ( LNum 1 ) ) "chan"
      ]
    , IfStmt ( Binary Equals ( Variable "n" ) ( LNum 0 ) )
      [ ExprStmt ( Call "print" [ Binary Plus ( Variable "name" ) ( LStr " done" ) ] )
      , SendStmt ( LStr "done" ) "chan"
      , ReturnStmt Nothing
      ]
    ]
  ]

, SpawnStmt ( Call "player" [ LStr "ping" ] )
, SpawnStmt ( Call "player" [ LStr "pong" ] )
, SendStmt ( LNum 10 ) "chan" ]

That’s all for now. In exactly 102 lines of code, we have implemented the complete parser for Co. In the next part, we’ll implement a tree-walking interpreter for the Co AST that supports the basic features.

The full code for the parser can be seen here. You can discuss this post on lobsters, r/haskell, discourse, twitter or in the comments below.

Acknowledgements

Many thanks to Steven Deobald for reviewing a draft of this article.

Bartel, Joe. Non-Preemptive Multitasking.” The Computer Journal, no. 30 (May 2011): 37–38, 28. http://cini.classiccmp.org/pdf/HT68K/HT68K%20TCJ30p37.pdf.
Hoare, C A R. Communicating Sequential Processes. Prentice Hall, 1986. https://doi.org/10.1145/359576.359585.
Hutton, Graham. “Higher-Order Functions for Parsing.” Journal of Functional Programming 2, no. 3 (1992): 323–43. https://doi.org/10.1017/S0956796800000411.
Knuth, Donald E. “Coroutines.” In The Art of Computer Programming: Volume 1: Fundamental Algorithms, 3rd ed., 193–200. Addison Wesley, 1997.
Watson, Des. “Approaches to Syntax Analysis.” In A Practical Approach to Compiler Construction, 75–93. Springer International Publishing, 2017. https://doi.org/10.1007/978-3-319-52789-5_4.

  1. Coroutines are often conflated with Fibers. Some implementations of coroutines call themselves fibers instead. Quoting Wikipedia:

    Fibers describe essentially the same concept as coroutines. The distinction, if there is any, is that coroutines are a language-level construct, a form of control flow, while fibers are a systems-level construct, viewed as threads that happen to not run in parallel. It is contentious which of the two concepts has priority: fibers may be viewed as an implementation of coroutines, or as a substrate on which to implement coroutines.

    ↩︎
  2. Green threads are another popular solution for lightweight concurrency. Unlike coroutines, green threads are preemtable. Unlike Threads, they are scheduled by the language runtime instead of the operating system.↩︎

  3. In the interest of simplicity of implementation, Co does not support recursive functions. However, it does support first-class functions. So for the fib function to call itself, we pass it itself as an argument.↩︎

  4. The complete code for the interpreter is here.↩︎

  5. We are using:

    ↩︎
  6. I explore parsing in more depth by writing a JSON parser from scratch in Haskell in one of my previous posts.↩︎

  7. I followed this (outdated) Megaparsec tutorial for writing a parser for an imperative language. This tutorial is also helpful in learning the intricacies of Megaparsec.↩︎

  8. In the parsing parlance this is called Backtracking. The alternative is to do a Lookahead@12.↩︎

  9. Notice how some of the statement type parsers have try in front of them while most don’t. The try parser combinator backtracks to the start of its input when its argument parser fails, but so does the <|> operator. What’s the difference then?

    The difference is that for parsers, the <|> operator can work only with a known look ahead, while the try combinator can work with arbitrary look ahead. Since the symbol parser knows the exact symbol to look for, it requires a known look ahead and, hence, works with <|>. However, the identifier and expr parsers don’t know their look aheads beforehand and, hence, need to be wrapped with try.

    However, as commented by Gilmi, this kind of use of try may cause confusing error messages. We use it here only for the sake of simplicity.↩︎

Posted by

Like this post? Subscribe to get future posts by email.

Twitter

Like or Retweet this post on Twitter
Cancel Reply

Got suggestions, corrections, or thoughts? Post a comment!

Markdown is allowed
Email is used just to show an avatar image and is not displayed or stored
Comments are moderated. They will appear below after they are approved.

7 comments

HuwCampbell

This is great. Cheers.

noodlesteak

Hey really looking forward to follow your posts on my own. Just finished learning and implementing parser combinators in Java and Rust, and as a big fan of FP and Elixir I always wanted to make such a language, plus you do it in Haskell which I planned to get into as well ! You came right at the right moment for me thank you :)

Thanks 😊. I hope this is useful.

funny_falcon

Historical remark: lightweight thread concurrency is a very old concept. Stackless Python - is very young for example. Simula had actors, Scheme were born in the study of CSP, even C++ in the first versions (when it were called “C with classes”) had coroutines.

Netherless, interesting article.

gilmi

A very interesting post. Thank you!

I have one question and one comment:

Question: I don’t understand how the ping pong example works. You mention that using coroutines a process needs to yield the ToC, but I don’t see any yielding in the program. What’s going on?

Comment: I think this use of try will make some of the reported errors cryptic and inaccurate. Because if for example if there is a parse error in the third argument in a function call, the entire function call expression is given up on and instead the parser will try the identifier route.

Instead, you want to do left factoring: parse the common prefixes together and then choose between the function call route (which starts with parens) and simply returning the identifier. try a <|> b considered harmful talks about this topic in length.

Thanks a lot for the kind words :slight_smile:. I learned a lot about this topic from your Streama screencasts.

This is the first part of a three-part series. I’ll go over how coroutines and channels work in the third part. But in short, sending and receiving over channels implicitly yields.

I agree. I wrote it this way to keep things simple. I didn’t want to add extra mental overhead for the readers.

gilmi

I see! I’ll keep an eye for the next posts :slight_smile:

Keeping it simple is good. I would suggest mentioning the article above just in case someone runs into issues.

I think concurrency in PLT is very interesting and important and there aren’t a lot of non academic articles about this topic, so it’s great that you are writing about it! The article and examples are pretty clear to me overall. Good job!

And I’m glad to hear my streams were helpful :slight_smile:

12 Mentions

2 Reposts abhin4vनिर्भीक चौहान