Although the article doesn’t require any mathematical background, it’ll be useful to have high-level understanding of how compilers work. The article is inspired by the book “Types and Programming Languages ”.
The source code of the implementation can be found at myGitHub profile.
The syntax of the language is going to be quite simple.
Our programs will belong to the language defined by the following grammar:
The abstraction nonterminal allows us to declare functions with only one argument of type
(T) ******************** and an expressionas a body. The syntax of the abstractions involves the unicode symbols
→and. In case you have a Greek keyboard layoutλwill be easy to type, however, in order to quickly type
→
You will need some kind of a keyboard shortcut. In other words, the syntax of our language is not very ergonomic.
Note that there's no syntax construct for expressing negative numbers and on top of thatsucc
andpred
produce natural numbers. In order to be consistent with the grammar the result ofpred 0will be(0) . **********
Regarding the type system, we will have two primitive types:
Notice that in contrast to other statically typed programming languages, such as Haskell or Elm, we do not have a syntactical construct for declaring the type of a function. In Haskell the semantics of the annotationT1 → T2is a function which accepts one argument of type (T1) *********************** and returns result of typeT2. In our grammar, we do not have this explicit type annotation because we’re going to apply type inference in order to guess a function’s type based on the type of its argument and body.
Now after we’re familiar with the syntax, we can try to write some code. In the textarea below enter your program and click on the button “Evaluate” to see the result:
Result:**************************
Before going any further, lets show a few sample programs belonging to the programming language that we’re going to develop.
(λ a: Nat → succ succ a) 0.
λ f: Nat → (λ g: Nat → f) 0 ) (succ 0)In the example above we declare an anonymous function which accepts a single argument called
aof type
NatIn the body of the function we apply twice the predefined functionsucctoa. The anonymous function we apply to0
. This is going to produce2
as final result.The equivalent of this program in JavaScript will be:
Wheresucc
is defined as:
A more complicated program in our programming language will look like this:
) in the expression(λ f: Nat → (λ g: Nat → f) 0) and getLets explain the evaluation step by step:
Reduce the expression with left side
(λ f: Nat → (λ g: Nat → f) 0)and right side(succ 0). λ f: Nat → (λ g: Nat → f) 0 ) (succ 0)
Substitute
gwith((apply
beta reduction
(λ f: Nat → f) (*********************. λ f: Nat → f ) (succ 0)
Increment
0in the expressionsucc 0and pass the result to(λ f: Nat → f).
Return result.
Formally, we can show the small -step semantics of our language in the following way:
Small-step semantics
In this section we'll take a look at thesmall-step semantics
of the language.
Lets suppose
σis the current state of our program, which keeps what value each individual variable has.Variables1) ───────────── (x, σ) → σ (x)
σ (********************. ********** Built-in functions (***********************************(1)
simply means that for variablexWe return its value kept in the stateThe built-in functions in our language aresucc (*********************,predandiszero [1] ****************. Here's the small-step semantics ofsucc and
(1)
simply means that if expressiont1 (**********************, passed tosucc (evaluates to) ****************** (t2) ***********************, succ************ (evaluates to) ******************** (succ t2) ***********************. ********************** () we define that the result of
pred 0 (equals) ****************** (0) *********************** in order to keep the language consistent and disallow negative numbers not only syntactically but also as result of evaluation.
We're going to defineiszerothe following way:
1) iszero 0 → true 2) iszero succ v → false 3) t1 → t2 ──────────────────── iszero t1 → iszero t2
This means that iszero
************ (applied to(0) *********************** returnstrue () (by
) 1)). The result of the evaluation ofiszero
applied to any other number greater than
(0) will equal
false
(by) 2)). Ift1evaluates to
t2, theniszero t1equals the result of the evaluation
, and apply it tov, we need to substitute all the occurrences ofiszero t2(by3)). Conditional expressions (**********************************
Lets take a look at the small-step semantics for the conditional expressions:
1) if true then t2 else t3 → t2 if false then t2 else t3 → t3If the condition of the conditional expression is
2) t1 → v ───────────────────────────────────────────── if t1 then t2 else t3 → if v then t2 else t3truethen we return the expression from thethenpart, otherwise, we return the one from the
elsepart.
If given expression
t1 (evaluates to) **************************** and this expression is passed as condition of the conditional expression, the evaluation of the conditional expression is equivalent to the evaluation of the conditional expression with (v) ******************** (passed as condition.) Abstraction & Application (**********************************In this section we’ll explain the function evaluation.
1) (λ x: T → t) v → {x → v} t 2) t1 → v ──────────── t1 t2 → v t2 3) t2 → v ──────────── v1 t2 → t1 v(1)
means that if we have the abstraction(λ x: T → t) (*********************,(T) ********************* is the type of
x
xin
2)and3)are quite similar. In1)twith
v.In2)
********** (if
t1evaluates to,t1 t2evaluates toapplied tot2. **********
1) t: Nat ───────────── succ t: Nat 2) t: Nat ───────────── pred t: Nat 3) t: Nat ─────────────── iszero t: Bool 4) t1: Bool, t2: T, t3: T ─────────────────────────── if t1 then t2 else t3: T 5) (λ x: T → y): T → Y, u: T, y: Y ──────────────────────────────── (λ x: T → y) u: Y 6) t1: T → Y, t2: T ────────────────── t1 t2: YThe semantics of3)************ (is that ift2 (evaluates to) ************** (v) ***********************,v1 t2 (evaluates to) ****************** (t1) *********************** (applied to
v) ***********************. ************Although the small-step semantics laws above are quite descriptive and by using them we already can build an evaluator for our programming language, we still can construct some ridiculous programs. For instance the following program is a valid according to the grammar from the “Syntax”section above but is semantically incorrect:
The condition of the conditional expression is expected to be of type boolean, however, we pass a natural number. Another problem with the snippet is that the result of both branches of the expression should return result with the same type but this is not the case in our example.
In order to handle such invalid programs we can introduce a mechanism of program verification through (type checking****************. This way, we will assign types to the individual constructs andas part of the compilation process, verify if the program is valid according to the “contract signed” with the type annotations.
Notice that the (type checking will be performed compile-time. The alternative is to perform runtime type checking, which will not prevent us from distributing invalid programs.
An interesting question is if the type system is (sound and complete) . If the system is sound it should prevent false negatives(ie should reject all invalid programs), however, it may also reject valid programs. In order to make sure the type systemprevents false positives, it should also be complete.
Type Rules
Lets define that:
true: Bool false: Bool n: Nat, n ℕ ℕ
Based on the types of our terminals, lets declare the type rules for
(succ) ((pred) *********************,iszero and the conditional expression:(1)
,and2)we declare that if we have an expressiontof type(Nat
, then bothpred tandsucc twill be of type
Nat. On the other hand,iszeroaccepts an argument of typeNatand results of typeBool. **********states that the condition of the conditional expression should be of type(Bool and the expressions in thethenand(branches should be the of the same typeT, where we can think of (T) ******************** as a generic type (placeholder which can be filled with any type, for instance (Bool) ******************** (or) ****************** (Nat, even (Nat → Bool)
Rule5)states that if a function has typeT → Y and (is applied to argument(u) ******************** (of type) ****************** (T) then the result of the evaluation will be of type
Y.
Finally,6)
************ states that if we havet1 (of type) ****************** (T → Y) ******************** (andt2 (of type) ************** (T) ********************* (thent1 t2 (will be of type) ****************. **********
It’s interesting to see how the type rulesdoes not explain how to evaluate the individual expressions but only define a set of rules which the expressions should hold.
********
Now from the formal definition of our programming language lets move to its actual implementation. In this section we will explain how the compiler’s implementation works. Here are the high-level steps of execution:
****************************************************************************** const (******************************** program**********************************************=(******************************** readFileSync********************************************** (******************************************** (fileName) ********************** (**************************************, () ****************************************** (encoding) ********************** (******************************************: utf-8 ‘}); const**************************************** (ast)
=
(parse)(
(program));
const****************************************** (diagnostics)=
(Check)(
(ast)).
******************************************** diagnostics; if****************************************** ( (diagnostics) **********************************************length
){
console****************************************(error) ********************** (red
(diagnostics
********************************************.join
' n'
));process
****************************************(exitif****************************************** ( (compile) ********************** (compile) )************************** (1
);
{
result******************************************== (CompileJS) ************************************************** (ast
); console****************************************
(log************************** (result
);****************************************** (else){
console****************************************(log************************** (Eval
**************************** (ast
));In the pseudo code above, we can notice that the program’s compilation & interpretation happen the following way:
Read the file containing our program.
Parse the source code and get an********************************** (abstract syntax tree) .
Perform type checking with the****** (Check) ******************** (function.) In case the type checker has found type errors, report the diagnostics.
algorithm for producing an (A) ************************************** (bstract) *************************************** (S) yntax (T) ******************************************* (AST).Either compile to JavaScript or evaluate the program.
It's lovely to see a functional level. of cohesion of the individual components! In the last next four sections we’ll explain steps 2-5.
Lexer & Parser
The implementation of a lexer and parser for this small language will be quite simple. We can use traditionalrecursive descent parsing
For diversity, lets generate the modules for lexical and syntax analysis by using (Peg.js
grammar.)
Here’s the grammar itself:
****************************************************************************** (Program) ************************=************** ()
_ (a) ********************** (********************************************: (Application) ************************ ()********************************************** ')'?************************** (_){ return**************************************** (a);
Application******************************************== (l:ExprAbs
(r):
**************************************** (Application)*
************************************** ({***********************) if****************************************** (!
************************** (r)||
****************************!r******************************.length{ return**************************************** (l);
****************************************** (else){
r******************************************== (r************************pop
(); return****************************************** ({***********************) (type:a===b). ************ Type Checking Implementation'application'
(**********************************************,left
(**********************************************: (l****************************************, ************************ (right) **********************************************************: ************************ (r) ********************************************}; ; Abstraction******************************************== (_'('
**********************: Type*
****************************************** (_)'λ'
******************************************** (_) ************************ (id) **********************************************: ************************ (Identifier) **********************':' (********************************** (t) **********************'→'f
****************************:Application******************************************** _________________')'********************************************? ********************** _ ({************************) return****************************************** ({***********************) (type:'abstraction'
(**********************************************,arg(**********************************************:
********************* (id) }, (body){
**************************************** (type************ (**********************************************:******************************************** (t) *************************, (******************************************* (id) ********************** () **********************************************: () **********************:
(f)};
Expr******************************************== (IfThen/IsZeroCheck
/ArithmeticOperation /
ArithmeticOperation******************************************== (o:Zero
/True
/ (************************************ (False) / (********************************** (Identifier) **********************/ (****************************************** ParanExpressionOperator
(e):
**************************************** (Application){
return****************************************** ({***********************) (type:6))'arithmetic'
(**********************************************,operator
****************************************:o
****************************************, ************************ (expression**********************************************: ************************e******************************};; Operator******************************************== Succ /Pred
IfThen******************************************== (If************************** (expr):
(Application)Then
(then:
**************************************** (Application)Else
**************************************** (el************ (**********************************************: (Application) *************************{
return****************************************** ({***********************) (type:****************************************************************************** (abstraction) ************************=************** ()'conditional_expression'
(**********************************************,condition
******************************************:expr****************************************, ************************
(**********************************************: (***************************************** (el(**********************************************}; (********************************************** Type******************************************== (Nat) **********************/then
**********************************************: ************************ (then(**************************************, (elBool
_******************************************==[trn]
* __******************************************==[trn]
( ) Identifier******************************************==!
************************** (ReservedWord)l (: ExprAbs) and the right one****************** (:), after that, in case of a match, we return an object, which will be used as an AST node, with type (abstraction) ****************, leftand (right) ******************** (branches.)_
(id):
[a-z]******************************************** _ (******************************** { return****************************************** ({***********************) (name) **********************:id
(**********************************************.join
(''
, (********************************************** (type**********************************************: ************************ ('identifier') ************************; ParanExpression******************************************== (_'('
/_
******************************************** (expr):
**************************************** (Expr)_
')' (****************************** ()************************************ ({************************* return****************************************** (expr);
ReservedWord******************************************== (IfThen
/Else
/Pred
/Succ
/Nat
/Bool /IsZero
/False
Succ******************************************== (_'succ'
******************************************== (_ 'Nat'_
****************************************** ({***********************) return****************************************** 'succ';
Nat_
****************************************** ({***********************) return****************************************** 'Nat';
(**********************************************. ...)For simplicity I’ve omitted some implementation details. The entire grammar can be foundhere.
We’ll take a look at only a few grammar rules. If you're interested in the entire syntax of Peg.js, you can take a look at the official (documentation) ************ and experiment on theonline playground
If you're interested into some theoretical background, I'd recommend you to find a read for (BNF formLets take a look at two interesting nonterminals from the grammar:
****************************************************************************** (Application) ************************=(l) ************************ (********************************: (ExprAbs) ************************ (r(**********************************************: (Application) ********************** (****************************************** { if
****************************************** (!
************************** (r)||
****************************!r******************************.length{ return**************************************** (l);
****************************************** (else){
r******************************************== (r************************pop
(); return****************************************** ({***********************) (type:
section above can be expressed with this Peg rule. The semantics behind the rule is that, we can have one expression or abstraction followed by0or more other applications.'application'
(**********************************************,left
(**********************************************: (l****************************************, ************************ (right) **********************************************************: ************************ (r) ********************************************}; ;The application nonterminal (tt) from the“Syntax”
We name the left nonterminal
Another interesting nonterminal is the abstraction:
**************************************** _ ('λ') ************************ ()************************************ (id) ********************************************: Identifier ':'****************************************** (t) (**********************************************: (Type) ************************ (→ →)******************************************** (f) ************************ (********************************************: (Application) _ (')' (********************************************?****************************************** (_) ******************************************** {{**********************) (return) ******************************************** {{**********************) type(**********************************************:'abstraction'(********************************************,arg(**********************************************:{************************************** (type(**********************************************: t (********************************, (****************************************** (id) ********************** () **********************************************: (******************************************** (id ,*********************** (body): (f)};****************************************************************************** const (******************************* (Types) ************************************************************=(****************************** {{********************** Natural****************************************We can see that it starts with an optional opening parenthesis, after that we have the lambda (λsymbol, followed by the parameter declaration and its type. After that, we have the hard to type arrow (
→) and the abstraction's body, which we try to match against the
(Application rule from above.For instance, the AST of the program:(λ a: Bool → succ 0) iszero 0, after parsing, will look like:
The root node is the application nonterminal (t1 t2
, where
t1equals the abstraction (ie): Bool → succ 0andt2the expressioniszero 0.
Developing the Type Checker
Now lets take a look at the type checker. First, lets declare the set of primitive types:
Nat (**************************************************** (rule'Nat'
, Boolean****************************************'' Bool '' ;and returns a function of type(Function Type)
We'll express the function type using an array of types, for instance
['Bool', 'Nat']is a function which accepts an argument of type
Bool
and returnsNumFrom the following example, we can see that our language supports high-order functions: (λ a: Nat → λ b: Nat → succ a) 0 succ 0
The outermost function has type
Nat → (Nat → Nat) (*********************, which means that it accepts an argument of typeNat
Nat → Nat
. (Type Checking Algorithm and Type Inference)The algorithm for performing type checking will traverse the AST and verify if each individual node has correct type according to the type rules from section“Type System”. Generally speaking, the algorithm will be just a JavaScript translation of the definitions in the “Type Rules”
section from above.
Here are the basic rules that we will implement:
Check if the condition of a conditional expression is of type
(Bool) ***********************. In order to do that, we need to invoke the algorithm recursively and find out the type of the expression passed as condition of the conditional expression (rule [i] ****************** (4).
Check if both the branches of conditional expression have the same type. Here we need to recursively find the types of both sub-expressions and compare them
(rule4)) (****************************************************************.
Check if argument passed to a function is of the correct type. In this case we need to find the type of the expression passed as argument to the function and check if it matches with the declaration of the function (rule (5)). In order to find the type of the function we need to find the type of its body. This is called type inference
Check if the arguments of the built-in functions are of the correct type. The procedure is quite similar to 3. (rules1),and
3)********************************************************Check if the types of the left and right side of an application match. We need to find the types of both terms recursively, just like for all other cases above. For instance, if we have function of type Nat → Bool, we can only apply it to an argument of type
From above we can see that each point includes the word “check”, so it looks like, an important part of the type checking algorithm is the type comparison. Lets peek at its implementation: (Comparing Types)
bare primitive types, we just compare them by their value (In order to compare two types and see if they are equivalent, we can use the following function:
****************************************************************************** const (******************************** (typeEq************************************=(******************************** (**************************************** a**************** (********************************************, (b)************************************************=>{ if****************************************** ( (a) ********************** (a) ****************************************** (instanceof) Array
&&b
(instanceof)Array
){
if****************************************** ( (a) ********************** (a) ****************************************length
(**********************************************!==b
(**********************************************.length)
************************ (i){
return****************************************** (false);
****************************************** (else){
for****************************************** ( (let=
(0);
(i)************************************** (a)
({ return****************************************** (false)**************************************** (length) ************************
if****************************************** (;
{i
=
1!
************************** (typeEq)(
(a)[i],
******************************************** (b) ************************* ([i]));
return****************************************** (true);
****************************************** (else){
if****************************************** ( (typeof************************ (a)===
'string'&&**************************************** (typeof************ (b) **********************************************************==='string'
) { return**************************************** (a)===
************************** (b);
return****************************************** (false);
;The function first checks if both types are types of a function (i.e. have more than one type they are composed of). If that's the case, we compare the types they are composed of one by one by invoking thetypeEqfunction recursively. Otherwise, in case
a
and
Now we’re ready to take a look at the actual implementation of our type checker:
****************************************************************************** const (******************************** (Check**********************************************=(******************************** (****************************************** (ast
**************** (********************************************, diagnostics************************************************=>{ diagnostics******************************************== (diagnostics) ********************************************** ()[]; / / By definition empty AST is correct****************************************) if (! ************************* (ast)) {{**********************) return****************************************** ({***********************) diagnostics ; (**********************************************. .. / / We get the type of identifier from the symbol table****************************************) } (else)if************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** (Identifier) **********************({ return****************************************** ({***********************) diagnostics, type****************************************(SymbolTable) ********************** (SymbolTable) ************************lookup
**************************** (ast
(**********************************************.name
) ; / / if-then-else block is correct if:****************************************) // - The condition is of type Boolean.****************************************) // - Then and else are of the same type.****************************************) } (else)if************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** Condition******************************************* ({ if****************************************** (!
************************ (ast).
(then)||
********************************************!ast********************************.el************************************************ (******************************!ast
(**********************************************.condition
){
diagnostics****************************************(push************************** ('No condition of a conditional expression'
; return****************************************** ({***********************) diagnostics ;
const**************************************** (c)=
(Check)(
(ast).
****************************************** (condition));
diagnostics******************************************== (diagnostics) **********************************************concat
(c
(**********************************************.diagnostics
); const****************************************** (conditionType)=
(c).
(type);
if****************************************** (!
************************** (typeEq)(
(conditionType),
**************************************** (Types).
**************************************** (Boolean)))
****************************************** ({***********************) diagnostics****************************************(push************************** ('Incorrect type of the condition of a conditional expression'
); return****************************************** ({***********************) diagnostics ; const****************************************** (thenBranch)=
(Check)(
(ast).
******************************************** (then);
diagnostics******************************************== (diagnostics) **********************************************concat
(thenBranch
**********************************************.diagnostics); const****************************************** thenBranchType
************************** (=
(thenBranch).
(type);
const****************************************** (elseBranch)=
(Check)(
(ast).
****************************************** (el));
diagnostics******************************************== (diagnostics) **********************************************concat
(elseBranch
********************************************.diagnostics); const****************************************** elseBranchType
************************** (=
(elseBranch).
(type);
if****************************************** ( (typeEqthenBranchType,
return****************************************** (thenBranch)elseBranchType
)){;
****************************************** (else){
diagnostics****************************************(push'Incorrect type of then / else branches'); return****************************************** ({***********************) diagnostics ;
/ / The type of:****************************************) // e1: T1, e2: T2, e1 e2: T1****************************************) } (else)if************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** (Application) ********************************************************* () ({) *********************** const**************************************** (l)=
(Check)(
(ast).
****************************************** (left));
const****************************************** (leftType)=
(l).
(type)||
********************************************************************** diagnostics******************************************== (diagnostics) **********************************************concat
(l
(**********************************************.diagnostics
); const**************************************** (r)=
(Check)(
(ast).
****************************************** (right));
const****************************************** (rightType)=
(r).
(type)||
********************************************************************** diagnostics******************************************== (diagnostics) **********************************************concat
(r
(**********************************************.diagnostics
); if****************************************** ( leftType************************length
){
if****************************************** (!
************************ (ast). We pass thediagnostics(array to each invocation of theCheckfunction. When we find a type error, we push a string representing a human readable message corresponding to the error in the array. This way, in the end of the invocation we have the list of all type errors found by the type checking algrithm..
(right)||
******************************************** leftType[0]***********************************************===(****************************************** (rightType) { return****************************************** ({***********************) diagnostics, type****************************************leftType************************** [1] ; ****************************************** (else){
diagnostics****************************************(push************************** ()'Incorrect type of application'
; return****************************************** ({***********************) diagnostics ;
****************************************** (else){
return****************************************** ({***********************) (diagnostics) }; return****************************************** ({***********************) (diagnostics) }; ;I have stripped some of the code since it’s not crucial for our purpose. If you're interested in the complete implementation, you can find it (here
An interesting thing to notice is the continuation
If we go back to the entire compiler’s implementation:
****************************************************************************** ... const**************************************** (ast)
=
(parse)(
(program));
const****************************************** (diagnostics)=
(Check)(
(ast)).
******************************************** diagnostics; if****************************************** ( (diagnostics) **********************************************length
){
console****************************************(error) ********************** (red
(diagnostics
********************************************.join
' n'
));process
****************************************(exit(**********************************************. ..************************** ('1)'
********************************;In case the program that we want to type check is the following:
(λ a: Nat → succ succ 0) iszero. trueThe diagnostics that the compiler will produce will be as follows:
~ / Projects / typed-calcmasternode index.js demo / incorrect1.lambda
Evaluating "demo / incorrect1.lambda". Incorrect type of IsZero Incorrect type of application(Developing the Interpreter)Once the compiler goes through the phase of type checking there are a few options:
Perform AST transformations in order to produce equivalent but more efficient AST for the purposes of the compiler's (back end) .Skip the transformation phase and directly perform either code generation or evaluation. (**********************************************************************
In this section we'll take a look at the interpreter which is going to evaluate our programs based on the AST produced by the parser (syntax analyzer).
Here's its entire implementation:
****************************************************************************** const (******************************** (Eval) ************************************************=ast==
{ / / The empty program evaluates to null.****************************************) if (!
************************* (ast)) {{**********************) return****************************************** (null);
/ / The literals evaluate to their values.****************************************) if (ast**************************type****************************===ASTNodes********************************************.Literal (******************************** { return**************************************** (ast) **************************************** (value);
/ / The variables evaluate to the values****************************************) // that are bound to them in the SymbolTable.****************************************) } (else)if************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** (Identifier) **********************({ return****************************************** (SymbolTable) **************************************** (lookup)(
(ast).
****************************************** (name));
/ / if-then-else evaluates to the expression of the****************************************) // then clause if the condition is true, otherwise****************************************) // to the value of the else clause. (********************************} (****************************** (else) ****************************************************** (if) *********************** ()************************************ (ast) *************************************************************** (type) ***********************===ASTNodes ***********************. ****************** Condition)******************************************** {{**********************) if****************************************** ( (Eval) ************************************************ (ast
(**********************************************.condition
)){ return**************************************** (Eval)(
(ast).
(then));
****************************************** (else){
return**************************************** (Eval)(
(ast).
(el));
/ / The abstraction creates a new context of execution****************************************) // and registers it's argument in the SymbolTable.****************************************) } (else)if************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** (Abstraction) *********************************************** () ({) *********************** const****************************************** (scope)=
(new)Scope
(); return****************************************** (x)=>
{{************************) scope****************************************(add************************** (ast
(**********************************************.arg
(**********************************************.id
(**********************************************.name
****************************************, ************************ (x)); SymbolTable****************************************(push************************** (scope
); return**************************************** (Eval)
/ / IsZero checks if the evaluated value of its****************************************) // expression equals `0`.****************************************) } (else)if(
(ast).
(body));
;************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** (IsZero) ********************************************************** ({**********************) return**************************************** (Eval)(
(ast).
(expression))
********************************************===(0) ********************; / / The arithmetic operations manipulate the value****************************************
) // of their corresponding expressions:****************************************) // - `succ` adds 1.****************************************) // - `pred` subtracts 1.****************************************) } (else)if************************** (ast****************************.type******************************===(ASTNodes) ************************************************. **********************Arithmetic ({ const**************************************** (op)
=
(ast).
(operator);
const**************************************** (val)=
(Eval)(
(ast).
******************************************** (expression));
switch****************************************** ( (op){
case****************************************** 'succ':
return**************************************** (val) ( (1);
case****************************************** 'pred':
return****************************************** ( val-1
>=
0?************************************** (val-******************************************** (1) ************************ (********************************:val(**********************************************; / / The application evaluates to:****************************************) // - Evaluation of the left expression.****************************************) // - Evaluation of the right expression.****************************************) // Application of the evaluation of the left expression over****************************************) // the evaluated right expression.****************************************) } (else)if************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** (Application) ********************************************************* () ({) *********************** const**************************************** (l)=
(Eval)(
(ast).
****************************************** (left));
const**************************************** (r)=
(Eval)(
(ast).
****************************************** (right));
return**************************************** (l)(
(r));
return****************************************** (true);
;The code is quite straightforward. It involves pre-order traversal of the produced AST and interpretation of the individual nodes. For instance, in case given node in the tree represents a conditional expression all we need to do is check its condition and return the result we get from the evaluation of its
****************************************************************************** if (******************************** (****************************************** (Eval) ************************ (****************************************** (astthenor
elsebranches, depending on the condition's value:
******************************************************** condition)******************************************* ({ return**************************************** (Eval) (
(ast).
(then));
****************************************** (else){
return**************************************** (Eval)(
(ast).
(el));
(Developing a Code Generatorwhich compile to JavaScript. Why not create another language?Here's a (list of the languages
In fact, the transpilation (code generation) is going to be quite straightforward as well. The entire implementation of our “to JavaScript compiler” is on less than 40 lines of code. The entire transpiler can be found here
.
Let’s take a look at how we are going to translate application, abstraction and conditional expressions to JavaScript:
****************************************************************************** const (******************************** (CompileJS) **********************************************************=ast
==
{ (**********************************************. .. / / Transpile a literal****************************************) if (ast
**************************type****************************===ASTNodes********************************************.Literal (******************************** { return**************************************** (ast) **************************************** (value);
/ / Transpile identifier****************************************) } (else)if************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** (Identifier) **********************({ return**************************************** (ast) **************************************** (name);
/ / Transpile a conditional expression****************************************) } (else)if************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** Condition******************************************* ({ const****************************************** (targetCondition)=
CompileJS(
(ast).
****************************************** (condition));
const****************************************** (targetThen)=
CompileJS(
(ast).
******************************************** (then);
const targetElse******************************************== (CompileJS) ************************************************** (ast
(**********************************************.el
); return****************************************** ()($ {************************** (targetCondition)}****************************************** ($ {)targetThen**************************}: ($ {(targetElse) ************************}**************************************** (n` (********************************; / / Transpile an abstraction****************************************) } (else)if
************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** (Abstraction) *********************************************** () ({) *********************** return****************************************** ()$ { (ast).
****************************************** (arg).
**************************************** (id)**************************************** (name************ (**********************************************}=>{****************************************** (return) ($ {
**************************** (CompileJS)( (ast).******************************************** (body))****************************************}}) `;
/ / Transpile an application****************************************) } (else)if************************** (ast****************************.type******************************===(ASTNodes) ************************************************. ********************** (Application) ********************************************************* () ({) *********************** const**************************************** (l)=
CompileJS(
(ast).
****************************************** (left));
const**************************************** (r)=
CompileJS(
(ast).
****************************************** (right));
return****************************************** ()($ {************************ (l)} ($ {**************************************** (r)}
)******************************************** (n`) ************************ (********************************; ****************************************** (else)
{
return****************************************** (); ;Notice that when the current node is a literal (ie
(0 (*********************,true (orfalse) We return its value. In case we transpile a conditional expression, we first transpile its condition, then itsthen (expression, right after that itselse expression and finally, the entire conditional expression itself. The result will be a JavaScript ternary operator.
Right after that is the source code for transpilation of a function (or abstraction, the way we call a function in our definitions above). The source code is quite straightforward, we declare the function’s argument based on the argument of our lambda and after that compile function’s body and place it as the return statement.
Finally, we transpile the application. For this purpose, we transpile the left sub-expression of the application which is supposed to be a function and apply it to the right hand side of the application.
Now if we transpile the following program to JavaScript:
λ x: Nat → (λ y: Nat → x) 0 ) succ ( λ f: Nat → (λ g: Nat → g) 0 0
We will get:
****************************************************************************** (******************************** x=={ return****************************************** ( (y=={ return****************************************** (x);
(
(0));
(
**************************************** (f)=>
{{************************) return****************************************** ( (g) ********************** (g) ******************************************=={ return**************************************** (g);
(
(0));
(
(0))
( )1
;The purpose of this article was to explain a “full-stack” process of design and development of a programming language. The explained language is an extension of thelambda calculus. On top of the primitives provided by the lambda calculus we added three built-in functions, natural numbers, boolean values, a syntax for conditional expressions and a type system. We provided a formal definition of these primitives in terms of small-step semantics and also defined a type system by using a series of type rules.
It's interesting how (straightforward) is the JavaScript implementation which we directly derived from the mathematical definitions. We just “translated the math” into source code. Of course, the process of designing the source code has it’s own portion of creativity involved but the general algorithm is equivalent.
Another interesting observation is how similar are the algorithms for interpretation and type checking. With the type checking algorithm we performed type inference which was quite similar to the actual source code evaluation. Although with our basic type system this similarity still can be noticed, in case of (dependent types) the boundary between evaluation and type checking becomes even blurrier.
Where to go from here? (*******************************(*****************
A natural direction for further development of the language will be to extend its syntax and type system. In this case, there should be put some effort in improving the language’s ergonomics. There are a few programming languages inspired by the lambda calculus and there are two main “syntactic camps” that they are in. For instance, the LISP-like languages can be distinguished by their intensive use of parentheses. On the other hand, the Haskell-like syntax often looks cleaner and has its own adoptions in languages like Elm.
Another direction for improvement is a better error reporting. With the grammar produced by Peg.js we can easy get the exact position in the program of the individual symbols. This will allow us to throw errors at specific locations which will make debugging much easier.
Will be happy to get your thoughts and observations from the article in the comments section below.
(******************************************************************************** () ********************************************************************************** (************************************************************************************ Read More****************************************************************************** (**********************************************************************************
GIPHY App Key not set. Please check settings