In part 8, we wrote some pretty interesting programs in our little language. We successfully expressed arithmetic and recursion. But there's one thing that we cannot express in our language without further changes: an if statement.
Suppose we did not want to add a special if/else expression into our language. Thanks to lazy evaluation, we can express it using a function:
```
defn if c t e = {
    case c of {
        True -> { t }
        False -> { e }
    }
}
```

But an issue still remains: so far, our compiler remains monomorphic. That is, a particular function can only have one possible type for each one of its arguments. With our current setup, something like this would not work:

```
(if (if True False True)) 3
```

This is because, for this to work, both of the following would need to hold (borrowing some of our notation from the typechecking post):

$$ \text{if} : \text{Bool} \rightarrow \text{Int} \rightarrow \text{Int} \rightarrow \text{Int} $$

$$ \text{if} : \text{Bool} \rightarrow \text{Bool} \rightarrow \text{Bool} \rightarrow \text{Bool} $$

But using our rules so far, such a thing is impossible, since there is no way for \(\text{Int}\) to be unified with \(\text{Bool}\). We need a more powerful set of rules to describe our program's types.

Hindley-Milner Type System

One such powerful set of rules is the Hindley-Milner type system, which we have previously alluded to. In fact, the rules we came up with were already very close to Hindley-Milner, with the exception of two: generalization and instantiation. It's been quite a while since the last time we worked on typechecking, so I'm going to present a table with these new rules, as well as all of the ones that we previously used. [note: The rules aren't quite the same as the ones we used earlier; note that \(\sigma\) is used in place of \(\tau\) in the first rule, for instance. These changes are slight, and we'll talk about how the rules work together below.]

We find the transitive closure [note: A transitive closure of a vertex in a graph is the list of all vertices reachable from that original vertex. Check out the Wikipedia page on this.] of the function dependencies. We define a function \(f\) to be dependent on another function \(g\) if \(f\) calls \(g\). The transitive closure will help us find functions that are related indirectly. For instance, if \(g\) also depends on \(h\), then the transitive closure of \(f\) will include \(h\), even if \(f\) does not directly use \(h\).
We isolate groups of mutually dependent functions. If \(f\) depends on \(g\) and \(g\) depends on \(f\), they are placed in one group. We then construct a dependency graph of these groups.
We compute a topological order of the group graph. This helps us typecheck the dependencies of functions before checking the functions themselves. In our specific case, this would ensure we check if first, and only then move on to testOne and testTwo. The order of typechecking within a group does not matter, as long as we generalize only after typechecking all functions in a group.
We typecheck the function groups, and the functions within them, following the above topological order.

To find the transitive closure of a graph, we can use Warshall's Algorithm. This algorithm, with complexity \(O(|V|^3)\), goes as follows:

$$
\begin{aligned}
& A, R^{(i)} \in \mathbb{B}^{n \times n} \\
& \\
& R^{(0)} \leftarrow A \\
& \textbf{for} \; k \leftarrow 1 \; \textbf{to} \; n \; \textbf{do} \\
& \quad \textbf{for} \; i \leftarrow 1 \; \textbf{to} \; n \; \textbf{do} \\
& \quad \quad \textbf{for} \; j \leftarrow 1 \; \textbf{to} \; n \; \textbf{do} \\
& \quad \quad \quad R^{(k)}[i,j] \leftarrow R^{(k-1)}[i,j] \; \textbf{or} \; R^{(k-1)}[i,k] \; \textbf{and} \; R^{(k-1)}[k,j] \\
& \textbf{return} \; R^{(n)}
\end{aligned}
$$

In the above notation, \(R^{(i)}\) is the \(i\)th matrix \(R\), and \(A\) is the adjacency matrix of the graph in question. All matrices in the algorithm are from \(\mathbb{B}^{n \times n}\), the set of \(n\) by \(n\) boolean matrices. Once this algorithm is complete, we get as output a transitive closure adjacency matrix \(R^{(n)}\). Mutually dependent functions will be pretty easy to isolate from this matrix. If \(R^{(n)}[i,j]\) and \(R^{(n)}[j,i]\), then the functions represented by vertices \(i\) and \(j\) depend on each other.
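As a quick, hedged sketch (not the final implementation we'll build below), the pseudocode above translates almost directly into C++. The names `matrix` and `transitive_closure` are my own choices for this illustration:

```cpp
#include <cstddef>
#include <vector>

// Boolean adjacency matrix: r[i][j] is true when function i calls function j.
using matrix = std::vector<std::vector<bool>>;

// Warshall's algorithm: O(|V|^3) transitive closure of the graph.
// Updating the matrix in place is equivalent to keeping the R^(k)
// snapshots from the pseudocode, since reachability only ever grows.
matrix transitive_closure(matrix r) {
    const std::size_t n = r.size();
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                // R[i,j] or (R[i,k] and R[k,j])
                r[i][j] = r[i][j] || (r[i][k] && r[k][j]);
    return r;
}
```

Once the closure is computed, vertices \(i\) and \(j\) are mutually dependent exactly when both `r[i][j]` and `r[j][i]` hold.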
Once we've identified the groups [note: There is actually a slight difference between "mutual dependence" the way we defined it and "being in the same group", and it lies in the symmetric property of an equivalence relation. We defined a function to depend on another function if it calls that other function. Then, a recursive function depends on itself, but a non-recursive function does not, and therefore does not satisfy the symmetric property. However, as far as we're concerned, a function should be in a group with itself even if it's not recursive. Thus, the real equivalence relation we use is "in the same group as", and consists of "mutual dependence" extended with symmetry.], it is time to compute the topological order. For this, we will use Kahn's Algorithm. The algorithm goes as follows:

$$
\begin{aligned}
& L \leftarrow \text{empty list} \\
& S \leftarrow \text{set of all nodes with no incoming edges} \\
& \\
& \textbf{while} \; S \; \text{is non-empty} \; \textbf{do} \\
& \quad \text{remove a node} \; n \; \text{from} \; S \\
& \quad \text{add} \; n \; \text{to the end of} \; L \\
& \quad \textbf{for each} \; \text{node} \; m \; \text{with edge} \; e \; \text{from} \; n \; \text{to} \; m \; \textbf{do} \\
& \quad \quad \text{remove edge} \; e \; \text{from the graph} \\
& \quad \quad \textbf{if} \; m \; \text{has no other incoming edges} \; \textbf{then} \\
& \quad \quad \quad \text{insert} \; m \; \text{into} \; S \\
& \\
& \textbf{if} \; \text{the graph has edges} \; \textbf{then} \\
& \quad \textbf{return} \; \text{error} \quad \textit{(graph has at least one cycle)} \\
& \textbf{else} \\
& \quad \textbf{return} \; L \quad \textit{(a topologically sorted order)}
\end{aligned}
$$

Note that since we've already isolated all mutually dependent functions into groups, our graph will never have cycles, and this algorithm will always succeed. Also note that since we start with nodes with no incoming edges, our list will begin with the groups that should be checked last. This is because a node with no incoming edges might (and probably does) still have outgoing edges, and thus depends on other functions and groups.
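Here is one possible C++ sketch of Kahn's algorithm, assuming the group graph is given as adjacency lists where an edge from \(n\) to \(m\) means \(n\) depends on \(m\); the function name and representation are assumptions for illustration:

```cpp
#include <queue>
#include <vector>

// Kahn's algorithm: returns a topological order of a DAG given as
// adjacency lists. Since mutually dependent functions were already
// collapsed into groups, the graph is acyclic and this always succeeds.
std::vector<int> topological_order(std::vector<std::vector<int>> adj) {
    // Count incoming edges for every node.
    std::vector<int> indegree(adj.size(), 0);
    for (const auto& edges : adj)
        for (int m : edges)
            ++indegree[m];

    // S: all nodes with no incoming edges.
    std::queue<int> s;
    for (int n = 0; n < static_cast<int>(adj.size()); ++n)
        if (indegree[n] == 0) s.push(n);

    std::vector<int> order; // the list L from the pseudocode
    while (!s.empty()) {
        int n = s.front();
        s.pop();
        order.push_back(n);
        for (int m : adj[n])
            if (--indegree[m] == 0) s.push(m); // m lost its last incoming edge
    }
    return order;
}
```

Because the list starts with the groups nothing depends on, iterating it in reverse visits the most-depended-on groups first.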
Like in our successful example, we want to typecheck functions that are depended on first.

Implementation

Let's start working on a C++ implementation of all of this now. First, I think that we should create a C++ class that will represent our function dependency graph. Let's call it function_graph. I propose the following definition:
```cpp
using function = std::string;
```
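The rest of the definition is cut off here, so as a hedged guess at the shape of function_graph: the adjacency-map member and the add_edge helper below are my assumptions for illustration, not necessarily the actual definition from the series.

```cpp
#include <map>
#include <set>
#include <string>

using function = std::string; // alias from above: functions identified by name

// Hypothetical sketch: store, for each function, the set of
// functions it directly calls (its adjacency list).
struct function_graph {
    std::map<function, std::set<function>> adjacency_lists;

    void add_edge(const function& from, const function& to) {
        adjacency_lists[from].insert(to);
    }
};
```

A map of sets keeps edge insertion idempotent, which is convenient because the same call can be encountered several times while scanning a function body.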