\documentclass[11pt]{article}
\usepackage[pdftex]{graphicx, color}
\usepackage{amsmath}
\usepackage{listings}
\usepackage{ebproof}
\usepackage[utf8]{inputenc}
\usepackage{listings}
\usepackage{inconsolata}
\usepackage[usenames,dvipsnames]{xcolor}
\usepackage[margin=1in]{geometry}
\usepackage{algorithmicx}
\usepackage{algpseudocode}
\usepackage{amsmath}
\usepackage{tikz}
\usepackage{mathtools}
\usepackage{proof}
\usepackage{graphicx}
\usepackage{amssymb}
\usepackage{hyperref}


\definecolor{color1}{HTML}{FFAABB}
\definecolor{color2}{HTML}{EEDD66}
\definecolor{color3}{HTML}{77AADD}
\definecolor{color4}{HTML}{EE6C6C}
\definecolor{color5}{HTML}{99DDFF}
\definecolor{color6}{HTML}{44BB99}
\definecolor{color7}{HTML}{BBCC33}
\definecolor{color8}{HTML}{AAAA00}
\definecolor{color9}{HTML}{DDDDDD}
 
\usetikzlibrary{positioning}
 
\algrenewcommand{\algorithmicforall}{\textbf{for each}}
 
\newcommand*{\lineref}[1]{$\langle$line #1$\rangle$}
\newcommand*{\objref}[1]{$\langle$object \texttt{#1}$\rangle$}
\newcommand*{\voidref}[1]{\texttt{void}}
\newcommand*{\so}{\textit{so}}
\newcommand*{\Int}{\mathrm{Int}}
 
\lstdefinelanguage{Cool}{
    morekeywords = {if,fi,pool,loop,while,else,class,inherits,case,in,of,esac,let,new,try,catch,yrt,throw},
    morekeywords = [2]{Int,Object,Value,MulOp,AddOp,Stack,Main,IO,SELF_TYPE},
    morecomment=[l]{--},
    morecomment=[n]{(*}{*)},
    morestring=[b]{"},
    literate={<-}{$\gets{}$}{2}{<=}{$\le{}$}{2}{=>}{$\Rightarrow{}$}{2},
}
\lstdefinestyle{mystyle}{
    numbers=left,numberstyle=\tiny,numbersep=10pt,
    commentstyle=\color{gray},
    keywordstyle=\bfseries,
    keywordstyle={[2]\color{BlueViolet}},
    stringstyle=\color{Mahogany},
    numberstyle=\color{darkgray},
}
\lstset{
    basicstyle=\ttfamily\small,
    tabsize=2,
    style=mystyle,
}

\newcommand\tab[1][1cm]{\hspace*{#1}}
\usepackage{array}
\newcommand{\PreserveBackslash}[1]{\let\temp=\\#1\let\\=\temp}
\newcolumntype{C}[1]{>{\PreserveBackslash\centering}p{#1}}
\newcolumntype{R}[1]{>{\PreserveBackslash\raggedleft}p{#1}}
\newcolumntype{L}[1]{>{\PreserveBackslash\raggedright}p{#1}}
%% \newcolumntype{L}[1]{>{\raggedright\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
%% \newcolumntype{C}[1]{>{\centering\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
%% \newcolumntype{R}[1]{>{\raggedleft\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}

\usepackage{tikz}
\usetikzlibrary{graphs}
\usetikzlibrary{automata,positioning}

\begin{document}
\begin{center}
\LARGE YOURNAME -- SUNETID \\
\LARGE CS143 Spring 2026 -- Written Assignment 4 \\
\end{center}

This assignment covers code generation, operational semantics, and optimization. You may discuss this assignment with other students and work on the problems together. However, your write-up should be your own individual work, and you should indicate in your submission who you worked with, if applicable. Assignments can be submitted electronically through Gradescope as a PDF by Tuesday, June 2, 2026 at 11:59 PM PDT. A \LaTeX \ template for writing your solutions is available on the course website.

\bigskip

\begin{enumerate}

\item Programming in Assembly

Consider the following fictional hardware architecture consisting of two signed 64-bit integer registers \texttt{x}, \texttt{y}, and a large stack where each stack element holds a signed 64-bit integer.

The hardware architecture supports the following instructions:
\begin{itemize}
\item \texttt{push r}: Given register \texttt{r}, push the value of register \texttt{r} to the top of the stack.
\item \texttt{pop r}: Given register \texttt{r}, pop the top of the stack into register \texttt{r}.
\item \texttt{swap}: Swap the value at the top of the stack with the value right beneath it.
\item \texttt{add r1 r2}: Given two registers \texttt{r1}, \texttt{r2}, compute $r1 + r2$ and store it into register \texttt{r1}. 
\item \texttt{sub r1 r2}: Given two registers \texttt{r1}, \texttt{r2}, compute $r1 - r2$ and store it into register \texttt{r1}.
\item \texttt{mul r1 r2}: Given two registers \texttt{r1}, \texttt{r2}, compute $r1 \times r2$ and store it into register \texttt{r1}. 
\item \texttt{div r1 r2}: Given two registers \texttt{r1}, \texttt{r2}, compute $r1\ /\ r2$ and store it into register \texttt{r1}. 
\item \texttt{jz c}: Given signed 64-bit integer constant literal \texttt{c}, if the top of the stack is zero, jump to \texttt{c} instructions \textit{after} this instruction. Otherwise, execute the next instruction. For example, \texttt{jz 0} is a no-op instruction since the next instruction will be executed no matter the value of the stack top (unless the stack is empty, in which case the machine crashes). As another example, if the stack top is zero, \texttt{jz -1} would cause an infinite loop when executed, since it jumps to itself.
\item \texttt{print r}: Given register \texttt{r}, print its value to console.
\item \texttt{halt}: Stop execution.
\end{itemize}

All arithmetic operators have the same behavior as C \texttt{int64\_t} arithmetic, though you should normally not need to worry about this for this problem.

Your program starts with an empty stack, with register \texttt{x} holding an integer value $1\leq n\leq 19$, and register \texttt{y} holding zero. 

You should write a program that, when executed, prints out $n$ lines and halt, where the $i$-th (1-indexed) line should contain the value of the factorial of $i$ (for example, $4!=24$).

Your program should contain one instruction per line. You can also write comments: everything after a \texttt{\#} character in the same line will be treated as comments and discarded. In programming such an exotic architecture, comments will be very helpful for you to keep track of your program's design and invariants: make use of them!

You can find a simulator of the architecture at Stanford Myth location
\begin{verbatim}
    /afs/ir/class/cs143/bin/wa4_2026/cs143wa4p1
\end{verbatim}

On Stanford Myth, run
\begin{verbatim}
    /afs/ir/class/cs143/bin/wa4_2026/cs143wa4p1 <prog_file> <n>
\end{verbatim}
to run your program with register \texttt{x} holding value $n$. Alternatively, you can run
\begin{verbatim}
    /afs/ir/class/cs143/bin/wa4_2026/cs143wa4p1 <prog_file> <n> --trace
\end{verbatim}
to additionally print out the machine state after executing each instruction, which will be helpful for you to debug your program.

When you are confident with your program, run 
\begin{verbatim}
    /afs/ir/class/cs143/bin/wa4_2026/cs143wa4p1 <prog_file> test
\end{verbatim}
to run the staff's tests. If all tests passed, it will print out a secret string. \textbf{Be sure to include this secret string in your solution!}

\

\noindent{Solution: }

\noindent{The secret string is: }

\

\noindent{Your program:}

\newpage

\item Single Static Assignment (SSA) Form
 
Single static assignment (SSA) is an intermediate representation (IR) form used by virtually every production compiler today (LLVM, GCC, V8, HotSpot, and so on). Compared to the IR you saw in lecture, an SSA IR imposes two additional rules:
\begin{enumerate}
    \item Each variable is assigned \textit{exactly} once in the entire IR. \vspace{-0.5em}
    \item Every use of a variable must be preceded (statically) by its assignment.
\end{enumerate}
For instance, the IR basic block
\begin{align*}
    a &\coloneqq 5\\
    a &\coloneqq a - 1\\
    a &\coloneqq a \times a
\end{align*}
violates rule 1 because $a$ is assigned three times.
 
The first key idea for converting an IR to SSA form is straightforward: every time a variable is assigned, give it a new ``version'' (a fresh subscript), and rewrite every subsequent use to refer to the most recently created version. The block above becomes:
\begin{align*}
    \textcolor{blue}{a_1} &\coloneqq 5\\
    \textcolor{red}{a_2} &\coloneqq \textcolor{blue}{a_1} - 1\\
    a_3 &\coloneqq \textcolor{red}{a_2} \times \textcolor{red}{a_2}
\end{align*}
 
When the control flow graph (CFG) has the shape of a tree (no joins), this renaming strategy by itself is sufficient.
 
\textbf{(Task 2.A)} As a warm-up, convert the following IR to SSA form:
 
    \begin{center}
        \begin{tikzpicture}[bb/.style={text width=1.7in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (b1) {$\begin{aligned}
                a &\coloneqq 5 \\
                b &\coloneqq 3 \\
                a &\coloneqq a - b \\
                b &\coloneqq b \times a \\
                c &\coloneqq a + b \\
            \end{aligned}$ \\
            \vspace{0.3em}
            \textbf{if} $b > a$
            \textbf{goto} B2
            \textbf{else goto} B3};
            \node[bb,below left=of b1] (b2) {$\begin{aligned}
                a &\coloneqq b + c \\
                b &\coloneqq a - c \\
                c &\coloneqq b \times b \\
            \end{aligned}$ \\
            \vspace{0.3em}
            \textbf{goto} B4};
            \node[bb,below right=of b1] (b3) {$\begin{aligned}
                c &\coloneqq a + b \\
                a &\coloneqq c - b \\
            \end{aligned}$ };
            \node[bb,below=of b2] (b4) {$\begin{aligned}
                b &\coloneqq a + c \\
                a &\coloneqq a \times b \\
            \end{aligned}$ };
            \node[left=0pt of b1.north west, anchor=north east] {B1};
            \node[left=0pt of b2.north west, anchor=north east] {B2};
            \node[right=0pt of b3.north east, anchor=north west] {B3};
            \node[left=0pt of b4.north west, anchor=north east] {B4};
            \draw[->,>=stealth] (entry) -- (b1);
            \draw[->,>=stealth] (b1.west) -| (b2.north);
            \draw[->,>=stealth] (b1.east) -| (b3.north);
            \draw[->,>=stealth] (b2) -- (b4);
        \end{tikzpicture}
        \end{center}
 
\newpage
 
Now that each variable in an SSA IR is assigned exactly once, this single assignment may naturally be regarded as the \textit{definition} of the variable. The second SSA rule then reads as: a variable must be defined before it is used.
 
This is a \textit{static} property of the IR: if variable $v$ is defined in basic block $X$ and used in basic block $Y$, then every path from the entry block to $Y$ in the CFG must go through $X$ (and if $X=Y$, the definition must lexically precede the use in the block). This means we are guaranteed, at compile time, that any use of $v$ is preceded by a valid assignment along every possible execution.
 
\textbf{(Task 2.B)} Consider the following IR (only the relevant statements are shown; the rest of each block has been omitted for clarity):
 
    \begin{center}
        \begin{tikzpicture}[bb/.style={text width=1.3in, draw, align=center}, bb2/.style={text width=2.2in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (b1) {$\begin{aligned}
                &\\
                x_1 &\coloneqq \cdots \\
                &\\
            \end{aligned}$ };
            \node[bb,below left=of b1] (b2) {$\begin{aligned}
                &\\
                x_2 &\coloneqq \cdots \\
                & \\
            \end{aligned}$ };
            \node[bb,below right=of b1] (b3) {$\begin{aligned}
                &\\
                x_3 &\coloneqq \cdots \\
                &\\
            \end{aligned}$ };
            \node[bb,below=of b2] (b4) {$\begin{aligned}
                x_4 &\coloneqq\cdots \\
                y_1 &\coloneqq x_1\\
                y_2 &\coloneqq x_2\\
                y_3 &\coloneqq x_3\\
            \end{aligned}$ };
            \node[bb,below=of b3] (b5) {$\begin{aligned}
                x_5 &\coloneqq\cdots\\
                y_4 &\coloneqq x_1\\
                y_5 &\coloneqq x_3\\
                y_6 &\coloneqq x_4\\
            \end{aligned}$ };
            \node[bb2,below=of {b4.south |- b5.south}] (b6) {$\begin{aligned}
                x_6 &\coloneqq\cdots \\
                z_1 &\coloneqq x_1 \quad z_2 \coloneqq x_2 \\
                z_3 &\coloneqq x_3 \quad z_4 \coloneqq x_4 \\
                z_5 &\coloneqq x_5 \quad z_6 \coloneqq x_6 \\
            \end{aligned}$ };
            \node[left=0pt of b1.north west, anchor=north east] {B1};
            \node[left=0pt of b2.north west, anchor=north east] {B2};
            \node[right=0pt of b3.north east, anchor=north west] {B3};
            \node[left=0pt of b4.north west, anchor=north east] {B4};
            \node[right=0pt of b5.north east, anchor=north west] {B5};
            \node[left=0pt of b6.north west, anchor=north east] {B6};
            \draw[->,>=stealth] (entry) -- (b1);
            \draw[->,>=stealth] (b1.west) -| (b2.north);
            \draw[->,>=stealth] (b1.east) -| (b3.north);
            \draw[->,>=stealth] (b2) -- (b4);
            \draw[->,>=stealth] (b3) -- (b5);
            \draw[->,>=stealth] (b3.south west) -- (b4.north east);
            \draw[->,>=stealth] (b4.south) -- (b6.north);
            \draw[->,>=stealth] (b5.south west) -- (b6.north east);
        \end{tikzpicture}
        \end{center}
 
The CFG edges are: B1$\to$B2, B1$\to$B3, B2$\to$B4, B3$\to$B4, B3$\to$B5, B4$\to$B6, B5$\to$B6. Assume the definitions of $x_1, \ldots, x_6$ are all valid. Among $y_1,\ldots,y_6$ and $z_1,\ldots,z_6$, which variables are ill-defined in an SSA IR?
 
\newpage
 
Simply renaming each assignment to a fresh variable is not always enough to produce a valid SSA IR. As an example, consider the IR below:
 
        \begin{center}
        \begin{tikzpicture}[bb/.style={text width=1.5in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (b1) {$\begin{aligned}
                x &\coloneqq 1
            \end{aligned}$ \\
            \textbf{if} $x > 0$ \textbf{goto} B2 \\
            \textbf{else goto} B3};
            \node[bb,below left=of b1] (b2) {$\begin{aligned}
                x &\coloneqq x + 1 \\
            \end{aligned}$ \\
            \textbf{goto} B4};
            \node[bb,below right=of b1] (b3) {$\begin{aligned}
                x &\coloneqq x + 2 \\
            \end{aligned}$ \\
            \textbf{goto} B4};
            \node[bb,below=of {b1.south |- b2.south}] (b4) {$\begin{aligned}
                y &\coloneqq x + 1 \\
            \end{aligned}$ };
            \node[left=0pt of b1.north west, anchor=north east] {B1};
            \node[left=0pt of b2.north west, anchor=north east] {B2};
            \node[right=0pt of b3.north east, anchor=north west] {B3};
            \node[right=0pt of b4.north east, anchor=north west] {B4};
            \draw[->,>=stealth] (entry) -- (b1);
            \draw[->,>=stealth] (b1.west) -| (b2.north);
            \draw[->,>=stealth] (b1.east) -| (b3.north);
            \draw[->,>=stealth] (b2.south) |- (b4.west);
            \draw[->,>=stealth] (b3.west) -| (b4.north);
        \end{tikzpicture}
        \end{center}
 
Applying the renaming process by itself produces:
 
    \begin{center}
        \begin{tikzpicture}[bb/.style={text width=1.5in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (b1) {$\begin{aligned}
                x_1 &\coloneqq 1
            \end{aligned}$ \\
            \textbf{if} $x_1 > 0$ \textbf{goto} B2 \\
            \textbf{else goto} B3};
            \node[bb,below left=of b1] (b2) {$\begin{aligned}
                x_2 &\coloneqq x_1 + 1 \\
            \end{aligned}$ \\
            \textbf{goto} B4};
            \node[bb,below right=of b1] (b3) {$\begin{aligned}
                x_3 &\coloneqq x_1 + 2 \\
            \end{aligned}$ \\
            \textbf{goto} B4};
            \node[bb,below=of {b1.south |- b2.south}] (b4) {$\begin{aligned}
                y_1 &\coloneqq \textbf{\textcolor{red}{x??}} + 1 \\
            \end{aligned}$ };
            \node[left=0pt of b1.north west, anchor=north east] {B1};
            \node[left=0pt of b2.north west, anchor=north east] {B2};
            \node[right=0pt of b3.north east, anchor=north west] {B3};
            \node[right=0pt of b4.north east, anchor=north west] {B4};
            \draw[->,>=stealth] (entry) -- (b1);
            \draw[->,>=stealth] (b1.west) -| (b2.north);
            \draw[->,>=stealth] (b1.east) -| (b3.north);
            \draw[->,>=stealth] (b2.south) |- (b4.west);
            \draw[->,>=stealth] (b3.west) -| (b4.north);
        \end{tikzpicture}
        \end{center}
 
In block B4, the value of $x$ depends on which predecessor we came from, so neither $x_2$ nor $x_3$ alone is correct.
 
To handle this situation, SSA introduces \textit{$\phi$-nodes}. If block $B$ has $n$ direct predecessors $P_1, \ldots, P_n$, then a $\phi$-node in $B$ takes $n$ arguments $v_1, \ldots, v_n$; when control reaches $B$ from $P_i$, the $\phi$-node evaluates to $v_i$.
 
We write a $\phi$-node as $\phi(P_1:v_1, \ldots, P_n:v_n)$. As a convention, a $\phi$-node must always appear as the entire right-hand side of an assignment --- it cannot be combined with other operators in the same expression. For example, $x\coloneqq\phi(P_1:v_1,P_2:v_2)+1$ is not allowed, but you can split it as $x_1\coloneqq\phi(P_1:v_1,P_2:v_2);\ x_2\coloneqq x_1+1$. Furthermore, every $\phi$-node assignment in a block must appear before any non-$\phi$ assignment in that block.
 
$\phi$-nodes are also subject to a def-before-use rule, but in a slightly more refined way. For each $1 \leq i \leq n$, if $v_i$ is defined in basic block $X$, then every path from the entry block to $P_i$ must go through $X$. This way, whenever control flows from $P_i$ into $B$, the $v_i$ that gets selected by the $\phi$-node is guaranteed to have a defined value.
 
\textbf{(Task 2.C)} Consider the following IR (only relevant parts shown):
 
\begin{center}
        \begin{tikzpicture}[bb/.style={text width=1.3in, draw, align=center}, bb2/.style={text width=2.7in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (b1) {$\begin{aligned}
                &\\
                x_1 &\coloneqq \cdots \\
                &\\
            \end{aligned}$ };
            \node[bb,below left=of b1] (b2) {$\begin{aligned}
                &\\
                x_2 &\coloneqq \cdots \\
                & \\
            \end{aligned}$ };
            \node[bb,below right=of b1] (b3) {$\begin{aligned}
                &\\
                x_3 &\coloneqq \cdots \\
                &\\
            \end{aligned}$ };
            \node[bb,below=of b2] (b4) {$\begin{aligned}
                &\\
                x_4 &\coloneqq\cdots \\
                &\\
            \end{aligned}$ };
            \node[bb,below=of b3] (b5) {$\begin{aligned}
                &\\
                x_5 &\coloneqq\cdots\\
                &\\
            \end{aligned}$ };
            \node[bb2,below=of {b4.south |- b5.south}] (b6) {$\begin{aligned}
                y_1 &\coloneqq \phi(\textrm{B4}:x_1, \textrm{B5}:x_1)\\
                y_2 &\coloneqq \phi(\textrm{B4}:x_2, \textrm{B5}:x_3)\\
                y_3 &\coloneqq \phi(\textrm{B4}:x_4, \textrm{B5}:x_5)\\
                y_4 &\coloneqq \phi(\textrm{B4}:x_3, \textrm{B5}:x_2)\\
                y_5 &\coloneqq \phi(\textrm{B4}:x_1, \textrm{B5}:x_3)\\
                y_6 &\coloneqq \phi(\textrm{B4}:x_5, \textrm{B5}:x_4)\\
                y_7 &\coloneqq \phi(\textrm{B4}:x_3, \textrm{B5}:x_3)\\
            \end{aligned}$ };
            \node[left=0pt of b1.north west, anchor=north east] {B1};
            \node[left=0pt of b2.north west, anchor=north east] {B2};
            \node[right=0pt of b3.north east, anchor=north west] {B3};
            \node[left=0pt of b4.north west, anchor=north east] {B4};
            \node[right=0pt of b5.north east, anchor=north west] {B5};
            \node[left=0pt of b6.north west, anchor=north east] {B6};
            \draw[->,>=stealth] (entry) -- (b1);
            \draw[->,>=stealth] (b1.west) -| (b2.north);
            \draw[->,>=stealth] (b1.east) -| (b3.north);
            \draw[->,>=stealth] (b2) -- (b4);
            \draw[->,>=stealth] (b3) -- (b5);
            \draw[->,>=stealth] (b3.south west) -- (b4.north east);
            \draw[->,>=stealth] (b4.south) -- (b6.north);
            \draw[->,>=stealth] (b5.south west) -- (b6.north east);
        \end{tikzpicture}
        \end{center}
 
The CFG edges are: B1$\to$B2, B1$\to$B3, B2$\to$B4, B3$\to$B4, B3$\to$B5, B4$\to$B6, B5$\to$B6. Assume the definitions of $x_1, \ldots, x_5$ are all valid. Which of $y_1, \ldots, y_7$ are ill-defined?
 
\newpage
 
With $\phi$-nodes we can now convert the earlier joining example to SSA form:
 
    \begin{center}
        \begin{tikzpicture}[bb/.style={text width=1.5in, draw, align=center}, bb2/.style={text width=2in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (b1) {$\begin{aligned}
                x_1 &\coloneqq 1
            \end{aligned}$ \\
            \textbf{if} $x_1 > 0$ \textbf{goto} B2 \\
            \textbf{else goto} B3};
            \node[bb,below left=of b1] (b2) {$\begin{aligned}
                x_2 &\coloneqq x_1 + 1 \\
            \end{aligned}$ \\
            \textbf{goto} B4};
            \node[bb,below right=of b1] (b3) {$\begin{aligned}
                x_3 &\coloneqq x_1 + 2 \\
            \end{aligned}$ \\
            \textbf{goto} B4};
            \node[bb2,below=of {b1.south |- b2.south}] (b4) {$\begin{aligned}
                \textcolor{red}{x_4\ } & \textcolor{red}{\coloneqq \phi(\textrm{B2}:x_2, \textrm{B3}:x_3)}\\
                y_1 &\coloneqq x_4 + 1 \\
            \end{aligned}$ };
            \node[left=0pt of b1.north west, anchor=north east] {B1};
            \node[left=0pt of b2.north west, anchor=north east] {B2};
            \node[right=0pt of b3.north east, anchor=north west] {B3};
            \node[right=0pt of b4.north east, anchor=north west] {B4};
            \draw[->,>=stealth] (entry) -- (b1);
            \draw[->,>=stealth] (b1.west) -| (b2.north);
            \draw[->,>=stealth] (b1.east) -| (b3.north);
            \draw[->,>=stealth] (b2.south) |- (b4.west);
            \draw[->,>=stealth] (b3.west) -| (b4.north);
        \end{tikzpicture}
    \end{center}
 
Can we automate this conversion? If we are willing to be wasteful with the number of $\phi$-nodes we insert, there is a fairly simple procedure:
\begin{enumerate}
    \item Add a new basic block at the very start of the program; inside it, initialize every variable to \texttt{UNDEFINED}. Make this new block the entry block, with a single edge to what used to be the entry block.
    \item For every basic block $B$ other than the new entry block, and for every variable $x$, insert an assignment $x^\textrm{B}_1 \coloneqq \phi(P_1:\square, \ldots, P_n:\square)$ at the very top of $B$ (where $P_1, \ldots, P_n$ are the direct predecessors of $B$). The $\square$ slots will be filled in by step 4.
    \item Within each basic block, do the simple renaming from Task 2.A: each non-$\phi$ assignment gets a fresh left-hand-side name, and every variable on the right-hand side is rewritten to its most recent version (which always exists, thanks to the start-of-block $\phi$-nodes we just inserted).
    \item Go back to each $\phi$-node $x^\textrm{B}_1 \coloneqq \phi(P_1:\square, \ldots, P_n:\square)$ from step 2 and replace each $\square$ at slot $P_i$ with the most recent version of $x$ at the end of block $P_i$.
\end{enumerate}
 
As a worked example, the algorithm transforms the non-SSA IR on the left below into the SSA IR on the right:
 
\begin{figure}[!htb]
    \centering
    \begin{minipage}{.5\textwidth}
    \centering
        \begin{tikzpicture}[bb/.style={text width=1.5in, draw, align=center}, bb2/.style={text width=4in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (b1) {$\begin{aligned}
                x &\coloneqq 1
            \end{aligned}$ \\
            \textbf{goto} B};
            \node[bb,below=of b1] (b2) {$\begin{aligned}
                x &\coloneqq x + 1 \\
            \end{aligned}$ \\
            \textbf{if} $x < 10$ \textbf{goto} B \\
            \textbf{else goto} C};
            \node[bb,below=of b2] (b3) {$\begin{aligned}
                x &\coloneqq x + 1 \\
            \end{aligned}$};
            \node[left=0pt of b1.north west, anchor=north east] {A};
            \node[left=0pt of b2.north west, anchor=north east] {B};
            \node[left=0pt of b3.north west, anchor=north east] {C};
            \draw[->,>=stealth] (entry) -- (b1);
            \draw[->,>=stealth] (b1) -- (b2);
            \draw[->,>=stealth] (b2) -- (b3);
            \draw[->,>=stealth] (b2) edge[loop right,in=10,out=-10,looseness=4] (b2);
        \end{tikzpicture}
   \end{minipage}%
   \begin{minipage}{0.5\textwidth}
        \centering
        \begin{tikzpicture}[bb/.style={text width=2in, draw, align=center}, bb2/.style={text width=4in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (b0) {$\begin{aligned}
                x^\textrm{Z}_1 &\coloneqq \texttt{UNDEFINED}
            \end{aligned}$ \\
            \textbf{goto} A};
             \node[bb,below=of b0] (b1) {$\begin{aligned}
                x^\textrm{A}_1 &\coloneqq \phi(\textrm{Z}: x^\textrm{Z}_1)\\
                x^\textrm{A}_2 &\coloneqq 1
            \end{aligned}$ \\
            \textbf{goto} B};
            \node[bb,below=of b1] (b2) {$\begin{aligned}
                x^\textrm{B}_1 &\coloneqq \phi(\textrm{A}: x^\textrm{A}_2, \textrm{B}:x^\textrm{B}_2)\\
                x^\textrm{B}_2 &\coloneqq x^\textrm{B}_1 + 1 \\
            \end{aligned}$ \\
            \textbf{if} $x^\textrm{B}_2 < 10$ \textbf{goto} B \\
            \textbf{else goto} C};
            \node[bb,below=of b2] (b3) {$\begin{aligned}
                x^\textrm{C}_1 &\coloneqq \phi(\textrm{B}:x^\textrm{B}_2) \\
                x^\textrm{C}_2 &\coloneqq x^\textrm{C}_1 + 1 \\
            \end{aligned}$};
            \node[left=0pt of b0.north west, anchor=north east] {Z};
            \node[left=0pt of b1.north west, anchor=north east] {A};
            \node[left=0pt of b2.north west, anchor=north east] {B};
            \node[left=0pt of b3.north west, anchor=north east] {C};
            \draw[->,>=stealth] (entry) -- (b0);
            \draw[->,>=stealth] (b0) -- (b1);
            \draw[->,>=stealth] (b1) -- (b2);
            \draw[->,>=stealth] (b2) -- (b3);
            \draw[->,>=stealth] (b2) edge[loop right,in=10,out=-10,looseness=4] (b2);
        \end{tikzpicture}
    \end{minipage}
\end{figure}
 
\textbf{(Task 2.D)} Apply the algorithm to the following IR to produce an SSA IR. Make sure you follow the algorithm exactly as described.
 
\begin{center}
        \begin{tikzpicture}[bb/.style={text width=2in, draw, align=center}, bb2/.style={text width=1.7in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (bA) {$\begin{aligned}
                a &\coloneqq 0 \\
                b &\coloneqq 1 \\
            \end{aligned}$ \\
            \textbf{goto} B};
            \node[bb,below=of bA] (bB) {$\begin{aligned}
                a &\coloneqq a + 1 \\
            \end{aligned}$ \\
            \textbf{goto} C};
            \node[bb,below=of bB] (bC) {$\begin{aligned}
                b &\coloneqq b \times 2 \\
            \end{aligned}$ \\
            $b < 100$ ? \textbf{goto} D :
            \textbf{goto} E};
            \node[bb2,left=1cm of bC] (bD) {$\begin{aligned}
                b &\coloneqq b + 1 \\
            \end{aligned}$ \\
            \textbf{goto} E};
            \node[bb,below=of bC] (bE) {$\begin{aligned}
                b &\coloneqq b + a \\
            \end{aligned}$ \\
            $b < 200$ ? \textbf{goto} C :
            \textbf{goto} F};
            \node[bb,below=of bE] (bF) {$\begin{aligned}
            \end{aligned}$
            $a < 10$ ? \textbf{goto} B :
            \textbf{goto} G};
            \node[bb,below=of bF] (bG) {$\begin{aligned}
                \textbf{print}(b)\\
            \end{aligned}$ };
            \node[right=0pt of bA.north east, anchor=north west] {A};
            \node[right=0pt of bB.north east, anchor=north west] {B};
            \node[right=0pt of bC.north east, anchor=north west] {C};
            \node[left=0pt of bD.north west, anchor=north east] {D};
            \node[right=0pt of bE.north east, anchor=north west] {E};
            \node[right=0pt of bF.north east, anchor=north west] {F};
            \node[right=0pt of bG.north east, anchor=north west] {G};
            \draw[->,>=stealth] (entry) -- (bA);
            \draw[->,>=stealth] (bA) -- (bB);
            \draw[->,>=stealth] (bB) -- (bC);
            \draw[->,>=stealth] (bC.west) -- (bD.east);
            \draw[->,>=stealth] (bD.south) |- (bE.west);
            \draw[->,>=stealth] (bC) -- (bE);
            \draw[->,>=stealth] (bE.east) .. controls +(2.2,0) and +(2.2,0) .. (bC.east);
            \draw[->,>=stealth] (bE) -- (bF);
            \draw[->,>=stealth] (bF.west) .. controls +(-2.5,0) and +(-2.5,0) .. (bB.west);
            \draw[->,>=stealth] (bF) -- (bG);
        \end{tikzpicture}
        \end{center}
 
The CFG edges are: Entry$\to$A, A$\to$B, B$\to$C, C$\to$D, C$\to$E, D$\to$E, E$\to$C, E$\to$F, F$\to$B, F$\to$G.
 
\textbf{(Task 2.E)} Explain why the algorithm always produces a valid SSA IR (not just for the example in Task 2.D, but in general). Specifically, argue that: every variable is defined exactly once, every variable satisfies the def-before-use property, and every $\phi$-node satisfies the def-before-use property.
 
\textbf{(Task 2.F)} Dead code elimination is straightforward on an SSA IR. Because SSA guarantees that each variable is assigned exactly once, and all uses must occur after the assignment, an assignment is dead code (and can safely be removed) precisely when its right-hand side has no side effect and its left-hand-side variable has no use anywhere in the IR. We can apply this rule repeatedly until no more assignments can be removed. Perform dead code elimination on the SSA IR you produced in Task 2.D. Which variables are dead? (You do not need to show the IR after elimination.)
 
Not every $\phi$-node inserted by the algorithm is actually necessary. For example, in the simple-loop SSA IR we produced earlier (reproduced for convenience on the left below), every use of $x^\textrm{C}_1$, which is defined as $x^\textrm{C}_1 \coloneqq \phi(\textrm{B}: x^\textrm{B}_2)$, can be replaced with $x^\textrm{B}_2$ without changing the program's behavior. Once we do so, $x^\textrm{C}_1$ has no remaining uses; since $\phi$-nodes have no side effects, dead code elimination removes the $\phi$-node assignment entirely. The right figure shows the simplified SSA IR after replacing uses of $x^\textrm{C}_1$ with $x^\textrm{B}_2$ and removing the dead variables $x^\textrm{Z}_1$, $x^\textrm{A}_1$, and $x^\textrm{C}_1$.
 
\newpage
 
\begin{figure}[!htb]
    \centering
   \begin{minipage}{0.5\textwidth}
        \centering
        \begin{tikzpicture}[bb/.style={text width=2in, draw, align=center}, bb2/.style={text width=4in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (b0) {$\begin{aligned}
                x^\textrm{Z}_1 &\coloneqq \texttt{UNDEFINED}
            \end{aligned}$ \\
            \textbf{goto} A};
             \node[bb,below=of b0] (b1) {$\begin{aligned}
                x^\textrm{A}_1 &\coloneqq \phi(\textrm{Z}: x^\textrm{Z}_1)\\
                x^\textrm{A}_2 &\coloneqq 1
            \end{aligned}$ \\
            \textbf{goto} B};
            \node[bb,below=of b1] (b2) {$\begin{aligned}
                x^\textrm{B}_1 &\coloneqq \phi(\textrm{A}: x^\textrm{A}_2, \textrm{B}:x^\textrm{B}_2)\\
                x^\textrm{B}_2 &\coloneqq x^\textrm{B}_1 + 1 \\
            \end{aligned}$ \\
            \textbf{if} $x^\textrm{B}_2 < 10$ \textbf{goto} B \\
            \textbf{else goto} C};
            \node[bb,below=of b2] (b3) {$\begin{aligned}
                x^\textrm{C}_1 &\coloneqq \phi(\textrm{B}:x^\textrm{B}_2) \\
                x^\textrm{C}_2 &\coloneqq x^\textrm{C}_1 + 1 \\
            \end{aligned}$};
            \node[left=0pt of b0.north west, anchor=north east] {Z};
            \node[left=0pt of b1.north west, anchor=north east] {A};
            \node[left=0pt of b2.north west, anchor=north east] {B};
            \node[left=0pt of b3.north west, anchor=north east] {C};
            \draw[->,>=stealth] (entry) -- (b0);
            \draw[->,>=stealth] (b0) -- (b1);
            \draw[->,>=stealth] (b1) -- (b2);
            \draw[->,>=stealth] (b2) -- (b3);
            \draw[->,>=stealth] (b2) edge[loop right,in=10,out=-10,looseness=4] (b2);
        \end{tikzpicture}
    \end{minipage}%
    \begin{minipage}{0.5\textwidth}
        \centering
        \begin{tikzpicture}[bb/.style={text width=2in, draw, align=center}, bb2/.style={text width=4in, draw, align=center}, node distance=0.4cm]
            \node[bb] (entry) {\textsc{Entry}};
            \node[bb,below=of entry] (b0) {$\begin{aligned}
            \end{aligned}$
            \textbf{goto} A};
             \node[bb,below=of b0] (b1) {$\begin{aligned}
                x^\textrm{A}_2 &\coloneqq 1 \\
            \end{aligned}$ \\
            \textbf{goto} B};
            \node[bb,below=of b1] (b2) {$\begin{aligned}
                x^\textrm{B}_1 &\coloneqq \phi(\textrm{A}: x^\textrm{A}_2, \textrm{B}:x^\textrm{B}_2)\\
                x^\textrm{B}_2 &\coloneqq x^\textrm{B}_1 + 1 \\
            \end{aligned}$ \\
            \textbf{if} $x^\textrm{B}_2 < 10$ \textbf{goto} B \\
            \textbf{else goto} C};
            \node[bb,below=of b2] (b3) {$\begin{aligned}
                x^\textrm{C}_2 &\coloneqq x^\textrm{B}_2 + 1 \\
            \end{aligned}$};
            \node[left=0pt of b0.north west, anchor=north east] {Z};
            \node[left=0pt of b1.north west, anchor=north east] {A};
            \node[left=0pt of b2.north west, anchor=north east] {B};
            \node[left=0pt of b3.north west, anchor=north east] {C};
            \draw[->,>=stealth] (entry) -- (b0);
            \draw[->,>=stealth] (b0) -- (b1);
            \draw[->,>=stealth] (b1) -- (b2);
            \draw[->,>=stealth] (b2) -- (b3);
            \draw[->,>=stealth] (b2) edge[loop right,in=10,out=-10,looseness=4] (b2);
        \end{tikzpicture}
    \end{minipage}
\end{figure}
 
\textbf{(Task 2.G)} For the SSA IR you obtained in Task 2.F, eliminate as many $\phi$-nodes as possible using the technique just illustrated, and write down the resulting simplified SSA IR. (Hint: exactly four $\phi$-nodes should remain. Equivalently, you can also approach this by directly identifying which four $\phi$-nodes you want to \textit{keep}.) For ease of grading, please do not rename any variable that survives the elimination: all surviving variables should retain the names they had in Task 2.D.
 
Closing thoughts: this problem set has walked you through the definition of SSA and a complete, correct algorithm for converting a non-SSA IR into SSA form. The algorithm you executed in Task 2.D is, however, far from optimal --- both in running time and in the number of $\phi$-nodes inserted. In real compilers, the algorithm of choice is Cytron et al.'s fast SSA construction algorithm (introduced in 1989), which is dramatically faster and inserts far fewer $\phi$-nodes on real-world programs than our naive scheme\footnote{One can construct adversarial inputs on which Cytron's algorithm reaches the same theoretical worst-case complexity as our naive algorithm. In practice, however, what matters for a compiler is performance on real-world programs, not artificial pathological cases.}: when the input program uses only structured control flow (no \texttt{goto}/\texttt{break}/\texttt{continue}), Cytron's algorithm even runs in linear time in the size of the program (and as a corollary inserts at most $O(n)$ $\phi$-nodes). The path to that better algorithm starts with formalizing the observations you just made in Task 2.G --- namely, when is a $\phi$-node actually needed? --- and using appropriate data structures (most notably, the \textit{dominance frontier}) to detect those situations efficiently. The full development is beyond the scope of this problem set; if you are interested, the \href{https://en.wikipedia.org/wiki/Static_single-assignment_form}{Wikipedia page on SSA} is a good starting point.

\newpage

\item  Consider the following assembly-like pseudo-code, using 6 temporaries (abstract registers) $a$ to $f$: \\

\begin{lstlisting}
    a := b + d
    b := a + d
    d := a - b
    c := d + d
    if c > 100:
    c := c + d
    else:
    d := 1
    e := d - c
    f := e - c
\end{lstlisting}

\begin{enumerate}
    \item At each program point, list the variables that are live. Note that \textbf{b} and \textbf{d} are inputs for the given code and \textbf{f} is a live value on exit.

    \textbf{Solution:}

    \item Draw the register interference graph between temporaries in the above program as described in class.

    \textbf{Solution:}

    \item Provide a lower bound on the number of registers required by the program induced from the interference graph. Can you explain why?

    \textbf{Solution:}

    \item Using the algorithm described in class, provide a coloring of the graph in part(b). The number of colors used should be your lower bound in part (c). Provide the final k-colored graph (you may use the tikz package to typeset it or simply embed an image), along with the order in which the algorithm colors the nodes.

    \textbf{Solution:}

    \item Based on your coloring, write down a mapping from temporaries to registers (labeled r1, r2, etc.).

    \textbf{Solution:}

\end{enumerate}

\end{enumerate}
\end{document}