# CS 131: Programming Languages and Compilers

## Introduction

![Compiler](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZlyBWXtEHbgY2QG62j%2F-MZlzscUjQLJTLK8_9W0%2Fimage.png?alt=media\&token=7c0e6939-77fb-41e0-8f0a-157a58be6397)

### Definition

A **Compiler** is "a program that takes a source-code program and translates it into an equivalent program in a target language."

### String

A **String** is a sequence of characters.

* Alphabet: a finite set of symbols (e.g. the ASCII characters)
* Strings form the words and sentences of a language
* $$st$$ is the concatenation of $$s$$ and $$t$$
  * $$s \epsilon = \epsilon s = s$$
* $$\epsilon$$ is the empty string
  * $$s^0 = \epsilon$$
* $$\mid s \mid$$ is the length of $$s$$
* Suppose $$s$$ is banana
* **Prefix** (前缀)
  * e.g. ban, banana
* **Suffix** (后缀)
  * e.g. banana, ana
* **Substring** (子字符串)
  * e.g. nan
* **Subsequence** (子序列)
  * e.g. bnan, nn

A **Language** is a set of Strings over a fixed **Alphabet** $$\Sigma$$, constructed using a specific Grammar.

* e.g. $$\{\varepsilon, 0, 01, 011, 0111, \ldots\}$$
* Not every String over the Alphabet belongs to a certain Language; only those that satisfy the Grammar rules do.
* e.g. with Alphabet $$= \{0,1\}$$ and the Grammar rule $$\mathrm{RE}=01^{\*}$$, we can specify the Language $$\{0, 01, 011, \ldots\}$$, while a string such as $$10$$ is not in it.

Below are the **Operations** on Languages and some **Examples**.

![Operations of Languages](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZmje5MHI5A1UKpPn5X%2F-MZmlFc0EYCac_0nCzrW%2Fimage.png?alt=media\&token=012453a4-a65d-4703-a54c-0a7a9999fe72)

<div align="center"><img src="https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZmmCBEEwzQxPV5ValK%2F-MZmmgRy6_bS-dr-YZks%2Fimage.png?alt=media&#x26;token=d15953f6-a978-49df-beff-e26302d793de" alt="Example of Language Operations"></div>

A **Grammar** $$G$$ is a description of the method (*rules*) for constructing a certain Language over a certain Alphabet.

* Type 0: Recursively Enumerable Grammar (Turing Machine)
* Type 1: Context-sensitive Grammar (CSG)
* Type 2: Context-free Grammar (CFG, 上下文无关文法), *mostly recursive*
* Type 3: Right-linear Grammar / Regular Expressions (RE, 正则表达式), *non-recursive*

{% hint style="info" %}
Expressiveness: Type 0 > Type 1 > Type 2 > Type 3
{% endhint %}

### Phases

![](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZzDqL4qrwEDK0ZORlu%2F-MZzDtySitutcAp8QaOk%2Fimage.png?alt=media\&token=04c429f7-2ba8-488b-9020-caf5b49f9a31)

## Lexical Analyzer （词法分析器）

Reads the source program character by character and returns the **tokens** of the source program.

* Recognizes identifiers and inserts them into the symbol table
* Uses regular expressions to describe tokens
* Uses finite automata to recognize tokens and complete lexical analysis
* Lexical analyzer = scanning + lexical analysis

![Lexical Analyzer](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZme14c35hIf4fxarxh%2F-MZmeSjHtfQ1-ogSHzxx%2Fimage.png?alt=media\&token=89698de4-1930-468b-95b3-6b336f9a0a0a)

### Token

A **Token** describes a pattern of characters having some meaning in the source program (such as identifiers, operators, keywords, numbers, delimiters, and so on).

* e.g., identifiers, operators, numbers

A **Lexeme (词素)** is an instance of a Token, along with its attributes.

* e.g. `17`
  * Token: INT
  * Token.value = 17
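
A minimal sketch of how a Token and its attribute value might be represented (the names `TokenKind` and `make_int_token` are illustrative, not from the course):

```c
#include <assert.h>

/* Illustrative token representation: a kind plus an attribute value. */
typedef enum { TOK_INT, TOK_ID, TOK_PLUS } TokenKind;

typedef struct {
    TokenKind kind;
    int       value;  /* attribute, e.g. the numeric value of an INT */
} Token;

Token make_int_token(int v) {
    Token t = { TOK_INT, v };
    return t;
}
```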

![Process of determining a Token; double buffers are usually used to avoid the problem of the buffer being too small](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZmf7knZA79BAj8yU7H%2F-MZmfxNa9GrrRaFbzRSZ%2Fimage.png?alt=media\&token=1cd38722-ab94-44ce-a6fc-40fbc00da044)

### Regular Expression （正则表达式）

We use **regular expressions** to describe and match **Tokens**.

* Each **regular expression** $$r$$ describes a language $$L(r)$$
  * $$L(r)$$ is called a **regular set**
* **Regular expressions** are left-associative and matched from left to right
* A **regular expression** is a **Type-3 Grammar Rule**

#### **Properties**

![Properties](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZmoIwjc36glVplHPXq%2F-MZmof_B8q_1S9MeftXx%2Fimage.png?alt=media\&token=86b18754-2092-4685-b928-6195ed063107)

![Extended Properties](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZmoIwjc36glVplHPXq%2F-MZmojydvefc6fY_VzFT%2Fimage.png?alt=media\&token=79a2ddec-990c-4de5-8b81-b2c3f838e8b8)

* e.g.
  * Integers:
    * Digit = \[0-9]
    * Integer = Digit Digit\*
  * Identifier:
    * letter = \[a-zA-Z]
    * identifier = letter (letter + digit)\*
  * Whitespace:
    * WS = ('\n' + '\t' + ' ')+
* Examples
  * Even binary numbers:
    * 1(0+1)\*0 | 0
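
As a small illustration (not part of the course code; `is_identifier` is a hypothetical helper), the identifier pattern letter (letter + digit)\* can be checked directly in C:

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical sketch: returns true iff s matches letter (letter + digit)* */
bool is_identifier(const char *s) {
    if (!isalpha((unsigned char)*s))   /* must start with a letter */
        return false;
    for (s++; *s != '\0'; s++)         /* the rest: letters or digits */
        if (!isalpha((unsigned char)*s) && !isdigit((unsigned char)*s))
            return false;
    return true;
}
```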

### Finite Automata （有限自动机）

#### Transition Diagram

![Transition Diagram Examples](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZmoyDn5jL5LqHm3c9r%2F-MZpxrkV0bNGzQ5kYj-e%2Fimage.png?alt=media\&token=7cb9e84d-cbbb-4d9b-9cbe-d0941b479af6)

When a Token is recognized:

* If it is a keyword, return the **Token of the keyword**
* If it is an identifier already in the symbol table, return its **entry of symbol table**
* If the identifier is not in the symbol table, insert it and return the new **entry of symbol table**

```c
switch (state) {
case 0:
    c = nextchar();
    if (c == /* some char */) {
        state = /* next state */;
        lexeme_beginning++;
    } else if (c == /* some other char */) {
        /* do something */
    } else if (/* c completes a final state */) { // terminal
        retract(1);                    // move the forward pointer back
        lexical_value = install_num(); // install the lexeme, e.g. a number
        return (NUM);
    } else {
        state = fail();
    }
    break;
}
```
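
The keyword-vs-identifier decision above can be sketched as follows (a minimal illustration; the keyword list and the name `is_keyword` are assumptions, not course code):

```c
#include <string.h>

/* Hypothetical sketch: linear scan over a small keyword table. */
static const char *keywords[] = { "if", "else", "while", "return" };

int is_keyword(const char *lexeme) {
    for (unsigned i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return 1;   /* return the Token of the keyword */
    return 0;           /* otherwise: look up / insert into the symbol table */
}
```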

#### Finite automata

A **recognizer** for a language is a program that takes a string x, and answers “yes” if x is a sentence of that language, and “no” otherwise.

We call the recognizer of the tokens a **finite automaton**.

**Example**

![Regular expression: (a+b)\*abb](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZq07KjiZwEnDCpiZhh%2F-MZq0pX6wcH7PWX7wenr%2Fimage.png?alt=media\&token=b1165377-3498-4b5c-9a84-7c5fbde5ef92)

* **Start Arrow**
* **State**
* **Transition Edge**
* **Accepting State**: drawn as concentric circles; accept and stop (state 3 here)
* **Death State**: an error state; all undefined transitions point to it
* Transition table:

| state | a     | b   |
| ----- | ----- | --- |
| 0     | {0,1} | {0} |
| 1     | --    | {2} |
| 2     | --    | {3} |

{% hint style="info" %}
Note that the empty string can be accepted by some automata.
{% endhint %}
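
As a hedged sketch (not course code), the table above, once determinized, can be run as a table-driven recognizer for (a+b)\*abb; the state numbering here is an assumption:

```c
#include <stdbool.h>

/* DFA for (a+b)*abb: state 3 accepts; columns are inputs 'a' and 'b'. */
static const int delta[4][2] = {
    {1, 0},  /* state 0 */
    {1, 2},  /* state 1: seen trailing "a"   */
    {1, 3},  /* state 2: seen trailing "ab"  */
    {1, 0},  /* state 3: seen trailing "abb" */
};

bool accepts(const char *s) {
    int state = 0;
    for (; *s; s++) {
        if (*s != 'a' && *s != 'b')
            return false;              /* death state */
        state = delta[state][*s == 'b'];
    }
    return state == 3;
}
```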

### NFA

Non-Deterministic Finite Automata (NFAs) represent regular expressions **easily**, but are **less efficient** to run.

**Accept s:** an NFA accepts **s** if some path from the start state to an Accepting State spells out **s**.

An NFA is a mathematical model that consists of:

* S, a **finite** set of **states**
* $$\Sigma$$, the symbols of the **input alphabet**
* *move*, a **transition function**
  * move(state, symbol) $$\to$$ sets of states
* A state $$s\_0 \in S$$, the **start state**
* $$F \subseteq S$$, a set of **final** or **accepting states**
* $$\epsilon$$**-moves: ε-transitions** are allowed in NFAs. In other words, we can move from one state to another without consuming any symbol

{% hint style="info" %}
The $$\epsilon$$-Closure of a set $$S$$ is $$S$$ together with all states reachable from $$S$$ without consuming any input.
{% endhint %}

### DFA&#x20;

Deterministic Finite Automata (DFAs) take **more effort** to build from regular expressions, but are **more efficient** to run.

**No** $$\epsilon$$**-moves are allowed:** for every $$s \in S$$ and every input *Symbol*, there is EXACTLY ONE transition.

**Accept s:** a DFA accepts **s** if the *ONE and ONLY ONE* path spelled out by **s** ends in an Accepting State.

{% hint style="info" %}
No $$\epsilon$$-closure !!!
{% endhint %}

### Implementation of Lexers

One Token, one *Recognizer*. There are 4 ways ($$r$$ stands for an RE):

1. $$r \to NFA \to Recognizer$$
2. $$r \to NFA \to DFA \to Recognizer$$
3. $$r \to DFA \to Recognizer$$
4. $$r \rightsquigarrow DFA \to Minimized\ DFA \to Recognizer$$

### RE to NFA

Algorithm is called **Thompson's Construction**.

![](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZq7HyBJi3ZBSa3828f%2F-MZqLle_HFS3AzK_CIg4%2Fimage.png?alt=media\&token=be807e8e-826e-4154-9fc2-19a558b7ec1a)

There are some requirements on such construction:

* $$N(s)$$ and $$N(t)$$ CANNOT have any intersections
* REMEMBER to assign unique names to all states

Properties of the resulting NFA:

* Exactly 1 Start State & 1 Accepting State
* # of States in the NFA $$\leq 2 \times$$ (# of Symbols + # of Operators) in $$r$$
* States do not have multiple outgoing edges with the same input symbol
* States have at most 2 outgoing $$\epsilon$$ edges

### RE to DFA

**\[Step 1]**: We make the Augmented RE: concatenate with the symbol # (meaning "finish").

* e.g. (a+b)\*a#
* Ensures at least one operator in the RE

**\[Step 2]:** Build syntax tree for this Augmented RE:

<div align="left"><img src="https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZqMuu-JiuLpqZ2XTeH%2F-MZqNMnMdoqVzVVKS6jy%2Fimage.png?alt=media&#x26;token=8c5a4c5e-46c5-42a9-a54d-380a3fd96ef0" alt=""></div>

* $$\epsilon$$, # and all $$a \in \Sigma$$ are at leaves
* All other operators are inner nodes
* Non-$$\epsilon$$ leaves get their position numbers, increasing from left $$\to$$ right

**\[Step 3]:** Compute `nullable()`, `firstpos()` & `lastpos()` for ALL nodes.

1. `firstpos(n)`: Function returning the set of positions where the *first* Symbol can be at, in the *sub-RE* rooted at `n`
2. `lastpos(n)`: Function returning the set of Positions where the *last* Symbol can be at, in the sub-RE rooted at `n`
3. `nullable(n)`: Function judging whether the *sub-RE* rooted at `n` can generate $$\epsilon$$

![](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZqMuu-JiuLpqZ2XTeH%2F-MZqOknizrDpl08vfeyI%2Fimage.png?alt=media\&token=9c210892-0d91-45af-a63d-8715caf443e7)

**\[Step 4]:** Compute `followpos()` for Leaf positions

`followpos(i)`: Function returning the set of positions *which can follow* position i in the generated String

Conduct a *Post-order Depth First Traversal* on the syntax tree, and do the following operations when leaving $$\cdot$$ / \* nodes:

* $$c\_{1} \cdot c\_{2}:$$ For all $$i \in$$ `lastpos(c1)`, `followpos(i)` = `followpos(i)` $$\cup$$ `firstpos(c2)`
* $$c^{\*}:$$ For all $$i \in$$ `lastpos(c)`, `followpos(i)` $$=$$ `followpos(i)` $$\cup$$ `firstpos(c)`
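
As a worked illustration (computed here, not copied from the notes) for the Augmented RE (a+b)\*a# with positions $$a=1, b=2, a=3, \#=4$$:

```latex
\begin{aligned}
followpos(1) &= \{1,2,3\} && \text{(from the } * \text{ node and the first } \cdot \text{ node)} \\
followpos(2) &= \{1,2,3\} \\
followpos(3) &= \{4\} \\
followpos(4) &= \varnothing \\
firstpos(root) &= \{1,2,3\} = S_0
\end{aligned}
```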

**\[Step 5]:** Construct the DFA.

```c
void construct() {
    S0 = firstpos(root);
    DStates = {(S0, unmarked)};
    while (DStates has an unmarked State U) {
        Mark State U;
        for (each possible input char c)
        {
            V = {};
            for (each position p in U whose symbol is c)
                V = Union of V and followpos(p);
            if (V is not empty) {
                if (V is not in DStates)
                    Include V in DStates, unmarked;
                Add the Transition U --c-> V;
            }
        }
    }
}
```

* A State $$S$$ in the resulting DFA is an Accepting State iff the position of the # node $$\in S$$
* The Start State of the resulting DFA is $$S\_0$$

#### Calculate $$\epsilon$$-Closure

```c
set epsClosure(set S)
{
    for (each State s in S)
        Push s onto stack;
    closure = S;
    while (stack is not empty)
    {
        Pop State u;
        for (each State v that u->v is an epsilon Transition)
        {
            if (v is not in closure)
            {
                Include v in closure;
                Push v onto stack;
            }
        }
    }
    return closure;
}
```

#### Implement NFA as Recognizer

```c
bool  recognizer() {
    S=epsClosure(s0);
    while ((c=getchar()) !=EOF)
        S=epsClosure(move(S, c));
    if (S and F has intersections)
        return ACCEPT;
    return REJECT;
}
```

{% hint style="info" %}
Performance of NFA-type Recognizers: Space $$O(|r|)$$; Time $$O(|r| \times |s|)$$
{% endhint %}

#### Implement DFA as Recognizer

```c
bool recognizer() {
    s=s_0;
    while ((c=getchar()) !=EOF)
        s=move(s, c);
    if (s is in F)
        return ACCEPT;
    return REJECT;
}
```

{% hint style="info" %}
Performance of DFA-type Recognizers: Space $$O(2^{|r|})$$; Time $$O(|s|)$$
{% endhint %}

#### Convert NFA to DFA

The algorithm is called **Subset Construction (子集构造法)**, since each subset of States in the original NFA becomes a single State in the resulting DFA.

```c
void subsetConstruction() {
    S0 = epsClosure({s0});
    DStates = {(S0, unmarked)};
    while (DStates has any unmarked State U) {
        Mark State U;
        for (each possible input char c) {
            V = epsClosure(move(U, c));
            if (V is not empty) {
                if (V is not in DStates)
                    Include V in DStates, unmarked;
                Add the Transition U --c-> V;
            }
        }
    }
}
```

* A State $$S$$ in resulting DFA is an Accepting State iff $$\exists s \in S, s$$ is an Accepting State in original NFA
* Start State of the resulting DFA is $$S\_0$$
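
Applying this to the NFA transition table for (a+b)\*abb shown earlier gives (a worked illustration, computed here; there are no $$\epsilon$$-moves, so each closure is the set itself):

```
S0 = {0}    : a -> {0,1} = S1,   b -> {0}   = S0
S1 = {0,1}  : a -> {0,1} = S1,   b -> {0,2} = S2
S2 = {0,2}  : a -> {0,1} = S1,   b -> {0,3} = S3
S3 = {0,3}  : a -> {0,1} = S1,   b -> {0}   = S0    (S3 is accepting)
```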

### Minimize DFA

```c
void minimize() {
    PI = {G_Accepting, G_NonAccepting}; /* initial partition */
    do {
        for (every group G in PI) {
            for (every pair of States (s,t) in G) {
                if (for every possible input char c, transitions s --c-> and t --c-> go to states in the same group)
                    s,t stay in the same subgroup;
                else
                    s,t are split into different subgroups;
            }
            Split G according to the above information;
        }
    } while (PI changed in this iteration);
    Every Group in PI is a state in the minimal DFA;
}
```

* A State $$S$$ in the minimal DFA is an Accepting State iff $$\exists s \in S$$, s is an Accepting State in the original DFA
* The Start State of the minimal DFA is the group containing the original Start State

### Other Issues for Lexers

#### Look ahead

#### Comment Skip

#### Symbol Table

## Syntax Analyzer （句法分析）

If lexical analysis provides the **words** that the computer can recognize, then syntax analysis aims to understand the structure of sentences.

This step usually produces a **parse tree**, which describes the syntactic structure.

### Differences from the Lexical Analyzer

* Both do similar things, but the lexical analyzer deals with simple **non-recursive** constructs of the language.
* The syntax analyzer deals with **recursive** constructs of the language.
* The lexical analyzer recognizes the smallest meaningful units (**tokens**) in a source program.
* The syntax analyzer works on these tokens to recognize meaningful structures (**sentences**) in our programming language.

### Parse Tree Abstraction

A **Parse Tree / Syntax Tree (语法树)** is a graphical representation of the structure of a program, where leaf nodes are Tokens.

### CFG （上下文无关文法）

A **Context-free Grammar (CFG)** is a **Type-2** Grammar rule, which serves the construction of a Parse Tree from a stream of Tokens. We use a set of Production Rules to characterize a CFG.

![](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZvPq1Ull1aBbvQCce3%2F-MZvRarzF4mg2I3Uq7KL%2Fimage.png?alt=media\&token=a5e91118-0fa1-4866-a47c-b7c8d0cfe671)

A **Terminal (终结符号)** is a Token; A **Non-terminal (非终结符号)** is a syntactic variable.

* The Start Symbol is the first one of Non-terminals; Usually represents the whole program
* A Sentence is a string of Terminals such that Start Symbol $$S \Rightarrow^{+} s$$

A **Production Rule (生成规则)** is a law of production, from a Non-terminal to a sequence of Terminals & Non-terminals.

* e.g. $$A \rightarrow \alpha A \mid \beta$$, where $$A$$ is a Non-terminal and $$\alpha, \beta$$ are Terminals
* May be *recursive*
* The procedure of applying these rules to derive a sentence of Terminals is called a **Derivation**; each intermediate string of symbols is a **Sentential Form**
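
For instance, the recursive rule above derives strings of the form $$\alpha^{n}\beta$$ (a small worked illustration):

```latex
A \Rightarrow \alpha A \Rightarrow \alpha\alpha A \Rightarrow \cdots
  \Rightarrow \alpha^{n} A \Rightarrow \alpha^{n}\beta
```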

{% hint style="info" %}
$$|$$ Context-free Languages $$| > |$$ Regular Languages $$|$$, e.g. $$\{(^{i})^{i} : i \geq 0\}$$.
{% endhint %}

### Derivation Directions（派生方向）& Ambiguity（二义性）

**Left-most Derivation（最左推导）** $$\left(\Rightarrow\_{lm}\right)$$ means to replace the leftmost Non-terminal at each step.

* If $$\beta A \gamma \Rightarrow\_{lm} \beta \delta \gamma$$, then NO Non-terminals in $$\beta$$
* Corresponds to *Top Down Parsing*

**Right-most Derivation（最右推导）** $$\left(\Rightarrow\_{rm}\right)$$ means to replace the rightmost Non-terminal at each step.

* If $$\beta A \gamma \Rightarrow\_{r m} \beta \delta \gamma$$, then NO Non-terminals in $$\gamma$$
* Corresponds to *Bottom Up Parsing*, in reversed manner

![](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZzDqL4qrwEDK0ZORlu%2F-MZzG4b-kgANk-Vstyqs%2Fimage.png?alt=media\&token=64750911-a19f-4b06-b2fb-b884ea6f7ff6)

A CFG is **Ambiguous** when it produces more than one Parse Tree for the same sentence. We must remove Ambiguity for a practical CFG, by:

* Enforce *Precedence (优先级)* and *Associativity (结合律)*
  * e.g. $$\* > +$$, then $$+$$ gets expanded first
* Grammar Rewritten

![](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZzDqL4qrwEDK0ZORlu%2F-MZzFc923KMiyxDum2ST%2Fimage.png?alt=media\&token=8c414f8a-14dd-42e7-8c86-105244ea0b37)
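
For example, the standard rewriting of the arithmetic-expression grammar encodes the precedence $$\* > +$$ and left associativity directly in the rules (a textbook form, not copied from the slides):

```
E -> E + T | T        (lowest precedence: +)
T -> T * F | F        (higher precedence: *)
F -> ( E ) | id       (atoms)
```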

### Top-Down Parsers

Construction of the parse tree starts at the root, and proceeds towards the leaves.

* Recursive Predictive Parsing
* Non-Recursive Predictive Parsing (**LL Parsing**) (**L**-left to right; **L**-leftmost derivation)
* Relatively weak in the class of grammars it can handle

#### Implement

1. Eliminate Left Recursion $$\to$$ Recursive-descent Parsing
2. Eliminate Left Recursion $$\to$$ Left Factoring $$\to$$ Recursive Predictive Parsing
3. Eliminate Left Recursion $$\to$$ Left Factoring $$\to$$ Construct Parsing Table $$\to$$ Non-recursive Predictive Parsing

#### Left Recursion Elimination (消除左递归)

$$A \Rightarrow^{+} A\alpha$$: Left Recursion

* Top Down Parsing **CANNOT** handle Left-recursive Grammars
* Can be eliminated by rewriting

For *Immediate* Left Recursions (Left Recursion that may appear in a single step), eliminate by:

<div align="left"><img src="https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZvUuLeF3i1evKnG0hj%2F-MZvVG9EhnihU_bPpd3e%2Fimage.png?alt=media&#x26;token=b81d927a-0e62-4a15-b824-46f92c885d6b" alt="立即左递归的消除"></div>

```c
/* Non-terminals arranged in order: A1, A2, ... An. */
void eliminate()
{
    for (i from 1 to n) {
        for (j from 1 to i-1)
            Replace Aj with its productions in every Production Rule Ai -> Aj ...;
        Eliminate Immediate Left Recursions Ai -> Ai ...;
    }
}
```

![左递归的消除](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZzDqL4qrwEDK0ZORlu%2F-MZzH1KO5UIk5U0j1bRY%2Fimage.png?alt=media\&token=f4567640-f402-4808-acb2-4a8727276b24)
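
Applied to the expression grammar, for instance (a standard worked example):

```
E -> E + T | T        becomes        E  -> T E'
                                     E' -> + T E' | eps
```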

#### Implementing Recursive-descent Parsing

```c
/* Example grammar:
 *   E -> T | T + E
 *   T -> int | int * T | ( E )
 */
bool term(TOKEN tok)  { return *ptr++ == tok; }
bool E1()             { return T(); }
bool E2()             { return T() && term(PLUS) && E(); }
bool E() {
    TOKEN *save = ptr;
    return (ptr = save, E1()) || (ptr = save, E2());
}
bool T1()             { return term(INT); }
bool T2()             { return term(INT) && term(TIMES) && T(); }
bool T3()             { return term(OPEN) && E() && term(CLOSE); }
bool T() {
    TOKEN *save = ptr;
    return (ptr = save, T1()) || (ptr = save, T2()) || (ptr = save, T3());
}
```

#### Left Factoring: Produce LL(1) Grammar

LL(1) means only 1 Token of look-ahead is enough to decide which Production Rule to expand now.

To convert a CFG to an LL(1) Grammar, left-factor each Non-terminal:

<div align="left"><img src="https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZvWvSTpmkArL3fxW06%2F-MZvWxiHwtfByfQVj4Kf%2Fimage.png?alt=media&#x26;token=60318a52-11d7-4351-87ba-a79fe015370a" alt=""></div>
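
Concretely, two rules sharing a common prefix are factored as follows (a standard worked example, not from the slides):

```
A -> a b1 | a b2      becomes       A  -> a A'
                                    A' -> b1 | b2

stmt -> if E then stmt else stmt | if E then stmt
                      becomes       stmt  -> if E then stmt stmt'
                                    stmt' -> else stmt | eps
```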

{% hint style="info" %}
$$|$$ LL(1) Grammars $$| < |$$ CFGs $$|$$, so not all Grammars can be converted to LL(1).

* Such a Grammar will have an entry with multiple Production Rules to use in the Parsing Table, and thus will be inappropriate for Predictive Parsing
{% endhint %}

#### Implementing Recursive Predictive Parsing

{% hint style="info" %}
It is strongly suggested to read <https://www.josehu.com/assets/file/compilers.pdf> for a better understanding of this part.
{% endhint %}

![](https://2670734141-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MYTnxRSQ6dvWqSwHSMR%2F-MZzHg0KoCL2HZ5rTACt%2F-MZzIAL51Xb-Tx5XjVz8%2Fimage.png?alt=media\&token=36a0a442-c08b-42c0-be81-0b239c012b94)

#### Parsing Table Construction

{% hint style="info" %}
It is strongly suggested to read <https://www.josehu.com/assets/file/compilers.pdf> for a better understanding of this part.
{% endhint %}

#### Implementing LL(1) Parsing

{% hint style="info" %}
It is strongly suggested to read <https://www.josehu.com/assets/file/compilers.pdf> for a better understanding of this part.
{% endhint %}

### Bottom-Up Parsers

Construction of the parse tree starts at the leaves, and proceeds towards the root.

* Bottom-up parsing is also known as **shift-reduce parsing**
* **LR Parsing** – a much more general form of shift-reduce parsing: **LR**, **SLR**, **LALR** (**L**-left to right; **R**-rightmost derivation)

{% hint style="info" %}
It is strongly suggested to read <https://www.josehu.com/assets/file/compilers.pdf> for a better understanding of this part.
{% endhint %}

## IR

## Thanks

* [Guanzhou HU's Notes](https://www.josehu.com/)
* TAs: 季杨彪, 杨易为, 尤存翰
