Syntax Directed Translation - CSU086 - Shoolini U

Syntax Directed Translation

1. Syntax-Directed Program Evaluation

Concept: A syntax-directed definition (SDD) augments a context-free grammar with attributes on its symbols and semantic rules on its productions.

Key Points

  1. Define SDD: grammar + attributes + rules.
  2. Distinguish synthesized vs inherited.
  3. Explain dependency graph and topological sort.
  4. Show evaluation order (e.g. postorder for all synthesized).

Example: The simple desk-calculator SDD below evaluates expressions “E + T”, “T * F”, parentheses and digits. Each nonterminal has synthesized attribute val.

1) L → E n           L.val   = E.val  
2) E → E1 + T        E.val   = E1.val + T.val  
3) E → T             E.val   = T.val  
4) T → T1 * F        T.val   = T1.val * F.val  
5) T → F             T.val   = F.val  
6) F → ( E )         F.val   = E.val  
7) F → digit         F.val   = digit.lexval  

Here, digit.lexval is provided by the lexer. Attributes are computed in a bottom-up (postorder) traversal of the parse tree .

2. Different Translation Schemes & Their Implementation

Concept: Two complementary notations for syntax-directed translation:

  1. SDD (declarative): attributes & semantic rules on productions.
  2. SDT (operational): grammar with semantic actions {…} embedded.

Classes & Implementation:

L → E n    { print(E.val); }
E → E1 + T { E.val = E1.val + T.val; }
E → T      { E.val = T.val; }
T → T1 * F { T.val = T1.val * F.val; }
T → F      { T.val = F.val; }
F → ( E )  { F.val = E.val; }
F → digit  { F.val = digit.lexval; }

Each action fires when its production is reduced .

3. Intermediate-Code Generation

Concept: An intermediate language provides a machine-independent code form that sits between source and machine code. Benefits: easier optimization, retargeting, simpler back-ends. Forms of Intermediate Code:

Example (While-Statement SDD): String-valued attribute code generates labels and jumps:

S → while ( C ) S1
    { L1=new(); L2=new();
      C.true=L2; C.false=S.next;
      S1.next=L1;
      S.code = "label " L1  + C.code +
                "label " L2  + S1.code;
    }

C and S1 themselves generate C.code and S1.code similarly.

This SDD computes complete intermediate code for the loop, with new label generation e.g. L1, L2.

4. Syntax Trees

Concept: A syntax tree (abstract syntax tree) is a hierarchical, tree-structured representation of program constructs:

Why Use Them: Preserve program structure, simplify optimizations and code generation, avoid heavy parse-tree detail (like parentheses).

Example (Building Syntax Tree for “a – 4 + c”):

Grammar with semantic actions:

E → E1 – T   { E.node = new Node(‘–’, E1.node, T.node); }
E → E1 + T   { E.node = new Node(‘+’, E1.node, T.node); }
E → T        { E.node = T.node; }
T → id       { T.node = new Leaf(id, id.entry); }
T → num      { T.node = new Leaf(num, num.val); }
T → ( E )    { T.node = E.node; }

Resulting Tree: interior “+” with left child “–” node (children: “a”, 4) and right child “c” .

5. Three-Address Code Generation

Concept: A three-address instruction has at most one operator and up to three addresses (operands/results). Typically of form

x = y op z    or    x = op y

where x, y, z are variables, constants, or temporaries.

Key Points

  1. Define three-address code (TAC).
  2. Show use of temporary variables (t1, t2, …).
  3. Explain linear ordering; ease of generation.
  4. Mention it’s convenient for common subexpression elimination, etc.

Example: Translate a = b + c * d into TAC:

t1 = c * d  
t2 = b + t1  
a  = t2

6. Quadruples

Concept: A quadruple represents each TAC instruction as a 4-field record:

( op, arg1, arg2, result )

Key Points

  1. Field meanings.
  2. Easy to implement; result field holds destination.
  3. Good for code generation on stack machines or simple back-ends.

Example: For the TAC above:

( ’*’,  c,    d,   t1 )  
( ’+’,  b,    t1,  t2 )  
( ’=’,  t2,   –,   a  )  

7. Triples

Concept: A triple omits the result field by implicitly using the instruction’s index as its result:

[0] (*, c, d)  
[1] (+, b,  T0)    ← T0 refers to result of instruction [0]  
[2] (=, T1, a)     ← T1 refers to result of [1]  

Key Points

  1. No dedicated result field; refer by instruction number.
  2. Saves space but harder to reorder or optimize.
  3. Useful for simple intermediate representations.

8. Translation of Assignment Statements

Concept: For a production

S → id = E

we:

  1. Compute the address (or place) where E’s value ends up.
  2. Emit code for E, leaving its result in some location (E.place).
  3. Emit the final assignment to id.

Key Points

  1. Attribute E.place holds the variable or temporary with E’s value.
  2. S.code = E.code ‖ id = E.place.
  3. Example shows concatenation of code fragments. Example SDT Fragment:
S → id = E {  
      S.code = E.code + "\n" + id.name + " = " + E.place;  
    }
E → E1 + T {  
      E.place = newTemp();  
      E1.code ‖ T.code ‖ 
      E.place + " = " + E1.place + " + " + T.place  
    }
…  

This generates, for x = a + b, e.g.:

t1 = a  
t2 = b  
t3 = t1 + t2  
x  = t3