Syntax Directed Translation

1. Syntax-Directed Program Evaluation

Concept: A syntax-directed definition (SDD) augments a context-free grammar with attributes on its symbols and semantic rules on its productions.

Synthesized attributes flow up the parse tree (computed from children).
Inherited attributes flow down or across (computed from parent or left siblings).
To evaluate an SDD, we build an annotated parse tree and compute attributes in an order respecting the dependency graph (no cycles).

Key Points

Define SDD: grammar + attributes + rules.
Distinguish synthesized vs inherited.
Explain dependency graph and topological sort.
Show evaluation order (e.g. postorder for all synthesized).

Example: The simple desk-calculator SDD below evaluates expressions “E + T”, “T * F”, parentheses and digits. Each nonterminal has synthesized attribute val.

1) L → E n           L.val   = E.val  
2) E → E1 + T        E.val   = E1.val + T.val  
3) E → T             E.val   = T.val  
4) T → T1 * F        T.val   = T1.val * F.val  
5) T → F             T.val   = F.val  
6) F → ( E )         F.val   = E.val  
7) F → digit         F.val   = digit.lexval

Here, digit.lexval is provided by the lexer. Attributes are computed in a bottom-up (postorder) traversal of the parse tree .

2. Different Translation Schemes & Their Implementation

Concept: Two complementary notations for syntax-directed translation:

SDD (declarative): attributes & semantic rules on productions.
SDT (operational): grammar with semantic actions {…} embedded.

Classes & Implementation:

S-attributed → Postfix SDT: All attributes are synthesized; place actions at end of productions; implement during bottom-up (LR) parsing by executing actions on reduce.
L-attributed → Embedded SDT: Inherited attrs computed before the symbol, synthesized at end; implement during top-down (LL) parsing (recursive-descent or table-driven).

Example (Postfix SDT for Desk Calculator):

L → E n    { print(E.val); }
E → E1 + T { E.val = E1.val + T.val; }
E → T      { E.val = T.val; }
T → T1 * F { T.val = T1.val * F.val; }
T → F      { T.val = F.val; }
F → ( E )  { F.val = E.val; }
F → digit  { F.val = digit.lexval; }

Each action fires when its production is reduced .

3. Intermediate-Code Generation

Concept: An intermediate language provides a machine-independent code form that sits between source and machine code. Benefits: easier optimization, retargeting, simpler back-ends. Forms of Intermediate Code:

Syntax trees (high-level)
Three-address code (linear, 3 operands)
Quadruples, Triples (table representations)

Example (While-Statement SDD): String-valued attribute code generates labels and jumps:

S → while ( C ) S1
    { L1=new(); L2=new();
      C.true=L2; C.false=S.next;
      S1.next=L1;
      S.code = "label " L1  + C.code +
                "label " L2  + S1.code;
    }

C and S1 themselves generate C.code and S1.code similarly.

This SDD computes complete intermediate code for the loop, with new label generation e.g. L1, L2.

4. Syntax Trees

Concept: A syntax tree (abstract syntax tree) is a hierarchical, tree-structured representation of program constructs:

Interior nodes represent operators or syntactic constructs.
Leaf nodes represent operands (identifiers, constants).

Why Use Them: Preserve program structure, simplify optimizations and code generation, avoid heavy parse-tree detail (like parentheses).

Example (Building Syntax Tree for “a – 4 + c”):

Grammar with semantic actions:

E → E1 – T   { E.node = new Node(‘–’, E1.node, T.node); }
E → E1 + T   { E.node = new Node(‘+’, E1.node, T.node); }
E → T        { E.node = T.node; }
T → id       { T.node = new Leaf(id, id.entry); }
T → num      { T.node = new Leaf(num, num.val); }
T → ( E )    { T.node = E.node; }

Resulting Tree: interior “+” with left child “–” node (children: “a”, 4) and right child “c” .

5. Three-Address Code Generation

Concept: A three-address instruction has at most one operator and up to three addresses (operands/results). Typically of form

x = y op z    or    x = op y

where x, y, z are variables, constants, or temporaries.

Key Points

Define three-address code (TAC).
Show use of temporary variables (t1, t2, …).
Explain linear ordering; ease of generation.
Mention it’s convenient for common subexpression elimination, etc.

Example: Translate a = b + c * d into TAC:

t1 = c * d  
t2 = b + t1  
a  = t2

6. Quadruples

Concept: A quadruple represents each TAC instruction as a 4-field record:

( op, arg1, arg2, result )

Key Points

Field meanings.
Easy to implement; result field holds destination.
Good for code generation on stack machines or simple back-ends.

Example: For the TAC above:

( ’*’,  c,    d,   t1 )  
( ’+’,  b,    t1,  t2 )  
( ’=’,  t2,   –,   a  )

7. Triples

Concept: A triple omits the result field by implicitly using the instruction’s index as its result:

[0] (*, c, d)  
[1] (+, b,  T0)    ← T0 refers to result of instruction [0]  
[2] (=, T1, a)     ← T1 refers to result of [1]

Key Points

No dedicated result field; refer by instruction number.
Saves space but harder to reorder or optimize.
Useful for simple intermediate representations.

8. Translation of Assignment Statements

Concept: For a production

S → id = E

we:

Compute the address (or place) where E’s value ends up.
Emit code for E, leaving its result in some location (E.place).
Emit the final assignment to id.

Key Points

Attribute E.place holds the variable or temporary with E’s value.
S.code = E.code ‖ id = E.place.
Example shows concatenation of code fragments. Example SDT Fragment:

S → id = E {  
      S.code = E.code + "\n" + id.name + " = " + E.place;  
    }
E → E1 + T {  
      E.place = newTemp();  
      E1.code ‖ T.code ‖ 
      E.place + " = " + E1.place + " + " + T.place  
    }
…

This generates, for x = a + b, e.g.:

t1 = a  
t2 = b  
t3 = t1 + t2  
x  = t3