To buy click here |
Overview of Compiler:
Source Program( Front End(Lexical Analysis(Token Stream), Syntax Analysis(Abstract Syntax
Tree),Semantic Analysis (Intermediate Code)) ),Backend(Code Generation(Target Program).
Input Source Program | Front End | Intermediate Code Generation | Code Generation | Output Program |
Requirements:)
• output code must be correct
• output code must be of high quality
• code generator should run efficiently
Design of code generator: There are some issues that have to tackle.
Input: Intermediate representation with the symbol table having assumed that input has been validated by the front end.
Target programs :
- Absolute machine language fast for small programs
- Relocatable machine code requires linker and loader
- Assembly code requires assembler, linker, and loader
Instruction selection: Uniformity, Completeness and Instruction speed, power consumption.
a=b+c
d=a+e
MOV b, R0
Add c, R0
MOV R0, a
MOV a, R0 (can be eliminated)
Add e, R0
MOV R0, d
a=a+1
Inc a
MOV a, R0
Add #1, R0
MOV R0,a
Register allocation:
- Instructions with register operands are faster
- Store long lifetime and counters in registers
- Temporary locations
- Even odd register pairs
Example Of Target Machine:
• Byte addressable with 4 bytes per word
• n registers R0, R1, ..., Rn-l
• Two address instructions of the form opcode source, destination
• Usual opcodes like move, add, sub, etc.
• Addressing modes are :
MODE | FORM | ADDRESS |
Absolute | M | M |
Resister | R | R |
Index | c(R) | c+content(R) |
Indirect Register | *R | content(R) |
Indirect Index | *c(R) | content(c+content(R)) |
Literal | #c | c |
Flow Graph:)
- Graph representation of three address code.
- Useful for understanding code generation(and for optimization)
- Nodes represent computation
- Edges represent the flow of control
Basic blocks
• (maximum) sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end.
Algorithm to identify basic blocks :
• determine leader
– the first statement is a leader
– any target of a goto statement is a leader
– any statement that follows a goto statement is a leader
• For each leader its basic block consists of the leader and all statements up to the next leader 8
• add control flow information to basic blocks
• nodes are the basic blocks
• there is a directed edge from B1 to B2 if B2 can follow B1 in some execution sequence
– there is a jump from the last statement of B1 to the first statement of B2
– B2 follows B1 in the natural order of execution
• initial node: block with the first statement as a leader
Next use information:
• for register and temporary allocation
• remove variables from registers if not used
• statement X = Y op Z defines X and uses Y and Z
• scan each basic blocks backward
• assume all temporaries are dead on exit and all user variables are live on exit
Suppose we are scanning:
i : X:= Y op Z in the backward scan
1. attach to the statement i, the information in the symbol table about X, Y, Z
2. set X to “not live” and “no next use” in the symbol table.
3. set Y and Z to be “live” and next use as i in the symbol table
Example:)
1: t1 = a * a
2: t2 = a * b
3: t3 = 2 * t2
4: t4 = t1 + t3
5: t5 = b * b
6: t6 = t4 + t5
7: X = t6
7: no temporary is live
6: t6: use(7), t4 t5 not live
5: t5: use(6)
4: t4: use(6), t1 t3 not live
3: t3:use(4), t2 not live
2: t2:use(3)
1: t1:use(4)
Code Generator:
• consider each statement, remember if operands are in registers• Register descriptor:
– Keep track of what is currently in each register.
– Initially, all the registers are empty
• Address descriptor:
– Keep track of location where the current value of the name can be found at runtime
– The location might be a register, stack, memory address or a set of those
for each X = Y op Z do
• Invoke a function getreg to determine location L where X must be stored. Usually L is a register.
• Consult address descriptor of Y to determine Y'. Prefer a register for Y'. If value of Y not already in L generate
MOV Y', L
Generate op Z', L Again prefer a register for Z. Update address descriptor of X to indicate X is in L.
• If L is a register, update its descriptor to indicate that it contains X and remove X from all other register descriptors.
• If current value of Y and/or Z have no next use and are dead on exit from block and are in registers, change register descriptor to indicate that they no longer contain Y and/or Z.
Function getreg:
1. If Y is in register (that holds no other values) and Y is not live and has no next use after X = Y op Z then return register of Y for L.
2. Failing (1) return an empty register
3. Failing (2) if X has a next use in the block or op requires register then get a register R,store its content into M (by Mov R, M) and use it.
4. else select memory location X as L