form. a0, b0, c0 are initial values of the variables before block 1.
Figure 19.4: Conversion of a program to static single-assignment form. Node 7 is a postbody node, inserted to make sure there is only one loop edge (see Exercise 18.6); such nodes are not strictly necessary but are sometimes helpful.
Figure 19.5: Node 5 dominates all the nodes in the grey area. (a) Dominance frontier of node 5 includes the nodes (4, 5, 12, 13) that are targets of edges crossing from the region dominated by 5 (grey area including node 5) to the region not strictly dominated by 5 (white area including node 5). (b) Any node in the dominance frontier of n is also a point of convergence of nonintersecting paths, one from n and one from the root node. (c) Another example of converging paths P1,5 and P5,5.
Figure 19.8: A control-flow graph and trees derived from it. The numeric labels in part (b) are the dfnums of the nodes.
Figure 19.11: Path compression. (a) Ancestor links in a spanning tree; AncestorWithLowestSemi(v) traverses three links. (b) New nodes a2, a3 are linked into the tree. Now AncestorWithLowestSemi(w) would traverse 6 links. (c) AncestorWithLowestSemi(v) with path compression redirects ancestor links, but best[v] remembers the best intervening node on the compressed path between v and a1. (d) Now, after a2 and a3 are linked, AncestorWithLowestSemi(w) traverses only 4 links.
Figure 19.13: Conditional constant propagation.
Figure 19.14: This transformation does not preserve the dominance property of SSA form, and should be avoided.
Figure 19.15: Construction of the control-dependence graph.
Figure 19.16: Aggressive dead-code elimination.
Figure 19.18: Functional intermediate representation. Binding occurrences of variables are underlined.
Chapter 20: Pipelining and Scheduling
Figure 20.1: Functional unit requirements of instructions (on the MIPS R4000 processor).
This machine’s floating-point ADD instruction uses the instruction-fetch unit for one cycle; reads registers for one cycle; unpacks exponent and mantissa; then for the next cycle uses a shifter and an adder; then uses both the adder and a rounding unit; then the rounding unit and a shifter; then writes a result back to the register file. The MULT and CONV instructions use functional units in a different order.
Figure 20.2: If there is only one functional unit of each kind, then an ADD cannot be started at the same time as a MULT (because of numerous resource hazards shown in boldface); nor three cycles after the MULT (because of Add, Round, and Write hazards); nor four cycles later (because of Add and Round hazards). But if there were two adders and two rounding units, then an ADD could be started four cycles after a MULT. Or with dual fetch units, multiple-access register file, and dual unpackers, the MULT and ADD could be started simultaneously.
Figure 20.3: Data dependence. (Above) If the MULT produces a result that is an operand to ADD, the MULT must write its result to the register file before the ADD can read it. (Below) Special bypassing circuitry can route the result of MULT directly to the Shift and Add units, skipping the Write, Read, and Unpack stages.
Figure 20.7: Pipelined schedule. Assignments in each row happen simultaneously; each right-hand side refers to the value before the assignment. The loop exit test i < N + 1 has been "moved past" three increments of i, so appears as i < N - 2.
Figure 20.8: Pipelined schedule, with move instructions.
Figure 20.10: Iterative modulo scheduling applied to Program 20.4b. Graph 20.5a is the data-dependence graph; min = 5 (see page 451); H = [c, d, e, a, b, f, j, g, h].
Figure 20.11: Dependence of ADD's instruction-fetch on result of BRANCH.