首页 > 美文阅读

Y-branches When you come to a fork in the road, take it

更新时间:2023-05-22 03:53:43 阅读：评论：0

脸肿了怎么回事

Y-Branches:When You Come to a Fork in the Road,Take It

Nicholas Wang Michael Fertig Sanjay Patel

Center for Reliable and High-Performance Computing

Department of Electrical and Computer Engineering

University of Illinois at Urbana-Champaign

nwang,fertig,sjp@crhc.uiuc.edu

Abstract

In this paper,we study the effects of manipulating the

architected direction of conditional branches.Through the

u of statistical sampling,weﬁnd that about40%of all dy-

namic branches and about50%of mispredicted branches

do not affect correct program behavior when forced down

the incorrect path.We call such branches Y-branches.

To further examine this unexpected phenomenon,we

巴金随想录provide a characterization of the coding constructs that

give ri to such branches.Examples of such coding con-

structs include short-circuits and ineffectual loop itera-

tions.We provide a statistical breakdown of the frequency

of the branches and their constructs.Finally,we suggest

some techniques for exploiting this behavior,particularly

when it results from short-circuit constructs.

1.Introduction

Control speculation enables high-performance pro-

cessing for control intensive applications.Ultimately,

though,predictability becomes a limiting factor in improv-

ing processor performance:at some level,improving the

performance of a single thread of execution can be tied to

improving the processor’s ability to accurately predict and

rapidly evaluate branches.

In this paper,we investigate a surprising property of

certain dynamic branches.Using fault injection,weﬁnd

that about40%of all dynamic conditional branches are

outcome-tolerant.That is,the behavior of the application

is unaffected by whether the particular branch instance is

taken or not.Dynamic branches that are mispredicted ex-

hibit a larger degree of outcome-tolerance—about half of

all mispredicted branches can proceed down an incorrect

architectural path and result in correct execution.

2.Methodology

In order to characterize Y-branches,a method for dis-covering them is necessary.Toﬁnd all Y-branches for our input ts would require testing all instances of every static branch executed in each benchmark.Since it is computa-tionally expensive to perform this complete analysis,we u statistical sampling.This ction describes our experi-mental tup as well as the applications and infrastructure ud to perform the identiﬁcation process.

2.1.Performing Injection

For each experiment,we randomly sample1000con-ditional branches1.To determine whether a branch is outcome-tolerant,weﬁrst choo a single conditional branch instance.Then,when the chon branch is encoun-tered at runtime,the execution of the program is forced down the alternate path.That is,if the conditional branch was resolved to be taken,then the execution of the pro-gram is forced down the fall-through path and vice versa. Exactly one injection is made per benchmark execution,so the effects of each of the injections are isolated from one another.Each of the benchmark executions constitute a single trial,and each experiment is compod of1000 trials for each benchmark.Only conditional branches are targeted in our study—all other control instructions are ex-ecuted as deﬁned by architectural state.

After each injection is performed,the execution of the program is monitored.We call the execution of the manipulated program the injected simulation.To determine the effects of the injection,we compare the injected simu-lation with a reference simulation of the program that is generated without injection.All of architectural state(PC, registers,and memory)is frequently compared against the reference simulation to identify if a reconvergence point exists.The discovery of this point guarantees identical ex-ecution thereafter as well as conﬁrming the chon branch is outcome-tolerant.

Most applications do not execute in isolation,so it is important that all external communication is equi

可爱的孩子valent with that of the reference simulation.The executed applica-tions perform all external communication via system calls. While system calls contain aﬁnite t of inputs,we treat them as barriers for veriﬁcation.That is,the injected pro-gram must reconverge to the original program’s architected state prior to the execution of any system call.This re-striction reduces reconvergence likelihood,however,it also reduces simulation time and complexity.

Benchmark Dynamic Static Footprint

Count System Calls(instructions) bzip228.4M8543

625M1082318

eon10.4M49268

501M25252397

gcc33.6M164167

869M32828

mcf53.9M8198

513M10702390 perlbmk11.8M34472

595M4032896 vortex24.0M82037

534M1022116

Table1.Benchmark statistics.

现在什么生意好做ject out of the t of all dynamic branches.The random

lection is done before the trial begins by generating a

pudo-random number between one and the number of dy-

namic conditional branches in the particular execution of a

benchmark.

Outcome-tolerant mispredicted branches are deter-

mined by restricting the t of dynamic branches to tho

that are mispredicted by an18-bit gshare conditional

branch predictor[9].Again,the random determination is

done before the trial begins.

Figure1plots the outcome-tolerance rates of the three

experiments.Averaged across all benchmarks,approxi-

mately30%of the instances of an individual static branch

are outcome-tolerant2.Approximately40%of all dynamic电热水器怎么选

branches are outcome-tolerant,indicating that dynamically

frequent branches tend to exhibit more tolerance than infre-

quent branches.Finally,50%of dynamically mispredicted

branches exhibit outcome-tolerance,indicating that half of

all mispredicted conditional branches might be resilient to

being mispredicted.

The signiﬁcant ri in Y-branches in the ca of

mispredicted branches is particularly pronounced for the

benchmarks bzip2and gzip.The benchmark bzip2contains

an unrolled loop that accounts for a small portion of the

dynamically executed instructions.However,the loop end-

ing branches of the unrolled iterations also account for a

large portion of mispredictions.The branches can safely

perform early exits since the clean-up loop will perform

the remaining loop operations.Similarly,the increa in

outcome-tolerance in the benchmark gzip is due to three

branches that also account for a small portion of the pro-

gram execution,but reprent approximately half of all

mispredictions and are highly outcome-tolerant.The loop

structions.Eventually,if the injected branch is a Y-branch,the two streams converge and execute the same instructions for the remainder of the program—we term this control re-convergence .The span in instructions between the point of injection and the control reconvergence point in the in-jected execution is what we call the control reconvergence latency,which is labeled in the ﬁgure.This same pa-rameter with respect to the reference execution is labeled .Zero or more instructions later,if the injected branch is a Y-branch,both executions arrive at the same architectural state,or full state reconvergence .The full state reconver-gence latency is labeled in the ﬁgure.

There is actually another interesting point:live state reconvergence ,that is some of the full state of the running process is actually dead and therefore not required to be equivalent between the two executions.Live state recon-vergence will always occur at or before the two executions reach full stat

e reconvergence.In addition,live state recon-vergence is possible in the abnce of full state reconver-gence.Thus,the t of branches that are outcome-tolerant with respect to live state is a supert of the t of branches that are outcome-tolerant with respect to complete state.Unfortunately,live state reconvergence is difﬁcult and ex-pensive to measure precily,so we do not investigate it here.

Injected Execution

Reference Execution Figure 2.Comparison of reference versus in-jected executions.

We provide data on the span (in instructions)between the injection of a Y-branch and the subquent control re-convergence (with respect to the injected simulation),in Figure 3.Becau the distributions of the data are very wide,we have only plotted the middle 80%of the ,we exclude the upper and lower 10th percentiles.For example,80%of Y-branches in bzip2tend to control converge b

etween approximately 10to 100instructions.The median Y-branch in bzip2reconverges on control in 41instructions,as indicated by the diamond along the vertical line.In our graphs with medians,the average bar repre-nts an averaging across all other percentile bars.With this in mind,the average benchmark reaches control recon-vergence in approximately 18instructions.

Eon differs slightly from the other benchmarks in that many of its Y-branches reconverge on control immediately

after the injection.The reason for this is that about a third of the Y-branches are from a single loop-controlling branch.Injecting this branch usually results in an early loop exit which then in turn often results in immediate control reconvergence.Despite the short control reconver-gence latency,eon ’s full state reconvergence latency is rel-atively slow.We believe this occurs becau the early loop exits skip a number of dead stores,where the dead values they would have written have a relatively long

lifetime.银行风险

b z i p

c r a f t y

e o

唐狡g a

g c c

g z i p

c f

p a

r s e r

p e

r l b m k

t w o l

v o r t e

x v p r a v e r a g

Figure 3.Control reconvergence latency in instructions.

Figure 4plots the full state reconvergence spans in instructions for the middle 80%of Y-branches for every benchmark.Notice that full state reconvergence spans are much wider ranges than for control reconvergence,ranging from 1instruction for a Y-branch in perlbmk to over 3M instructions on the benchmark vpr .Across all benchmarks,the median Y-branch reaches full state reconvergence after approximately 1300

instructions.

110100100010000

100000100000010000000b z i p 2

c r a f t y

e o n

g a p

g c c

g z i p

c f

p a r s e r p e r l b m k

t w o l

v o r t e x

v p r

a v e r a g e

关于对联的故事

Figure 4.Full state reconvergence latency in instructions.

Figure 5prents the difference in the number of in-structions executed between the two ,the value of in Figure 2.Becau of the nature of the branch being injected,sometimes the injected execution runs for fewer instructions than the reference execution,

sometimes longer (we explain this in Section 3.3).The ranges are provided in this ﬁgure (positive numbers indi-cate that the injected execution ran for fewer instructions).In the median ca for the average benchmark,the injected execution ran for nearly the same number of instructions as the reference trace.Somewhat interesting is the fact that in roughly half the cas,the injected execution required fewer instructions than the reference execution to arrive at the

of -200

-160-120-80-40040801201602003.3.Sources of Outcome Tolerance

Outcome-tolerant branches are ultimately due to re-dundancies introduced by the programmer or compiler.The redundancies occur in various forms,such as super-ﬂuous loop iterations,performance-related branches,short-circuits,and the occasional irrelevant decision.Most of the Y-branches are a result of partially dead control,meaning that the branches are outcome-tolerant subject to some but not necessarily all initial states.In this ction,we perform a careful disction of a sampling of outcome-tolerant branches in order to ﬁnd the underlying reasons behind this phenomena.

What we desire is an ability to examine the control ﬂow graph and possibly the source code of a detected Y-branch in order to characterize the coding constructs that give ri to its outcome-tolerance.To do this automatically,we examine the differences in execution between the ref-erence execution and the injected execution.That is,re-ferring to Figure 2,we examine the differences in PCs ex-ecuted between the shaded regions of each execution,up until the point of control reconvergence.

Two basic situations ari when we examine the differ-ence between the injected and reference exe

cutions,as rep-rented in Figure 6.The leftmost subﬁgure reprents the

Disjoint ca.Here,the injected execution and the refer-ence execution execute completely different static instruc-tions.This

situation is primarily due to

the Y-branch being an if-el construct,where the decision made by the particular dynamic instance of the branch is irrelevant (we provide a speciﬁc example of this at the end of this ction).Sometimes the reference execution executes more instruc-tions;sometimes the injected execution does.

Execution A

Execution B

Execution A

Execution B

Disjoint Ca Skip Ca

Figure 6.General structure for difference analysis.

The other basic situation is reprented in the right subﬁgure.Here,one of the executions skips over instruc-tions executed by the other starting immediately after the injection point.That is,Execution A might execute blocks X,Y ,Z

before hitting the control reconvergence point whereas Execution B ﬂows directly to the reconvergence point immediately after injection.For example,if the injec-tion targets a Y-branch that is a loop-ending branch causing the loop to either terminate early or execute additional iter-ations,then the Y-branch is of the Skip category.

X;}Figure 7.The branch at the end of block A is potentially a Y-branch of the Skip variety.Skip-type Y-branches can be further classiﬁed bad on whether loops are involved.If looping is not involved (i.e.,none of the branches in the shaded region are back-wards branches),the code construct responsible for the outcome-tolerance is most probably of the short-circuit va-riety.For example,Figure 7demonstrates the control ﬂow that results from a simple short-circuit expression in C.The branch at the end of block A is an early jump to block X

10%20%30%40%

50%

60%

70%

80%b z i p 2

c r a f t y

e o n

g a p

g c c

g z i p

c f

p a r s e r p e r l b m k

t w o l f

v o r t e x

v p r

a v e r a g e

[Other] Unclassified

[Skip] Complex

Figure 8.Categorization of Y-branches for mispredicted branches.

if it is determined that A is true.This branch is potentially a Y-branch of the Skip variety when conditio

n B is true and B has no side-effects during evaluation.We call Y-branches of this category Skip/Simple branches.Note that some nested if constructs that are not short-circuits also fall into this category.

If the Skip-type Y-branch involves a backwards branch,then we classify the branch as either a Skip/Loop or Skip/Complex branch.Skip/Loop branches are cas where we are conﬁdent that the Y-branch is a loop-terminal conditional branch,whereas the Skip/Complex situation is indeterminate:a backwards branch is encountered,so a loop potentially exists in the extra instructions executed.Unfortunately,it is not clear from our analysis whether the Y-branch itlf was the loop-terminal branch.

But not all cas are so clean as the categorizations of Skip/Simple,Skip/Loop,Skip/Complex,and Disjoint.Occasionally,the region between the injection and full state reconvergence is too large for our tools to analyze.For cas where this region is over 1M instructions,we categorize it as Trace Too Big .Furthermore,cas ari where the Y-branch injection has lingering effects on pro-gram state,causing control to randomly waver before ﬁ-nally converging.When this occurs,we ver the traces at the ﬁrst common instruction and attempt to classify it as described above.The cas are called Skip/Loop +Slop ,Skip/Simple +Slop ,etc.In certain cas it is difﬁcult to ﬁnd a point at which t

o truncate so we give up (Unclassiﬁed ).

Figure 8provides a detailed classiﬁcation of Y-branches that ari from mispredicted branches.From this ﬁgure,it is possible to identify the approximate con-tribution to Y-branches of each coding construct.On average,the contributions to the various coding con-structs is about equal bad on our categorization.Ap-proximately 10%of mispredicted branches are outcome-tolerant and due to short-circuit like code constructs

(Skip/Simple and Skip/Simple+Slop).Approximately 20%of mispredictions are Y-branches that involve loops (Skip/Loop,Skip/Complex,...).Approximately 15%of them involve if-el -type constructs (Skip/Disjoint,Skip/Disjoint+Slop).

3.4.Speciﬁc examples

To further illustrate the nature of the outcome-tolerance phenomenon,we examine speciﬁc examples,taken from SPEC,of three coding constructs that lead to instances of outcome-tolerance.

3.4.1.Short-circuit example.An example of a short-circuit code construct from gzip is shown in Figure 9.The if statement is within a do-while loop,and it tests four conditions.If any of them evaluate to true,

the rest of the current iteration of the loop is skipped by the continue statement.In the binary,the compiler implements this if statement as four conditional branches,each of which tests one of the conditions and,if true,branches to the bottom of the loop.We obrve that the ﬁrst branch accounts for about 15%of all mispredictions in gzip and about 90%of the are outcome-tolerant.In general,the full state and control reconvergence latencies tend to be short for short-circuit like code constructs,and the difference in numbers of instructions executed tends to be small.

3.4.2.Early loop exit examples.Bzip2is a ﬁle compres-sion program that employs move-to-front coding after ap-plying a transformation on the data to be compresd [4].When an element is ud during encoding or decoding,it is moved to the front of a list,shifting all previous ele-ments down by one entry.The shifting of elements is im-plemented in a tight loop,which is unrolled four times in the source code.The source code is shown in Figure 10.

本文发布于:2023-05-22 03:53:43，感谢您对本站的认可！

本文链接：https://www.wtabcd.cn/fanwen/fan/82/727472.html

上一篇：最新部编版《去年的树》教学设计小学语文《去年的树》教学设计(五篇)

下一篇：2023年减少丢弃及重新利用教学反思(4篇)

标签：随想录孩子风险故事巴金可爱

留言与评论（共有 0 条评论）