Parallelism and Concurrency


A Sophomoric∗ Introduction to
Shared-Memory Parallelism and Concurrency
Dan Grossman
Version of February 23, 2012

∗ As in intended for second-year students, not as in immaturely pretentious
Contents
1 Meta-Introduction: An Instructor's View of These Notes
  1.1 Where This Material Fits in a Changing Curriculum
  1.2 Six Theses On A Successful Approach to this Material
  1.3 How to Use These Notes—And Improve Them
  1.4 Acknowledgments
2 Introduction
  2.1 More Than One Thing At Once
  2.2 Parallelism vs. Concurrency
  2.3 Basic Threads and Shared Memory
  2.4 Other Models
3 Basic Fork-Join Parallelism
  3.1 A Simple Example: Okay Idea, Inferior Style
  3.2 Why Not To Use One Thread Per Processor
  3.3 Divide-And-Conquer Parallelism
  3.4 The Java ForkJoin Framework
  3.5 Reductions and Maps
  3.6 Data Structures Besides Arrays
4 Analyzing Fork-Join Algorithms
  4.1 Work and Span
  4.2 Amdahl's Law
  4.3 Comparing Amdahl's Law and Moore's Law
5 Fancier Fork-Join Algorithms: Prefix, Pack, Sort
  5.1 Parallel-Prefix Sum
  5.2 Pack
  5.3 Parallel Quicksort
  5.4 Parallel Mergesort
6 Basic Shared-Memory Concurrency
  6.1 The Programming Model
  6.2 The Need for Synchronization
  6.3 Locks
  6.4 Locks in Java
7 Race Conditions: Bad Interleavings and Data Races
  7.1 Bad Interleavings: An Example with Stacks
  7.2 Data Races: Wrong Even When They Look Right
8 Concurrency Programming Guidelines
  8.1 Conceptually Splitting Memory in Three Parts
  8.2 Approaches to Synchronization
9 Deadlock
10 Additional Synchronization Primitives
  10.1 Reader/Writer Locks
  10.2 Condition Variables
  10.3 Other
1 Meta-Introduction: An Instructor's View of These Notes
1.1 Where This Material Fits in a Changing Curriculum
These notes teach parallelism and concurrency as part of an advanced, sophomore-level data-structures course – the course that covers asymptotic complexity, balanced trees, hash tables, graph algorithms, sorting, etc.

Why parallelism and concurrency should be taught early:

Parallelism and concurrency are increasingly important topics in computer science and engineering. Traditionally, most undergraduates learned rather little about these topics and did so rather late in the curriculum: Senior-level operating-systems courses cover threads, scheduling, and synchronization. Early hardware courses have circuits and functional units with parallel parts. Advanced architecture courses discuss cache coherence. Electives might cover parallel algorithms or use distributed computing to solve embarrassingly parallel tasks.

Little of this scattered material emphasizes the essential concepts of parallelism and concurrency—and certainly not in a central place such that subsequent courses can rely on it. These days, most desktop and laptop computers have multiple cores. Modern high-level languages have threads built into them and standard libraries use threads (e.g., Java's Swing library for GUIs). It no longer seems reasonable to bring up threads "as needed" or delay them until an operating-systems course that should be focusing on operating systems. There is no reason to introduce threads first in C or assembly when all the usual conveniences of high-level languages for introducing core concepts apply.
Why parallelism and concurrency should not be taught too early:
Conversely, it is tempting to introduce threads "from day one" in introductory programming courses before students learn "sequential habits." I suspect this approach is infeasible in most curricula. For example, it may make little sense in programs where most students in introductory courses do not end up majoring in computer science. "Messing with intro" is a high-risk endeavor, and introductory courses are already packed with essential concepts like variables, functions, arrays, linked structures, etc. There is probably no room.
So put it in the data-structures course:
There may be multiple natural places to introduce parallelism and concurrency, but I claim "sophomore-level" data structures (after CS2 and discrete math, but before "senior-level" algorithms) works very well. Here are some reasons:
• There is room: We made room by removing three weeks on skew heaps, leftist heaps, binomial queues, splay trees, disjoint-set, and network flow. Some of the trimming was painful and will be compensated for in senior-level algorithms, but all this material seems relatively less important. There was still plenty of room for essential data structures and related concepts such as asymptotic analysis, (simple) amortization, graphs, and sorting.
• Fork-join parallel algorithms are amenable to asymptotic analysis in terms of "work" and "span" over dags—all concepts that fit very naturally in the course. Amdahl's Law is fundamentally an asymptotic argument too.
• Ideas from sequential algorithms already in the course reappear with parallelism. For example, just as constant factors compel efficient quicksort implementations to switch to an O(n²) sort for small n, constant factors compel efficient parallel algorithms to switch to sequential algorithms for small problem sizes. Parallel sorting algorithms are also good examples of non-trivial parallelization.
• Many canonical examples for concurrency involve basic data structures: bounded buffers for condition variables, dictionaries for reader/writer locks, parallel unstructured graph traversals, etc. Making a data structure "thread-safe" is an ideal way to think about what it means to be "thread-safe."
• We already used Java in the course. Java 7's ForkJoin framework is excellent for teaching fork-join parallelism. Java's built-in support for threads, locks, and condition variables is sufficient for teaching concurrency.
On the other hand, instructors wishing to introduce message passing or distributed computation will have to consider whether it makes sense in this course. I focus on shared memory, only mentioning other models. I do not claim shared memory is "better" (that's an endless argument on which I have no firm opinion), only that it is an important model and a good one to start with pedagogically. I also do not emphasize asynchrony and masking I/O latency. Such topics are probably better covered in a course on systems programming, though one could add them to these notes.
1.2 Six Theses On A Successful Approach to this Material
In summary, these notes rest on several theses for how to teach this material:
1. Integrate it into a data-structures course.
2. Distinguish parallelism (using extra computational units to do more work per unit time) from concurrency (managing access to shared resources). Teach parallelism first because it is easier and helps establish a non-sequential mindset.
3. Teach in a high-level language, using a library for fork-join parallelism. Teach how to use parallelism, threads, locks, etc. Do not teach how to implement them.
4. Conversely, do not teach in terms of higher-order parallel patterns like maps and reduces. Mention these, but have students actually do the divide-and-conquer underlying these patterns.
5. Assume shared memory since one programming model is hard enough.
6. Given the limited time and student background, do not focus on memory-hierarchy issues (e.g., caching), much like these issues are mentioned (e.g., with B-trees) but rarely central in data-structures courses. (Adding a discussion should prove straightforward.)
If you strongly disagree with these theses, you will probably not like these notes—but you might still try them.
1.3 How to Use These Notes—And Improve Them
These notes were originally written for CSE332 at the University of Washington (www.cs.washington.edu/education/courses/cse332). They account for 3 weeks of a required 10-week course (the University uses a quarter system). Alongside the notes are PowerPoint slides, homework assignments, and a
programming project. In fact, these notes were the last aspect to be written—the first edition of the course went great without them, and students reported parallelism to be their favorite aspect of the course.
Surely these notes have errors and explanations could be improved. Please let me know of any problems you find. I am a perfectionist: if you find a typo I would like to know. Ideally you would first check that the most recent version of the notes does not already have the problem fixed.
As for permission, do with these notes whatever you like. Seriously: "steal these notes." The LaTeX sources are available so you can modify them however you like. I would like to know if you are using these notes and how. It motivates me to improve them and, frankly, it's not bad for my ego. Constructive criticism is also welcome. That said, I don't expect any thoughtful instructor to agree completely with me on what to cover and how to cover it.
Contact me at the email djg and then the at-sign and then cs.washington.edu. The current home for these notes and related materials is www.cs.washington.edu/homes/djg/teachingMaterials. Students are more than welcome to contact me: who better to let me know where these notes could be improved.
1.4 Acknowledgments
I deserve no credit for the material in these notes. If anything, my role was simply to distill decades of wisdom from others down to three weeks of core concepts and integrate the result into a data-structures course. When in doubt, I stuck with the simplest and most basic topics and examples.
I was particularly influenced by Guy Blelloch and Charles Leiserson in terms of teaching parallelism before concurrency and emphasizing divide-and-conquer algorithms that do not consider the number of processors. Doug Lea and other developers of Java's ForkJoin framework provided a wonderful library that, with some hand-holding, is usable by sophomores. Larry Snyder was also an excellent resource for parallel algorithms.
The treatment of shared-memory synchronization is heavily influenced by decades of operating-systems courses, but with the distinction of ignoring all issues of scheduling and synchronization implementation. Moreover, the emphasis on the need to avoid data races in high-level languages is frustratingly under-appreciated despite the noble work of memory-model experts such as Sarita Adve, Hans Boehm, and Bill Pugh.
Feedback from Ruth Anderson, Kim Bruce, Kristian Lieberg, Tyler Robison, Cody Schroeder, and Martin Tompa helped improve explanations and remove typos. Tyler and Martin deserve particular mention for using these notes when they were very new.
I have had enlightening and enjoyable discussions on "how to teach this stuff" with too many researchers and educators over the last few years to list them all, but I am grateful.
This work was funded in part via grants from the National Science Foundation and generous support, financial and otherwise, from Intel.
2 Introduction
2.1 More Than One Thing At Once
In sequential programming, one thing happens at a time. Sequential programming is what most people learn first and how most programs are written. Probably every program you have written in Java is sequential: Execution starts at the beginning of main and proceeds one assignment/call/return/arithmetic operation at a time.
Removing the one-thing-at-a-time assumption complicates writing software. The multiple threads of execution (things performing computations) will somehow need to coordinate so that they can work together to complete a task—or at least not get in each other's way while they are doing separate things. These notes cover basic concepts related to multithreaded programming, i.e., programs where there are multiple threads of execution. We will cover:
• How to create multiple threads (a minimal sketch follows this list)
• How to write and analyze divide-and-conquer algorithms that use threads to produce results more quickly
• How to coordinate access to shared objects so that multiple threads using the same data do not produce the wrong answer
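As a preview of the first item, here is a minimal sketch, not taken verbatim from these notes, of creating and starting two threads in Java by subclassing java.lang.Thread; the class names ExampleThread and CreateThreads are illustrative. The key points are that each thread runs the code in its run method and that calling start (not run) is what creates a new thread of execution.

    class ExampleThread extends Thread {
        private final int i;
        ExampleThread(int i) { this.i = i; }
        @Override
        public void run() { // the code the new thread executes
            System.out.println("thread " + i + " says hi");
        }
    }

    class CreateThreads {
        public static void main(String[] args) throws InterruptedException {
            ExampleThread t1 = new ExampleThread(1);
            ExampleThread t2 = new ExampleThread(2);
            t1.start(); // start creates a new thread that calls run
            t2.start(); // calling run directly would NOT create a thread
            t1.join();  // wait for each thread to finish
            t2.join();
        }
    }

The two print statements may appear in either order: once both threads have started, the scheduler is free to interleave them however it likes.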
A useful analogy is with cooking. A sequential program is like having one cook who does each step of a recipe in order, finishing one step before starting the next. Often there are multiple steps that could be done at the same time—if you had more cooks. But having more cooks requires extra coordination. One cook may have to wait for another cook to finish something. And there are limited resources: If you have only one oven, two cooks won't be able to bake casseroles at different temperatures at the same time. In short, multiple cooks present efficiency opportunities, but also significantly complicate the process of producing a meal.
Because multithreaded programming is so much more difficult, it is best to avoid it if you can. For most of computing's history, most programmers wrote only sequential programs. Notable exceptions were:
• Programmers writing programs to solve such computationally large problems that it would take years or centuries for one computer to finish. So they would use multiple computers together.
• Programmers writing systems like an operating system where a key point of the system is to handle multiple things happening at once. For example, you can have more than one program running at a time. If you have only one processor, only one program can actually run at a time, but the operating system still uses threads to keep track of all the running programs and let them take turns. If the taking turns happens fast enough (e.g., 10 milliseconds), humans fall for the illusion of simultaneous execution. This is called time-slicing.
Sequential programmers were lucky: since every 2 years or so computers got roughly twice as fast, most programs would get exponentially faster over time without any extra effort.
Around 2005, computers stopped getting twice as fast every 2 years. To understand why requires a course in computer architecture. In brief, increasing the clock rate (very roughly and technically inaccurately speaking, how quickly instructions execute) became infeasible without generating too much heat. Also, the relative cost of memory access can become too high for faster processors to help.
Nonetheless, chip manufacturers still plan to make exponentially more powerful chips. Instead of one processor running faster, they will have more processors. The next computer you buy will likely have 4 processors (also called cores) on the same chip, and the number of available cores will likely double every few years.
What would 256 cores be good for? Well, you can run multiple programs at once—for real, not just with time-slicing. But for an individual program to run any faster than with one core, it will need to do more than one thing at once. This is the reason that multithreaded programming is becoming more important. To be clear, multithreaded programming is not new. It has existed for decades and all the key concepts these notes cover are just as old. Before there were multiple cores on one chip, you could use multiple chips and/or use time-slicing on one chip—and both remain important techniques today. The move to multiple cores on one chip is "just" having the effect of making multithreading something that more and more software wants to do.
2.2 Parallelism vs. Concurrency
These notes are organized around a fundamental distinction between parallelism and concurrency. Unfortunately, the way we define these terms is not entirely standard, so you should not assume that everyone uses these terms as we will. Nonetheless, most computer scientists agree that this distinction is important.
Parallel programming is about using additional computational resources to produce an answer faster.
As a canonical example, consider the trivial problem of summing up all the numbers in an array. We know no sequential algorithm can do better than Θ(n) time. Suppose instead we had 4 processors. Then hopefully we could produce the result roughly 4 times faster by having each processor add 1/4 of the elements and then we could just add the 4 partial results together with 3 more additions. Θ(n/4) is still Θ(n), but constant factors can matter. Moreover, when designing and analyzing a parallel algorithm, we should leave the number of processors as a variable, call it P. Perhaps we can sum the elements of an array in time O(n/P) given P processors. As we will see, in fact the best bound under the assumptions we will make is O(log n + n/P).
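To make this concrete, here is a sketch of the divide-and-conquer sum written with Java 7's ForkJoin framework, which Section 3.4 covers properly; the class name SumTask and the cutoff value are illustrative choices, not prescribed by these notes. Halving the range at each level is what yields the O(log n) term in the bound above, and the sequential cutoff is the same constant-factor trick mentioned in Section 1.1.

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    class SumTask extends RecursiveTask<Long> {
        static final int CUTOFF = 1000; // below this size, sequential is faster
        final int[] arr;
        final int lo, hi; // sum the half-open range [lo, hi)
        SumTask(int[] arr, int lo, int hi) {
            this.arr = arr; this.lo = lo; this.hi = hi;
        }
        protected Long compute() {
            if (hi - lo < CUTOFF) { // sequential base case
                long ans = 0;
                for (int i = lo; i < hi; i++)
                    ans += arr[i];
                return ans;
            }
            int mid = lo + (hi - lo) / 2;
            SumTask left = new SumTask(arr, lo, mid);
            SumTask right = new SumTask(arr, mid, hi);
            left.fork();                     // compute the left half in parallel...
            long rightAns = right.compute(); // ...while this thread does the right half
            return left.join() + rightAns;   // wait for the left half and combine
        }
    }

    // Usage: long sum = new ForkJoinPool().invoke(new SumTask(arr, 0, arr.length));

Note that the code forks one subtask and computes the other itself rather than forking both; this common idiom avoids leaving the current worker thread idle.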
In terms of our cooking analogy, parallelism is about using extra cooks (or utensils or pans or whatever) to get a large meal finished in less time. If you have a huge number of potatoes to slice, having more knives and people is really helpful, but at some point adding more people stops helping because of all the communicating and coordinating that needs to be done.