首页 > 美文鉴赏

Assignment6SolutionsQuestions-UbicompLab

更新时间:2023-07-03 10:24:19 阅读：评论：0

Databa Systems

Instructors: Hao-Hua Chu

Winston Hsu

Fall Semester, 2007

Assignment 6 Solutions

关于思念的句子

Questions

I. Consider a disk with an average ek time of 10ms, average rotational delay of

5ms, and a transfer time of 1ms for a 4K page. Assume that the cost of reading/writing a page is the sum of the values (i.e., 16ms) unless a quence of pages is read/written. In this ca, the cost is the average ek time plus the average rotational delay (to find the first page in the quence) plus 1ms per page (to transfer data). You are given 320 buffer pages and asked to sort a file with 10,000,000 pages.

1.Why is it a bad idea to u the 320 pages to support virtual memory, that is, to

new10,000,000 * 4K bytes of memory, and to u an in-memory sorting

algorithm such as Quicksort?

2.Assume that you begin by creating sorted runs of 320 pages each in the first

pass. Evaluate the cost of the following approaches for the subquent电脑如何快速锁屏

merging pass:

(a)Do 319-way merges.

(b)Create 256 ‘input’ buffers of 1 page each, create an ‘output’ buffer of 64

pages, and do 256-way merges.

(c)Create 16 ‘input’ buffers of 16 pages each, create an ‘output’ buffer of 64

pages, and do 16-way merges.

(d)Create eight ‘input’ buffers of 32 pages each, create an ‘output’ buffer of

64 pages, and do eight-way merges.东湖广场

(e)Create four ‘input’ buffers of 64 pages each, create an ‘output’ buffer of

64 pages, and do four-way merges.

Ans:

In Pass 0, 31250 sorted runs of 320 pages each are created. For each run, we read and write 320 pages quentially. The I/O cost per run is

2 ∗(10+5+1∗320) = 670ms. Thus, the I/O cost for Pass 0 is 31250 ∗670 =

20937500ms. For each of the cas discusd below, this cost must be added to the cost of the subquent merging pass to get the total cost. Also, the

calculations below are slightly simplified by neglecting the effect of a final read/written block that is slightly smaller than the earlier blocks.

1.For 319-way merges, only 2 more pass are needed. The first pass will

produce ceiling(31250/319) = 98 sorted runs; the can then be merged in the

next pass. Every page is read and written individually, at a cost of 16ms per

read or write, in each of the two pass. The cost of the merging pass is

therefore 2∗(2∗16)∗10000000 = 640000000ms.

(The formula can be read as ‘number of pass times cost of read and write

per page times number of pages in file’.)

2.With 256-way merges, only two additional merging pass are needed. Every

page in the file is read and written in each pass, but the effect of blocking is

different on reads and writes. For reading, each page is read individually at a

cost of 16ms. Thus, the cost of reads (over both pass) is

2 ∗16 ∗10000000 = 320000000ms. For writing, pages are written out in

blocks of 64 pages. The I/O cost per block is 10 + 5 + 1 ∗ 64 = 79ms. The

number of blocks written out per pass is 10000000/64 = 156250, and the cost

per pass is 156250∗79 = 12343750ms. The cost of writes over both merging

pass is therefore 2 ∗ 12343750 = 24687500ms. The total cost of reads and

writes for the two merging pass is 320000000 + 24687500 = 344687500ms.

3.With 16-way merges, 4 additional merging pass are needed. For reading,

pages are read in blocks of 16 pages, at a cost per block of 10 + 5 + 1 ∗ 16 =

31ms. In each pass, 10000000/16 = 625000 blocks are read. The cost of

reading over the 4 merging pass is therefore 4 ∗ 625000 ∗ 31 = 77500000ms.

For writing, pages are written in 64 page blocks, and the cost per pass is

12343750ms as before. The cost of writes over 4 merging pass is 4 ∗

12343750 = 49375000ms, and the total cost of the merging pass is

77500000+ 49375000 = 126875000ms.

4.With 8-way merges, 5 merging pass are needed. For reading, pages are read

in blocks of 32 pages, at a cost per block of 10 + 5 + 1 ∗ 32 = 47ms. In each

pass, 10000000/32 = 312500 blocks are read. The cost of reading over the 5

merging pass is therefore 5∗ 312500 ∗ 47 = 73437500ms. For writing, pages

are written in 64 page blocks, and the cost per pass is 12343750ms as before.

The cost of writes over 5 merging pass is 5 ∗ 12343750 = 61718750ms, and

the total cost of the merging pass is 73437500+ 61718750 = 135156250ms.

5.With 4-way merges, 8 merging pass are needed. For reading, pages are read

in blocks of 64 pages, at a cost per block of 10 + 5 + 1 ∗ 64 = 79ms. In each

pass, 10000000/64 = 156250 blocks are read. The cost of reading over the 8

merging pass is therefore 8∗ 156250 ∗ 79 = 98750000ms. For writing, pages

are written in 64 page blocks, and the cost per pass is 12343750ms as before.

The cost of writes over 8 merging pass is 8 ∗ 12343750 = 98750000ms, and

茄子的家常做法大全the total cost of the merging pass is 98750000+ 98750000 = 197500000ms. II. Consider the join R∞R.a=S.b S, given the following information about the relations to be joined. The cost metric is the number of page I/Os unless otherwi noted, and the cost of writing out the result should be uniformly ignored.

Relation R contains 10,000 tuples and has 10 tuples per page.

Relation S contains 2000 tuples and also has 10 tuples per page.

Attribute b of relation S is the primary key for S.催化裂解

Both relations are stored as simple heap files.

Neither relation has any indexes built on it.

52 buffer pages are available.

1.What is the cost of joining R and S using a page-oriented simple nested loops

join? What is the minimum number of buffer pages required for this cost to

remain unchanged?

2.What is the cost of joining R and S using a block nested loops join? What is

the minimum number of buffer pages required for this cost to remain

unchanged?

3.What is the cost of joining R and S using a sort-merge join? What is the

minimum number of buffer pages required for this cost to remain unchanged?

4.What is the cost of joining R and S using a hash join? What is the minimum

优秀演讲稿

number of buffer pages required for this cost to remain unchanged?

5.What would be the lowest possible I/O cost for joining R and S using any join

algorithm, and how much buffer space would be needed to achieve this cost?

Explain briefly.

6.How many tuples does the join of R and S produce, at most, and how many

pages are required to store the result of the join back on disk?

7.Would your answers to any of the previous questions in this exerci change if

you were told that R.a is a foreign key that refers to S.b?

Ans: Let M =1000 be the number of pages in R, N = 200 be the number of pages in S, and B = 52 be the number of buffer pages available.

1. Basic idea is to read each page of the outer relation, and for each page scan the

inner relation for matching tuples. Total cost would be

#pagesinouter + (#pagesinouter ∗ #pagesininner)

which is minimized by having the smaller relation be the outer relation.

TotalCost = N + (N ∗ M) = 200200

The minimum number of buffer pages for this cost is 3.

2.This time read the outer relation in blocks, and for each block scan the inner

relation for matching tuples. So the outer relation is still read once, but the inner relation is scanned only once for each outer block, of which there are ceiling ( #pages in outer / (B−2) ) = ceiling (200/50) = 4.

TotalCost = N + M ∗ ceiling( N / (B−2) ) = 4200

If the number of buffer pages is less than 52, the number of scans of the inner would be more than 4 since ceiling(200/49)= 5. The minimum number of buffer pages for this cost is therefore 52.

3.Since B > √M > √N we can u the refinement to Sort-Merge :

海洋经济数值比例尺

TotalCost = 3∗ (M + N) = 3600

NOTE: if S.b were not a key, then the merging pha could require more than one pass over one of the relations, making the cost of merging M∗N I/Os in the worst ca.

The minimum number of buffer pages required is 25. With 25 buffer pages, the initial sorting pass will split R into 20 runs of size 50 and split S into 4 runs of size 50 (approximately). The 24 runs can then be merged in one pass, with one page left over to be ud as an output buffer. With fewer than 25 buffer pages the number of runs produced by the first pass over both relations would exceed the number of available pages, making a one-pass merge impossible.

4.The cost of Hash Join is 3∗(M+N) if B > √(f ∗ N) where f is a fudge factor

ud to capture the small increa in size involved in building a hash table, and N is the number of pages in the smaller relation, S (e page 258). Since √N ≈14, we can assume that this condition is met. We will also assume uniform partitioning from our hash function.

TotalCost = 3∗ (M + N) = 3600

Without knowing f we can only approximate the minimum number of buffer pages required, and a go

od guess is that we need B >√(f ∗ N).

5.The optimal cost would be achieved if each relation was only read once. We

could do such a join by storing the entire smaller relation in memory, reading in the larger relation page-by-page, and for each tuple in the larger relation we arch the smaller relation (which exists entirely in memory) for matching tuples. The buffer pool would have to hold the entire smaller relation, one page for reading in the larger relation, and one page to rve as an output buffer.

TotalCost = M + N = 1, 200

The minimum number of buffer pages for this cost is N + 1 + 1 = 202.

6.Any tuple in R can match at most one tuple in S becau S.b is a primary key

(which means the S.b field contains no duplicates). So the maximum number of tuples in the result is equal to the number of tuples in R, which is 10,000.

The size of a tuple in the result could be as large as the size of an R tuple plus the size of an S tuple (minus the size of the shared attribute). This may allow only 5 tuples to be stored on a page. Storing 10,000 tuples at 5 per page would require 2000 pages in the result.

7.The foreign key constraint tells us that for every R tuple there is exactly one

matching S tuple (becau S.b is a key). The Sort-Merge and Hash Joins would not be affected, but we could reduce the cost of the two Nested Loops joins. If we make R the outer relation then for each tuple of R we only have to scan S until a match is found. This will require scanning only 50%of S on average.

For Page-Oriented Nested Loops, the new cost would be

TotalCost = M + (M ∗ (N/2)) = 101 000

and 3 buffer pages are still required.

For Block Nested Loops, the new cost would be

TotalCost = M + (N/2) ∗ ceiling( M / (B−2) ) = 3000 and again this cost can only be achieved with 52 available buffer pages.

本文发布于:2023-07-03 10:24:19，感谢您对本站的认可！

本文链接：https://www.wtabcd.cn/fanwen/fan/89/1065827.html

上一篇：六.阅读理解（A）

下一篇：英语作文电子垃圾