Assignment6SolutionsQuestions-UbicompLab

更新时间:2023-07-03 10:24:19 阅读: 评论:0

Databa Systems
Instructors: Hao-Hua Chu
Winston Hsu
Fall Semester, 2007
Assignment 6 Solutions
关于思念的句子
Questions
I. Consider a disk with an average ek time of 10ms, average rotational delay of
5ms, and a transfer time of 1ms for a 4K page. Assume that the cost of reading/writing a page is the sum of the values (i.e., 16ms) unless a quence of  pages is read/written. In this ca, the cost is the average ek time plus the average rotational delay (to find the first page in the quence) plus 1ms per page (to transfer data). You are given 320 buffer pages and asked to sort a file with 10,000,000 pages.
1.Why is it a bad idea to u the 320 pages to support virtual memory, that is, to
new10,000,000 * 4K bytes of memory, and to u an in-memory sorting
algorithm such as Quicksort?
2.Assume that you begin by creating sorted runs of 320 pages each in the first
pass. Evaluate the cost of the following approaches for the subquent电脑如何快速锁屏
merging pass:
(a)Do 319-way merges.
(b)Create 256 ‘input’ buffers of 1 page each, create an ‘output’ buffer of 64
pages, and do 256-way merges.
(c)Create 16 ‘input’ buffers of 16 pages each, create an ‘output’ buffer of 64
pages, and do 16-way merges.
(d)Create eight ‘input’ buffers of 32 pages each, create an ‘output’ buffer of
64 pages, and do eight-way merges.东湖广场
(e)Create four ‘input’ buffers of 64 pages each, create an ‘output’ buffer of
64 pages, and do four-way merges.
Ans:
In Pass 0, 31250 sorted runs of 320 pages each are created. For each run, we read and write 320 pages quentially. The I/O cost per run is
2 ∗(10+5+1∗320) = 670ms. Thus, the I/O cost for Pass 0 is 31250 ∗670 =
20937500ms. For each of the cas discusd below, this cost must be added to the cost of the subquent merging pass to get the total cost. Also, the
calculations below are slightly simplified by neglecting the effect of a final read/written block that is slightly smaller than the earlier blocks.
1.For 319-way merges, only 2 more pass are needed. The first pass will
produce ceiling(31250/319) = 98 sorted runs; the can then be merged in the
next pass. Every page is read and written individually, at a cost of 16ms per
read or write, in each of the two pass. The cost of the merging pass is
therefore 2∗(2∗16)∗10000000 = 640000000ms.
(The formula can be read as ‘number of pass times cost of read and write
per page times number of pages in file’.)
2.With 256-way merges, only two additional merging pass are needed. Every
page in the file is read and written in each pass, but the effect of blocking is
different on reads and writes. For reading, each page is read individually at a
cost of 16ms. Thus, the cost of reads (over both pass) is
2 ∗16 ∗10000000 = 320000000ms. For writing, pages are written out in
blocks of 64 pages. The I/O cost per block is 10 + 5 + 1 ∗ 64 = 79ms. The
number of blocks written out per pass is 10000000/64 = 156250, and the cost
per pass is 156250∗79 = 12343750ms. The cost of writes over both merging
pass is therefore 2 ∗ 12343750 = 24687500ms. The total cost of reads and
writes for the two merging pass is 320000000 + 24687500 = 344687500ms.
3.With 16-way merges, 4 additional merging pass are needed. For reading,
pages are read in blocks of 16 pages, at a cost per block of 10 + 5 + 1 ∗ 16 =
31ms. In each pass, 10000000/16 = 625000 blocks are read. The cost of
reading over the 4 merging pass is therefore 4 ∗ 625000 ∗ 31 = 77500000ms.
For writing, pages are written in 64 page blocks, and the cost per pass is
12343750ms as before. The cost of writes over 4 merging pass is 4 ∗
12343750 = 49375000ms, and the total cost of the merging pass is
77500000+ 49375000 = 126875000ms.
4.With 8-way merges, 5 merging pass are needed. For reading, pages are read
in blocks of 32 pages, at a cost per block of 10 + 5 + 1 ∗ 32 = 47ms. In each
pass, 10000000/32 = 312500 blocks are read. The cost of reading over the 5
merging pass is therefore 5∗ 312500 ∗ 47 = 73437500ms. For writing, pages
are written in 64 page blocks, and the cost per pass is 12343750ms as before.
The cost of writes over 5 merging pass is 5 ∗ 12343750 = 61718750ms, and
the total cost of the merging pass is 73437500+ 61718750 = 135156250ms.
5.With 4-way merges, 8 merging pass are needed. For reading, pages are read
in blocks of 64 pages, at a cost per block of 10 + 5 + 1 ∗ 64 = 79ms. In each
pass, 10000000/64 = 156250 blocks are read. The cost of reading over the 8
merging pass is therefore 8∗ 156250 ∗ 79 = 98750000ms. For writing, pages
are written in 64 page blocks, and the cost per pass is 12343750ms as before.
The cost of writes over 8 merging pass is 8 ∗ 12343750 = 98750000ms, and
茄子的家常做法大全the total cost of the merging pass is 98750000+ 98750000 = 197500000ms. II. Consider the join R∞R.a=S.b S, given the following information about the relations to be joined. The cost metric is the number of page I/Os unless otherwi noted, and the cost of writing out the result should be uniformly ignored.
Relation R contains 10,000 tuples and has 10 tuples per page.
Relation S contains 2000 tuples and also has 10 tuples per page.
Attribute b of relation S is the primary key for S.催化裂解
Both relations are stored as simple heap files.
Neither relation has any indexes built on it.
52 buffer pages are available.
1.What is the cost of joining R and S using a page-oriented simple nested loops
join? What is the minimum number of buffer pages required for this cost to
remain unchanged?
2.What is the cost of joining R and S using a block nested loops join? What is
the minimum number of buffer pages required for this cost to remain
unchanged?
3.What is the cost of joining R and S using a sort-merge join? What is the
minimum number of buffer pages required for this cost to remain unchanged?
4.What is the cost of joining R and S using a hash join? What is the minimum
优秀演讲稿
number of buffer pages required for this cost to remain unchanged?
5.What would be the lowest possible I/O cost for joining R and S using any join
algorithm, and how much buffer space would be needed to achieve this cost?
Explain briefly.
6.How many tuples does the join of R and S produce, at most, and how many
pages are required to store the result of the join back on disk?
7.Would your answers to any of the previous questions in this exerci change if
you were told that R.a is a foreign key that refers to S.b?
Ans: Let M =1000 be the number of pages in R, N = 200 be the number of pages in S, and B = 52 be the number of buffer pages available.
1. Basic idea is to read each page of the outer relation, and for each page scan the
inner relation for matching tuples. Total cost would be
#pagesinouter + (#pagesinouter  ∗  #pagesininner)
which is minimized by having the smaller relation be the outer relation.
TotalCost = N + (N ∗ M) = 200200
The minimum number of buffer pages for this cost is 3.
2.This time read the outer relation in blocks, and for each block scan the inner
relation for matching tuples. So the outer relation is still read once, but the inner relation is scanned only once for each outer block, of which there are ceiling ( #pages in outer / (B−2) )  =  ceiling (200/50) = 4.
TotalCost = N + M ∗ ceiling( N / (B−2) ) = 4200
If the number of buffer pages is less than 52, the number of scans of the inner would be more than 4 since ceiling(200/49)= 5. The minimum number of buffer pages for this cost is therefore 52.
3.Since B > √M > √N we can u the refinement to Sort-Merge :
海洋经济数值比例尺
TotalCost = 3∗ (M + N) = 3600
NOTE: if S.b were not a key, then the merging pha could require more than one pass over  one of the relations, making the cost of merging M∗N I/Os in the worst ca.
The minimum number of buffer pages required is 25. With 25 buffer pages, the initial sorting pass will split R into 20 runs of size 50 and split S into 4 runs of size 50 (approximately). The 24 runs can then be merged in one pass, with one page left over to be ud as an output buffer. With fewer than 25 buffer pages the number of runs produced by the first pass over both relations would exceed the number of available pages, making a one-pass merge impossible.
4.The cost of Hash Join is 3∗(M+N) if B > √(f ∗ N) where f is a fudge factor
ud to capture the small increa in size involved in building a hash table, and N is the number of pages in the smaller relation, S (e page 258). Since √N ≈14, we can assume that this condition is met. We will also assume uniform partitioning from our hash function.
TotalCost = 3∗ (M + N) = 3600
Without knowing f we can only approximate the minimum number of buffer pages required, and a go
od guess is that we need B >√(f ∗ N).
5.The optimal cost would be achieved if each relation was only read once. We
could do such a join by storing the entire smaller relation in memory, reading in the larger relation page-by-page, and for each tuple in the larger relation we arch the smaller relation (which exists entirely in memory) for matching tuples. The buffer pool would have to hold the entire smaller relation, one page for reading in the larger relation, and one page to rve as an output buffer.
TotalCost = M + N = 1, 200
The minimum number of buffer pages for this cost is N + 1 + 1 = 202.
6.Any tuple in R can match at most one tuple in S becau S.b is a primary key
(which means the S.b field contains no duplicates). So the maximum number of tuples in the result is equal to the number of tuples in R, which is 10,000.
The size of a tuple in the result could be as large as the size of an R tuple plus the size of an S tuple (minus the size of the shared attribute). This may allow only 5 tuples to be stored on a page. Storing 10,000 tuples at 5 per page would require 2000 pages in the result.
7.The foreign key constraint tells us that for every R tuple there is exactly one
matching S tuple (becau S.b is a key). The Sort-Merge and Hash Joins would not be affected, but we could reduce the cost of the two Nested Loops joins. If we make R the outer relation then for each tuple of R we only have to scan S until a match is found. This will require scanning only 50%of S on average.
For Page-Oriented Nested Loops, the new cost would be
TotalCost = M + (M ∗ (N/2)) = 101 000
and 3 buffer pages are still required.
For Block Nested Loops, the new cost would be
TotalCost = M + (N/2) ∗ ceiling( M / (B−2) ) = 3000 and again this cost can only be achieved with 52 available buffer pages.

本文发布于:2023-07-03 10:24:19,感谢您对本站的认可!

本文链接:https://www.wtabcd.cn/fanwen/fan/89/1065827.html

版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。

标签:茄子   思念   广场   裂解   经济   家常   句子
相关文章
留言与评论(共有 0 条评论)
   
验证码:
推荐文章
排行榜
Copyright ©2019-2022 Comsenz Inc.Powered by © 专利检索| 网站地图