裝配圖網(wǎng) > 圖紙專區(qū) > 課件教案 > 《數(shù)據(jù)庫系統(tǒng)》教學課件

《數(shù)據(jù)庫系統(tǒng)》教學課件

數(shù)據(jù)庫系統(tǒng)教學課件,數(shù)據(jù)庫系統(tǒng),數(shù)據(jù)庫,系統(tǒng),教學,課件 Implementation of Relational OperationsJianlin FengSchool of SoftwareSUN YAT-SEN UNIVERSITYIntroductionnNext topic:QUERY PROCESSINGnSome database operations are EXPENSIVEnHuge performance gained by being“smart”qWell see 1,000,000 x over nave approachnMain weapons are:qclever implementation techniques for operatorsqexploiting relational algebra“equivalences”qusing statistics and cost models to chooseSimple SQL RefreshernSELECT FROM WHERE SELECT S.name,E.cid FROM Students S,Enrolled E WHERE S.sid=E.sid AND E.grade=AA Really Bad Query OptimizernFor each Select-From-Where query blockqCreate a plan that:nForms the cross product of the FROM clausenApplies the WHERE clausen(Then,as needed:qApply the GROUP BY clauseqApply the HAVING clauseqApply any projections and output expressionsqApply duplicate elimination and/or ORDER BY)predicatestablesCost-based Query Sub-System Query ParserQuery OptimizerPlan GeneratorPlan Cost EstimatorQuery Plan EvaluatorCatalog ManagerSchemaStatisticsSelect*From Blah BWhere B.blah=blahQueriesThe Query Optimization GamenGoal is to pick a“good”planqGood=low expected cost,under cost modelqDegrees of freedom:naccess methodsnphysical operatorsnoperator ordersnRoadmap for this topic:qFirst:implementing individual operatorsqThen:optimizing multiple operatorsRelational OperationsnWe will consider how to implement:qSelection ()Select a subset of rows.qProjection ()Remove unwanted columns.qJoin ()Combine two relations.qSet-difference ()Tuples in reln.1,but not in reln.2.qUnion ()Tuples in reln.1 and in reln.2.nQ:What about Intersection?Schema for ExamplesnSailors:qEach tuple is 50 bytes long,80 tuples per page,500 pages.qS=500,pS=80.nReserves:qEach tuple is 40 bytes,100 tuples per page,1000 pages.qR=1000,pR=100.Sailors(sid:integer,sname:string,rating:integer,age:real)Reserves(sid:integer,bid:integer,day:dates,rname:string)Simple SelectionsnHow best to perform?Depends on:qwhat indexes are availableqexpected size of resultnSize of result approximated as(size of R)*selectivity qselectivity estimated via statistics we will discuss shortly.SELECT *FROM Reserves RWHERE R.rname C%Our options nIf no appropriate index exists:Must scan the whole relationcost=R.For“reserves”=1000 I/Os.nWith index on selection attribute:1.Use index to find qualifying data entries2.Retrieve corresponding data recordsTotal cost=cost of step 1+cost of step 2qFor“reserves”,if selectivity=10%(100 pages,10000 tuples):nIf clustered index,cost is a little over 100 I/Os;nIf unclustered,could be up to 10000 I/Os!unless Our options Index entriesData entriesdirect search for(Index File)(Data file)Data Recordsdata entriesData entriesData RecordsCLUSTEREDUNCLUSTEREDRefinement for unclustered indexes1.Find qualifying data entries.2.Sort the rids of the data records to be retrieved.3.Fetch rids in order.Each data page is looked at just once(though#of such pages likely to be higher than with clustering).(Index File)(Data file)Data entriesData RecordsUNCLUSTEREDGeneral Selection ConditionsnFirst,convert to conjunctive normal form(CNF):q(day8/9/94 OR bid=5 OR sid=3)AND (rname=Paul OR bid=5 OR sid=3)nWe only discuss the case with no ORsnTerminology:qA B-tree index matches terms that involve only attributes in a prefix of the search key.e.g.:qIndex on matches a=5 AND b=3,but not b=3.*(day8/9/94 AND rname=Paul)OR bid=5 OR sid=32 Approaches to General SelectionsApproach I:1.Find the cheapest access path2.retrieve tuples using it3.Apply any remaining terms that dont match the indexqCheapest access path:An index or file scan that we estimate will require the fewest page I/Os.Cheapest Access Path-Examplequery:day 8/9/94 AND bid=5 AND sid=3some options:B+tree index on day;check bid=5 and sid=3 afterward.hash index on;check day8/9/94 afterward.nHow about a B+tree on?nHow about a B+tree on?nHow about a Hash index on?Approach II:use 2 or more matching indexes.1.From each index,get set of rids2.Compute intersection of rid sets3.Retrieve records for rids in intersection4.Apply any remaining termsEXAMPLE:day8/9/94 AND bid=5 AND sid=3Suppose we have an index on day,and another index on sid.qGet rids of records satisfying day8/9/94.qAlso get rids of records satisfying sid=3.qFind intersection,then retrieve records,then check bid=5.2 Approaches to General Selections(Contd.)ProjectionnIssue is removing duplicates.nUse sorting!1.Scan R,extract only the needed attributes2.Sort the resulting set3.Remove adjacent duplicatesCost:Ramakrishnan/Gehrke writes to temp table at each step!Reserves with size ratio 0.25=250 pages.With 20 buffer pages can sort in 2 passes,so:1000+250+2*2*250+250=2500 I/OsSELECT DISTINCT R.sid,R.bidFROM Reserves RProjection-improvednAvoid the temp files,work on the fly:qModify Pass 0 of sort to eliminate unwanted fields.qModify Passes 1+to eliminate duplicates.Cost:Reserves with size ratio 0.25=250 pages.With 20 buffer pages can sort in 2 passes,so:1.Read 1000 pages2.Write 250(in runs of 40 pages each)=7 runs3.Read and merge runs(20 buffers,so 1 merge pass!)Total cost=1000+250+250=1500.Other Projection TricksIf an index search key contains all wanted attrs:nDo index-only scanqApply projection techniques to data entries(much smaller!)If a B+Tree index search key prefix has all wanted attrs:nDo in-order index-only scanqCompare adjacent tuples on the fly(no sorting required!)JoinsnJoins are very common.nR S is large;so,R S followed by a selection is inefficient.nMany approaches to reduce join cost.nJoin techniques we will cover today:1.Nested-loops join2.Index-nested loops join3.Sort-merge joinSELECT *FROM Reserves R1,Sailors S1WHERE R1.sid=S1.sidR S:foreach tuple r in R doforeach tuple s in S doif ri=sj then add to resultSimple Nested Loops JoinCost=(pR*R)*S+R=100*1000*500+1000 IOsqAt 10ms/IO,Total time:?nWhat if smaller relation(S)was“outer”?nWhat assumptions are being made here?nWhat is cost if one relation can fit entirely in memory?R S:Page-Oriented Nested Loops JoinCost=R*S+R=1000*500+1000nIf smaller relation(S)is outer,cost=500*1000+500nMuch better than nave per-tuple approach!foreach page bR in R do foreach page bS in S do foreach tuple r in bR doforeach tuple s in bSdoif ri=sj then add to resultBlock Nested Loops JoinnPage-oriented NL doesnt exploit extra buffers:(nIdea to use memory efficiently:.R&Sblock of R tuples(B-2 pages)Input buffer for SOutput buffer.Join ResultCost:Scan outer+(#outer blocks*scan inner)#outer blocks=Examples of Block Nested Loops JoinnSay we have B=100+2 memory buffersnJoin cost=outer+(#outer blocks*inner)#outer blocks=outer/100nWith R as outer(R=1000):qScanning R costs 1000 IOs(done in 10 blocks)qPer block of R,we scan S;costs 10*500 I/OsqTotal=1000+10*500.nWith S as outer(S=500):qScanning S costs 500 IOs(done in 5 blocks)qPer block of S,we can R;costs 5*1000 IOsqTotal=500+5*1000.R S:Index Nested Loops Joinforeach tuple r in R doforeach tuple s in S where ri=sj doadd to resultData entriesData RecordsINDEX on Slookup(ri)sjS:R:riR S:Cost=R+(R*pR)*cost to find matching S tuples If index uses Alt.1,cost=cost to traverse tree from root to leaf.For Alt.2 or 3:1.Cost to lookup RID(s);typically 2-4 IOs for B+Tree.2.Cost to retrieve records from RID(s);depends on clustering.qClustered index:1 I/O per page of matching S tuples.qUnclustered:up to 1 I/O per matching S tuple.Index Nested Loops Joinforeach tuple r in R doforeach tuple s in S where ri=sj doadd to resultReminder:Schema for ExamplesnSailors:qEach tuple is 50 bytes long,80 tuples per page,500 pages.qS=500,pS=80.nReserves:qEach tuple is 40 bytes,100 tuples per page,1000 pages.qR=1000,pR=100.Sailors(sid:integer,sname:string,rating:integer,age:real)Reserves(sid:integer,bid:integer,day:dates,rname:string)Sort-Merge Join Example:SELECT *FROM Reserves R1,Sailors S1WHERE R1.sid=S1.sid1.Sort R on join attr(s)2.Sort S on join attr(s)3.Scan sorted-R and sorted-S in tandem,to find matchesCost of Sort-Merge JoinnCost:Sort R+Sort S+(R+S)qBut in worst case,last term could be R*S (very unlikely!)qQ:what is worst case?Suppose B=35 buffer pages:nBoth R and S can be sorted in 2 passesnTotal join cost=4*1000+4*500+(1000+500)=7500Suppose B=300 buffer pages:nAgain,both R and S sorted in 2 passesnTotal join cost=7500Block-Nested-Loop cost=2500 15,000Other Considerations 1.An important refinement:Do the join during the final merging pass of sort!If have enough memory,can do:1.Read R and write out sorted runs2.Read S and write out sorted runs3.Merge R-runs and S-runs,while finding R S matchesCost=3*R+3*SQ:how much memory is“enough”(will answer next time)2.Sort-merge join an especially good choice if:one or both inputs are already sorted on join attribute(s)output is required to be sorted on join attributes(s)Q:how to take these savings into account?(stay tuned)SummarynA virtue of relational DBMSs:queries are composed of a few basic operatorsqThe implementation of these operators can be carefully tunednMany alternative implementation techniques for each operatorqNo universally superior technique for most operators.nMust consider available alternativesqCalled“Query optimization”-we will study this topic soon!

數(shù)據(jù)庫系統(tǒng)教學課件.zip

《數(shù)據(jù)庫系統(tǒng)》教學課件

lec 9 Query Optimization.ppt---(點擊預覽)

lec 9 Physical Design.ppt---(點擊預覽)

lec 8 Query Processing.ppt---(點擊預覽)

lec 7 Hash-Based Inds.ppt---(點擊預覽)

lec 6 External Sorting.ppt---(點擊預覽)

lec 5 TreeInds.ppt---(點擊預覽)

lec 4 SQL.ppt---(點擊預覽)

lec 4 calculus.ppt---(點擊預覽)

lec 4 algebra.ppt---(點擊預覽)

lec 3 Storing Data.ppt---(點擊預覽)

lec 3 File and Indexing.ppt---(點擊預覽)

lec 2 The Relational Model.ppt---(點擊預覽)

lec 2 The Entity-Relationship Model.ppt---(點擊預覽)

lec 15 final review.ppt---(點擊預覽)

lec 14 warehouse.ppt---(點擊預覽)

lec 12 recovery.ppt---(點擊預覽)

lec 11 transaction.ppt---(點擊預覽)

lec 10 norm.ppt---(點擊預覽)

lec 1 overview.ppt---(點擊預覽)

壓縮包目錄

預覽區(qū)

《數(shù)據(jù)庫系統(tǒng)》教學課件
- lec 1 overview.ppt--點擊預覽
- lec 10 norm.ppt--點擊預覽
- lec 11 transaction.ppt--點擊預覽
- lec 12 recovery.ppt--點擊預覽
- lec 14 warehouse.ppt--點擊預覽
- lec 15 final review.ppt--點擊預覽
- lec 2 The Entity-Relationship Model.ppt--點擊預覽
- lec 2 The Relational Model.ppt--點擊預覽
- lec 3 File and Indexing.ppt--點擊預覽
- lec 3 Storing Data.ppt--點擊預覽
- lec 4 algebra.ppt--點擊預覽
- lec 4 calculus.ppt--點擊預覽
- lec 4 SQL.ppt--點擊預覽
- lec 5 TreeInds.ppt--點擊預覽
- lec 6 External Sorting.ppt--點擊預覽
- lec 7 Hash-Based Inds.ppt--點擊預覽
- lec 8 Query Processing.ppt--點擊預覽
- lec 9 Physical Design.ppt--點擊預覽
- lec 9 Query Optimization.ppt--點擊預覽

請點擊導航文件預覽

編號：48634128 類型：共享資源大小：6.17MB 格式：ZIP 上傳時間：2022-01-12

30
積分

舉報

版權申訴 word格式文檔無特別注明外均可編輯修改；預覽文檔經(jīng)過壓縮，下載后原文更清晰！ 立即下載

關鍵詞：: 數(shù)據(jù)庫系統(tǒng) 數(shù)據(jù)庫系統(tǒng) 教學課件

資源描述：: 《數(shù)據(jù)庫系統(tǒng)》教學課件,數(shù)據(jù)庫系統(tǒng),數(shù)據(jù)庫,系統(tǒng),教學,課件

展開閱讀全文

溫馨提示:
1: 本站所有資源如無特殊說明，都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
2: 本站的文檔不包含任何第三方提供的附件圖紙等，如果需要附件，請聯(lián)系上傳者。文件的所有權益歸上傳用戶所有。
3.本站RAR壓縮包中若帶圖紙，網(wǎng)頁內(nèi)容里面會有圖紙預覽，若沒有圖紙預覽就沒有圖紙。
4. 未經(jīng)權益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
5. 裝配圖網(wǎng)僅提供信息存儲空間，僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護處理，對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯，并不能對任何下載內(nèi)容負責。
6. 下載文件中如有侵權或不適當內(nèi)容，請與我們聯(lián)系，我們立即糾正。
7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

裝配圖網(wǎng)所有資源均是用戶自行上傳分享，僅供網(wǎng)友學習交流，未經(jīng)上傳用戶書面授權，請勿作他用。

關于本文

本文標題：《數(shù)據(jù)庫系統(tǒng)》教學課件
鏈接地址：http://m.italysoccerbets.com/article/48634128.html

點擊下載此資源

《數(shù)據(jù)庫系統(tǒng)》教學課件

最新文檔

相關資源

相關搜索