SQL Server 2005 的商業(yè)智能和數(shù)據(jù)存儲畢業(yè)論文外文翻譯
《SQL Server 2005 的商業(yè)智能和數(shù)據(jù)存儲畢業(yè)論文外文翻譯》由會員分享,可在線閱讀,更多相關《SQL Server 2005 的商業(yè)智能和數(shù)據(jù)存儲畢業(yè)論文外文翻譯(10頁珍藏版)》請在裝配圖網(wǎng)上搜索。
1、 Business Intelligence and Data Warehousing in SQL Server 20051.Introduction Microsoft SQL Server 2005 is a complete business intelligence (BI) platform that provides the features, tools, and functionality to build both classic and innovative kinds of analytical applications. This paper provides an
2、introduction to the tools that you will use to build an analytical application, and highlights new functionality that makes it easier than ever to build and manage complex BI systems.The following table presents an overview of the components of a business intelligence system, and the corresponding M
3、icrosoft SQL Server 2000 and SQL Server 2005 components. Two components are new for SQL Server 2005: SQL Server Management Studio and SQL Server Business Intelligence Development Studio. The other primary BI components Integration Services, Analysis Services OLAP, Analysis Services Data Mining, and
4、Reporting Services are substantially different and improved in SQL Server 2005. The SQL Server 2005 relational database contains several significant new features. Although the Microsoft Office query and portal tools are not part of SQL Server, the current releases will continue to work with SQL Serv
5、er 2005. The BI functionality in Microsoft Office will evolve with the Office product release cycle.2.Where do I start with SQL Server 2005? The first thing youll notice upon installing SQL Server 2005 is that the installation experience is integrated. No longer do you need to run the installation p
6、rogram separately for some features, such as Analysis Services. If a feature such as Reporting Services is not available for installation, your computer may not meet the installation requirements for that feature. You can review the Readme file for a complete discussion of feature prerequisites. The
7、 installation process will install: SQL Server relational database engine Integration Services Analysis Services Reporting Services SQL Server Management Studio (database management toolset) Business Intelligence Development Studio (BI application development toolset) Reporting Services requires tha
8、t IIS be installed and correctly configured. We strongly recommend that you take the time to configure and install IIS, as Reporting Services is an integral part of the SQL Server 2005 Business Intelligence feature set. Customers who are familiar with Analysis Services may be surprised by the lack o
9、f an Analysis Services metadata repository. In SQL Server 2000 the Analysis Services repository was shipped as a Microsoft Access database, but Analysis Services 2005 does not contain a metadata repository. Instead, the Analysis Services database metadata information are stored as XML files and are
10、managed by Analysis Services. These XML files can be placed under source control, if desired. We recommend that you use the Business Intelligence Development Studio to develop and the SQL Server Management Studio to operate and maintain BI database objects. You can design Integration Services packag
11、es and Analysis Services cubes and mining models in the SQL Server Management Studio, but the Business Intelligence Development Studio offers a better experience for designing and debugging BI applications. You will probably learn more by beginning with a new application rather than upgrading existi
12、ng DTS packages and Analysis Services databases. You may find it useful to recreate an existing package or database, if you have one available. After you have become more familiar with the new tools, features, and concepts, it is easier to upgrade existing objects.Many customers will use the SQL Ser
13、ver tools to develop a system with the now-familiar business intelligence structure of one or more source systems using Integration Services to feed a dimensional relational data warehouse, which in turn is used to populate the Analysis Services database. However, SQL Server 2005 provides many optio
14、ns to deviate from this generic design, by eliminating or virtualizing different components.3.Relational data warehousing The SQL Server 2005 relational database engine contains several features of great interest for the design and maintenance of data warehouse style applications. These features inc
15、lude: Table partitions enable fast data load and simplified maintenance for very large tables.Easy creation of a reporting server Transact-SQL improvements including new data types and new analytic functionsOnline index operations Fine grained backup/restore operations Fast file initialization4.Repo
16、rting Server A common technique for offloading relational operational reporting from the transaction database is to maintain a reporting server. The reporting server maintains an image of the transaction database with some latency, most often of the previous day. The reporting server is used for mos
17、t reporting and data warehouse extracts. Microsoft SQL Server 2005 adds two new features, database mirroring and database snapshots, that make it very easy to create and maintain a reporting server. Now a SQL Server reporting server can have much lower latency than once a day. Also, the reporting se
18、rver is designed to act as a standby system for the transaction system.To create a reporting server, first create a database mirror, a new SQL Server 2005 feature that provides an instant standby system for high availability. You can read the SQL Server Books Online topic Database Mirroring Concepts
19、 for more information. The database mirror cannot be queried directly, which is where the second new feature, database snapshots, becomes important. Creating a database snapshot on the mirror provides an additional copy of the data for reporting purposes. A database snapshot is a read-only copy of t
20、he database at a particular point in time, and is not automatically updated with new information when the source database changes. Database snapshots are a multifaceted topic with numerous usage scenarios; you can read the SQL Server Books Online topic Understanding Database Views for more informati
21、on. For now, however, it is enough to note that the way snapshots store their information is extremely space efficient. Multiple database snapshots can exist to provide a full view of the data for reporting purposes, although maintaining a database snapshot does have some impact on the transaction d
22、atabase upon which the database snapshots is based. By creating a database snapshot on a database mirror, you effectively create a standby server for high system availability. The database snapshot can then serve double duty as a reporting server, as well as be used in high availability solutions.5.
23、Table partitions Partitioned tables and indexes have their data divided into horizontal units, so that groups of rows are mapped into individual partitions. Operations performed on the data, such as queries, are executed as if the entire table or index is a single entity.Partitioning can: Improve ta
24、ble and index manageability. Improve query performance on multiple-CPU machines. In a relational data warehouse, fact tables are the obvious candidate for table partitioning and partitioning by date range is the most common partitioning strategy. There are three steps to defining a partitioned table
25、, as described in the SQL Server Books Online topic “Creating Partitioned Tables and Indexes”. The three steps are as follows: Create a partition function specifying how a table that uses the function is partitioned.Create a partition scheme specifying how the partitions of the partition function ar
26、e placed on the filegroup. Create a table or index using the partition scheme. Multiple tables can use the same partition scheme. This paper discusses Range partitioning of fact tables and is not intended to be a complete discussion or tutorial for table partitioning. For more information about tabl
27、e partitioning see SQL Server Books Online. The most common partitioning scheme partitions the fact table by date range, such as, year, quarter, month, or even day. In most scenarios, date partitioning of the large fact table, or tables, provides the greatest manageability benefits. In order to get
28、improved query performance, the Time dimension table should be partitioned using the same partitioning scheme. A partitioned table behaves like an unpartitioned table.Queries to the table are resolved correctly. Direct inserts, updates, and deletes on the table are automatically resolved to the corr
29、ect partition or partitions.6.Using table partitions for fast data loads Most data warehouse applications struggle to load increasingly large volumes of data in a small and shrinking load window. The typical process begins with extracts from several source systems, followed by steps to clean, transf
30、orm, synthesize, and rationalize the data across these systems. The data management application is constrained to complete the entire extract, transformation, and loading process within the load window. Usually, the systems business users have a strong requirement that the system minimize the amount
31、 of time the data warehouse is unavailable for their queries. The write step of the data management application, in which the new data is inserted into the existing data warehouse, must be designed to occur quickly and with minimum user impact. In order to load data very fast, the database recovery
32、model must be either Bulk Logged or Simple, and the table must either be empty or contain data but no indexes. If these conditions are met, a non-logged load is possible. In SQL Server 2000, before partitioned tables existed, these conditions are typically met only in the initial historical data war
33、ehouse load. Some customers with large data warehouses have built a quasi-partitioned structure by constructing a UNION ALL view over separate physical tables; these tables were populated each load cycle using a non-logged technique. This approach was not entirely satisfactory, and SQL Server 2005 p
34、artitioned tables provide superior functionality. In SQL Server 2005, you cannot perform a non-logged load directly into a partition. However, you can load into a separate table that we will call the pseudo-partition. Under certain conditions, you can switch the pseudo-partition into the partitioned
35、 table as a metadata operation that occurs extremely quickly. This technique meets our two requirements of minimizing overall load time: the pseudo-partition load is performed without logging, and minimizing end user impact and ensuring data warehouse integrity: the pseudo-partitions can be loaded w
36、hile users are querying the data warehouse. The data management application can wait until all fact tables are loaded and ready before performing the partition switch. The partition switch operation occurs very quickly, on the order of sub-second response. In addition, the pseudo partition can be ba
37、cked up as a separate table, improving system manageability.7.Using table partitions for fast data deletes Many data warehouses keep a sliding window of detailed data active in the data warehouse. For example, the fact table may contain data for three, five, or ten years. Periodically, the oldest da
38、ta is removed from the table. The primary reasons for keeping data pruned are to improve query performance and minimize storage costs.SQL Server 2005 partitions make it very easy to prune old data from a large partitioned fact table. Simply create an empty pseudo-partition as described above, and th
39、en switch it into the partitioned table. The partitioned table has an empty partition where it once had a populated partition; the pseudo-partition has data where once it was empty. The pseudo-partition can be backed up, truncated, or dropped, as appropriate.Optionally, you may choose to redefine th
40、e partition function to merge all of the left-hand (empty) partitions together into one. SQL Server 2005 的商業(yè)智能和數(shù)據(jù)存儲1、簡介Microsoft SQL Server2005是一個完整的商業(yè)智能(BI)平臺,它提供工具和功能來建立兼具經(jīng)典和創(chuàng)新的各種分析應用。本文將介紹,你將用它來構建分析應用程序的工具,并會強調新的功能,使得它比以往更容易建立和管理復雜的BI系統(tǒng)。下表介紹了商業(yè)智能系統(tǒng)的組件的概述,以及相應的Microsoft SQL Server 2000和SQL Server
41、2005組件。兩個組件是新的SQL Server 2005:SQL Server管理Studio和SQL Server商業(yè)智能開發(fā)工作室。其他主要的BI組件集成服務,分析服務OLAP,Analysis Services數(shù)據(jù)挖掘,和Reporting Services明顯不同,在SQL Server 2005中改進的SQL Server 2005關系數(shù)據(jù)庫包含幾個顯著的新功能。雖然微軟Office的查詢和門戶工具不是SQL Server的一部分,目前的版本將繼續(xù)在SQL Server 2005中的工作。在Microsoft Office的BI功能將與Office產品發(fā)布周期演變。2、我是從哪里啟
42、動SQL Server 2005? 你會發(fā)現(xiàn)在安裝SQL Server 2005的第一件事就是安裝體驗集成。你不再需要單獨的一些功能,如分析服務運行安裝程序。如果某個功能,如報表服務不適用于安裝,您的計算機可能無法滿足該功能的安裝要求。您可以查看自述文件功能的先決條件的完整討論。安裝過程中會安裝:SQL Server關系數(shù)據(jù)庫引擎集成服務分析服務報表服務SQL Server管理工具(數(shù)據(jù)庫管理工具集)商業(yè)智能開發(fā)套件(BI應用程序開發(fā)工具集)報表服務需要IIS中安裝并正確配置。我們強烈建議您花時間來配置和安裝IIS,如報表服務是SQL Server 2005商業(yè)智能功能集的一個組成部分。客戶熟
43、悉Analysis Services的可能由于缺少分析服務的元數(shù)據(jù)存儲庫而感到驚訝。在SQL Server 2000 Analysis Services存儲庫被運為Microsoft Access數(shù)據(jù)庫,但Analysis Services 2005中不包含元數(shù)據(jù)信息庫。相反Analysis Services數(shù)據(jù)庫元數(shù)據(jù)信息都存儲為XML文件,并通過分析服務的管理。這些XML文件可以放置在源代碼控制,如果需要的話。我們建議您使用Business Intelligence Development Studio中開發(fā)和SQL Server Management Studio來操作和維護BI數(shù)據(jù)庫對
44、象。您可以設計Integration Services包和Analysis Services多維數(shù)據(jù)集和挖掘模型在SQL Server Management Studio中,但商業(yè)智能開發(fā)套件提供了設計和調試BI應用帶來更好的體驗。你可能會學到更多與新的應用程序開始,而不是升級現(xiàn)有DTS包和Analysis Services數(shù)據(jù)庫。如果你有一個可用的,您可能會發(fā)現(xiàn)它很有用可重新創(chuàng)建現(xiàn)有的包或數(shù)據(jù)庫。當你更熟悉新的工具,功能和概念,很容易升級現(xiàn)有的對象。 許多客戶將使用SQL Server工具來開發(fā)一個系統(tǒng),使用Integration Services養(yǎng)活一個維度關系型數(shù)據(jù)倉庫,而這又是用來填充
45、Analysis Services數(shù)據(jù)庫的一個或多個源系統(tǒng)的現(xiàn)在熟悉的商業(yè)智能架構。但是,SQL Server 2005提供了許多選項,從這個通用設計的偏離,通過消除或虛擬化不同的組件。3、Relational數(shù)據(jù)倉庫 在SQL Server 2005關系數(shù)據(jù)庫引擎包含幾個功能的極大興趣為數(shù)據(jù)倉庫風格的應用程序的設計和維護。這些功能包括: 表分區(qū)實現(xiàn)快速數(shù)據(jù)加載和簡化維護非常大的表。輕松創(chuàng)建一個報告服務器的Transact-SQL的改進,包括新的數(shù)據(jù)類型和新的分析功能: 聯(lián)機索引操作 細粒度的備份/恢復操作 快速初始化文件4、Reporting服務器 一種常用方法卸載從交易數(shù)據(jù)庫中的關系運營報
46、告是維持一個報告服務器。報告服務器維護交易數(shù)據(jù)庫的圖像存在一定的延遲,通常前一天。報告服務器用于大多數(shù)報告和數(shù)據(jù)倉庫提取。 微軟SQL Server 2005中新增了兩項功能,數(shù)據(jù)庫鏡像和數(shù)據(jù)庫快照。這使得它非常容易創(chuàng)建和維護一個報告服務器?,F(xiàn)在SQL Server報表服務器每一天可以有更低的延遲比。此外,報告服務器的目的是充當交易系統(tǒng)的備用系統(tǒng)。 要創(chuàng)建一個報表服務器,首先創(chuàng)建一個數(shù)據(jù)庫鏡像,一個新的SQL Server 2005的功能,它提供了高可用性的即時備份系統(tǒng)。你可以閱讀SQL Server聯(lián)機叢書主題“數(shù)據(jù)庫鏡像概念”獲取更多信息。數(shù)據(jù)庫鏡像不能直接查詢,這就是第二次的新功能,數(shù)據(jù)
47、庫快照,就變得很重要。 在創(chuàng)建鏡像數(shù)據(jù)庫快照提供了額外的數(shù)據(jù)副本進行報告。數(shù)據(jù)庫快照是數(shù)據(jù)庫在特定時間點的只讀副本,并且不會自動與新的信息源數(shù)據(jù)庫的更改時更新。數(shù)據(jù)庫快照是與眾多使用場景多方面的主題;你可以閱讀SQL Server聯(lián)機叢書主題“了解數(shù)據(jù)庫視圖”了解更多信息。但就目前而言,它是足夠注意,快照存儲其信息的方式是非常節(jié)省空間。多個數(shù)據(jù)庫快照可以存在,為報告目的的數(shù)據(jù)的完整視圖,雖然維護一個數(shù)據(jù)庫快照不會對交易數(shù)據(jù)庫中的一些影響賴以數(shù)據(jù)庫快照的基礎。 通過在數(shù)據(jù)庫鏡像創(chuàng)建一個數(shù)據(jù)庫快照,您可以有效地創(chuàng)建一個備份服務器系統(tǒng)的高可用性。那么數(shù)據(jù)庫快照可以成為雙重責任作為報表服務器,以及在高
48、可用性解決方案中使用。5、Table分區(qū) 分區(qū)表和索引都他們的數(shù)據(jù)分為水平單位,讓行組映射到單個分區(qū)上的數(shù)據(jù),如查詢執(zhí)行的操作,執(zhí)行好像整個表或索引是一個單獨的實體。分區(qū)可以: 提高表和索引管理。 提高多CPU機器的查詢性能。 在關系型數(shù)據(jù)倉庫中,事實表是顯而易見的候選表分區(qū)和分區(qū)按日期范圍是最常見的分區(qū)策略。 有三個步驟來定義一個分區(qū)表,在SQL Server聯(lián)機叢書主題“創(chuàng)建分區(qū)表和索引”中所述。三個步驟如下: 創(chuàng)建分區(qū)函數(shù)指定如何使用該函數(shù)的表進行分區(qū)。 創(chuàng)建分區(qū)方案指定分區(qū)函數(shù)的分區(qū)如何被放置在文件組中。 創(chuàng)建使用分區(qū)方案的表或索引。 多個表可以使用相同的分區(qū)方案。 本文討論了事實表的
49、范圍分區(qū),不打算成為一個完整的討論或教程表分區(qū)。有關表分區(qū)的更多信息,請參閱SQL Server聯(lián)機叢書。 最常見的分區(qū)方案的事實表按日期范圍,如年,季,月,甚至一天來分區(qū)。在大多數(shù)情況下大多的表,或表的日期劃分,提供了最大的管理效益。為了獲得更好的查詢性能,在時間維度表,應使用相同的分區(qū)方案進行分區(qū)。6、Using表分區(qū)進行快速數(shù)據(jù)加載 大多數(shù)數(shù)據(jù)倉庫應用掙扎著越來越加載大量數(shù)據(jù)在一個小 - 和收縮 - 加載窗口。典型的過程開始于從多個源系統(tǒng)中提取,隨后步驟清潔,變換,綜合,和理順在這些系統(tǒng)中的數(shù)據(jù)。數(shù)據(jù)管理應用程序被限制在完成載入窗口內的整個提取,轉換和加載過程。通常,系統(tǒng)的商業(yè)用戶有很強
50、的需求,該系統(tǒng)減少的時間量的數(shù)據(jù)倉庫是他們的查詢不可用。 “寫”步驟中的數(shù)據(jù)管理應用程序,其中,所述新的數(shù)據(jù)被插入到現(xiàn)有的數(shù)據(jù)倉庫中,必須設計成能迅速發(fā)生并以最小的用戶的影響。 為了加載數(shù)據(jù)的速度非常快,將數(shù)據(jù)庫恢復模式必須是批量記錄或簡單,并且該表必須是空的或包含數(shù)據(jù),但是沒有索引。如果這些條件得到滿足,一個未記錄的負載是可能的。在SQL Server 2000中,分區(qū)表存在之前,這些條件通常只在最初的歷史數(shù)據(jù)倉庫負載滿足。有些客戶與大型數(shù)據(jù)倉庫已經(jīng)建立了一個準分區(qū)結構,通過構造一個UNION ALL視圖在單獨的物理表;這些表是用未記錄技術填充每個負載周期。這種方法并不完全令人滿意,而SQL
51、 Server 2005分區(qū)表提供卓越的功能。 在SQL Server 2005中,您不能執(zhí)行非日志記錄直接加載到一個分區(qū)。但是,您可以加載到一個單獨的表,我們將調用偽分區(qū)。在某些情況下,您可以切換偽分區(qū)到分區(qū)表作為極其迅速發(fā)生元數(shù)據(jù)操作。這種技術滿足最小化整體加載時間我們兩個要求:無日志記錄進行偽分區(qū)負載,并減少最終用戶的影響,并確保數(shù)據(jù)倉庫的完整性:偽分區(qū)可以在用戶查詢的數(shù)據(jù)倉庫加載。數(shù)據(jù)管理應用程序可以等待,直到被加載并準備執(zhí)行分區(qū)切換之前的所有事實表。發(fā)生該分區(qū)開關操作速度非常快,亞秒響應的順序。 此外,偽分區(qū)可以備份為一個單獨的表,提高了系統(tǒng)的可管理性。7、Using表分區(qū)進行快速數(shù)據(jù)刪除 許多數(shù)據(jù)倉庫保持活躍的數(shù)據(jù)倉庫的詳細數(shù)據(jù)的滑動窗口。例如,事實表中可能包含的數(shù)據(jù)用于三年,五年,或十年。每隔一段時間,最早的數(shù)據(jù)從表中刪除。主要的原因保持修剪數(shù)據(jù)來提高查詢性能并最大限度地減少儲存成本。 SQL Server 2005中的分區(qū)使它很容易修剪從一個大分區(qū)的事實表的舊數(shù)據(jù)。如上所述簡單地創(chuàng)建一個空的偽分區(qū),然后將其切換到分區(qū)表。分區(qū)表中有一個空的分區(qū)在那里曾經(jīng)有一個人口稠密的分區(qū)稱為偽分區(qū)有數(shù)據(jù)的地方。偽分區(qū)可以備份,截斷或下降(如適用)?;蛘吣憧梢赃x擇重新定義分區(qū)函數(shù)合并所有的左(空)分區(qū)連成一片。
- 溫馨提示:
1: 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
2: 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權益歸上傳用戶所有。
3.本站RAR壓縮包中若帶圖紙,網(wǎng)頁內容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
4. 未經(jīng)權益所有人同意不得將文件中的內容挪作商業(yè)或盈利用途。
5. 裝配圖網(wǎng)僅提供信息存儲空間,僅對用戶上傳內容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內容本身不做任何修改或編輯,并不能對任何下載內容負責。
6. 下載文件中如有侵權或不適當內容,請與我們聯(lián)系,我們立即糾正。
7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。