Computer Science and Technology, Foreign Literature Translation (English Source): Improving String Handling Performance in ASP Applications
1. Foreign-language source translated: Improving String Handling Performance in ASP Applications
Author: James Musson
Book title (or paper title):
Publisher (or journal name): Developer Services, Microsoft UK
Publication date (or issue number): March 2003
Source text excerpted from:

Improving String Handling Performance in ASP Applications
James Musson
Developer Services, Microsoft UK
March 2003

Summary: Most Active Server Pages (ASP) applications rely on string concatenation to build HTML-formatted data that is then presented to users. This article compares several ways to create this HTML data stream, some of which perform better than others in a given situation. A reasonable knowledge of ASP and Visual Basic programming is assumed. (11 printed pages)

Contents
Introduction
ASP Design
String Concatenation
The Quick and Easy Solution
The StringBuilder
The Built-in Method
Testing
Results
Conclusion

Introduction
When writing ASP pages, the developer is really just creating a stream of formatted text that is written to the Web client via the Response object provided by ASP. You can build this text stream in several different ways, and the method you choose can have a large impact on both the performance and the scalability of the Web application. On numerous occasions when I have helped customers tune the performance of their Web applications, one of the major wins has come from changing the way the HTML stream is created. In this article I will show a few of the common techniques and test what effect they have on the performance of a simple ASP page.

ASP Design
Many ASP developers have followed good software engineering principles and modularized their code wherever possible. This design normally takes the form of a number of include files that contain functions modeling particular discrete sections of a page. The string outputs from these functions, usually HTML table code, can then be used in various combinations to build a complete page.
Some developers have taken this a stage further and moved these HTML functions into Visual Basic COM components, hoping to benefit from the extra performance that compiled code can offer.

Although this is certainly a good design practice, the method used to build the strings that form these discrete HTML code components can have a large bearing on how well the Web site performs and scales, regardless of whether the actual operation is performed from within an ASP include file or a Visual Basic COM component.

String Concatenation
Consider the following code fragment taken from a function called WriteHTML. The parameter named Data is simply an array of strings containing some data that needs to be formatted into a table structure (data returned from a database, for instance).

    Function WriteHTML( Data )

    Dim nRep

    For nRep = 0 to 99
       sHTML = sHTML & vbcrlf _
             & "<TR><TD>" & (nRep + 1) & "</TD><TD>" _
             & Data( 0, nRep ) & "</TD><TD>" _
             & Data( 1, nRep ) & "</TD><TD>" _
             & Data( 2, nRep ) & "</TD><TD>" _
             & Data( 3, nRep ) & "</TD><TD>" _
             & Data( 4, nRep ) & "</TD><TD>" _
             & Data( 5, nRep ) & "</TD></TR>"
    Next

    WriteHTML = sHTML

    End Function

This is typical of how many ASP and Visual Basic developers build HTML code. The text contained in the sHTML variable is returned to the calling code and then written to the client using Response.Write. Of course, this could also be expressed as similar code embedded directly within the page, without the indirection of the WriteHTML function. The problem with this code lies in the fact that the string data type used by ASP and Visual Basic, the BSTR or Basic String, cannot actually change length. This means that every time the length of the string is changed, the original representation of the string in memory is destroyed and a new one is created containing the new string data: this results in a memory allocation operation and a memory de-allocation operation.
Of course, in ASP and Visual Basic this is all taken care of for you, so the true cost is not immediately apparent. Allocating and de-allocating memory requires the underlying runtime code to take out exclusive locks and can therefore be expensive. This is especially apparent when strings get big and large blocks of memory are being allocated and de-allocated in quick succession, as happens during heavy string concatenation. While this may present no major problems in a single-user environment, it can cause serious performance and scalability issues in a server environment, such as an ASP application running on a Web server.

So back to the code fragment above: how many string allocations are being performed in each pass through the loop? The answer is 16. In this situation every application of the '&' operator causes the string pointed to by the variable sHTML to be destroyed and recreated. I have already mentioned that string allocation is expensive, and increasingly so as the string grows; armed with this knowledge, we can improve upon the code above.

The Quick and Easy Solution
There are two ways to mitigate the effect of string concatenations: the first is to decrease the size of the strings being manipulated, and the second is to reduce the number of string allocation operations being performed. Look at the revised version of the WriteHTML code shown below.

    Function WriteHTML( Data )

    Dim nRep

    For nRep = 0 to 99
       sHTML = sHTML & ( vbcrlf _
             & "<TR><TD>" & (nRep + 1) & "</TD><TD>" _
             & Data( 0, nRep ) & "</TD><TD>" _
             & Data( 1, nRep ) & "</TD><TD>" _
             & Data( 2, nRep ) & "</TD><TD>" _
             & Data( 3, nRep ) & "</TD><TD>" _
             & Data( 4, nRep ) & "</TD><TD>" _
             & Data( 5, nRep ) & "</TD></TR>" )
    Next

    WriteHTML = sHTML

    End Function

At first glance it may be difficult to spot the difference between this piece of code and the previous sample. This one simply has the addition of parentheses around everything after sHTML = sHTML &.
This reduces the size of the strings being manipulated in most of the concatenation operations by changing the order of evaluation. In the original code sample the ASP compiler looks at the expression to the right of the equals sign and simply evaluates it from left to right. This results in 16 concatenation operations per iteration, all involving sHTML, which is growing all the time. In the new version we give the compiler a hint by changing the order in which it should carry out the operations. Now it evaluates the expression from left to right but also inside out, that is, inside the parentheses first. This results in 15 concatenation operations per iteration involving smaller strings that are not growing, and only one involving the large, and growing, sHTML. Figure 1 shows an impression of the memory usage patterns of this optimization against the standard concatenation method.

Figure 1 Comparison of memory usage pattern between standard and parenthesized concatenation

Using parentheses can make quite a marked difference in performance and scalability in certain circumstances, as I will demonstrate later in this article.

The StringBuilder
We have seen the quick and easy solution to the string concatenation problem, and for many situations this may provide the best trade-off between performance and effort to implement. If we want to get serious about improving the performance of building large strings, however, we need to take the second alternative, which is to cut down the number of string allocation operations. To achieve this, a StringBuilder is required. This is a class that maintains a configurable string buffer and manages insertions of new pieces of text into that buffer, causing string reallocation only when the length of the text exceeds the length of the string buffer.
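A rough cost model, which is not part of the original measurements, can make the difference between the approaches discussed so far more concrete. It assumes that every '&' allocates a new string and copies the characters of its result, that each table row adds about r characters, and that N rows are built. The symbols r, N and C (characters copied) are introduced here for illustration only, and the first two sums are upper estimates.

```latex
\begin{align*}
C_{\text{CAT}}  &\approx \sum_{i=1}^{N} 16\,i\,r = 8\,r\,N(N+1) \\
C_{\text{PCAT}} &\approx \sum_{i=1}^{N} \left( i\,r + 15\,r \right) = \tfrac{1}{2}\,r\,N(N+1) + 15\,r\,N \\
C_{\text{BLDR}} &\approx \sum_{i=1}^{N} r = r\,N
\end{align*}
```

With r of about 150 characters (roughly the per-row growth implied by the page sizes in Table 2) and N = 100, this gives about 12.1 million characters copied for the standard method, about 0.98 million for the parenthesized method, and about 15,000 for a buffer that never has to grow. The model ignores allocator locking and everything else the server does, so it indicates orders of magnitude at best.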
The Microsoft .NET Framework provides such a StringBuilder class for free (System.Text.StringBuilder), and it is recommended for all string concatenation operations in that environment. In the ASP and classic Visual Basic world we do not have access to this class, so we need to build our own. Below is a sample StringBuilder class created using Visual Basic 6.0 (error-handling code has been omitted in the interest of brevity).

    Option Explicit

    ' default initial size of buffer and growth factor
    Private Const DEF_INITIALSIZE As Long = 1000
    Private Const DEF_GROWTH As Long = 1000

    ' buffer size and growth
    Private m_nInitialSize As Long
    Private m_nGrowth As Long

    ' buffer and buffer counters
    Private m_sText As String
    Private m_nSize As Long
    Private m_nPos As Long

    Private Sub Class_Initialize()
       ' set defaults for size and growth
       m_nInitialSize = DEF_INITIALSIZE
       m_nGrowth = DEF_GROWTH
       ' initialize buffer
       InitBuffer
    End Sub

    ' set the initial size and growth amount
    Public Sub Init(ByVal InitialSize As Long, ByVal Growth As Long)
       If InitialSize > 0 Then m_nInitialSize = InitialSize
       If Growth > 0 Then m_nGrowth = Growth
    End Sub

    ' initialize the buffer
    Private Sub InitBuffer()
       m_nSize = -1
       m_nPos = 1
       ' discard any previous contents so that Clear leaves nothing behind
       m_sText = ""
    End Sub

    ' grow the buffer
    Private Sub Grow(Optional MinimimGrowth As Long)
       ' initialize buffer if necessary
       If m_nSize = -1 Then
          ' make sure the first piece of text fits even if it is
          ' longer than the configured initial size
          m_nSize = IIf(m_nInitialSize > MinimimGrowth, m_nInitialSize, MinimimGrowth)
          m_sText = Space$(m_nSize)
       Else
          ' just grow
          Dim nGrowth As Long
          nGrowth = IIf(m_nGrowth > MinimimGrowth, m_nGrowth, MinimimGrowth)
          m_nSize = m_nSize + nGrowth
          m_sText = m_sText & Space$(nGrowth)
       End If
    End Sub

    ' trim the buffer to the currently used size
    Private Sub Shrink()
       If m_nSize >= m_nPos Then
          m_nSize = m_nPos - 1
          ' keep exactly the characters written so far; Left$ preserves
          ' trailing spaces that belong to the text itself
          m_sText = Left$(m_sText, m_nPos - 1)
       End If
    End Sub

    ' add a single text string
    Private Sub AppendInternal(ByVal Text As String)
       If (m_nPos + Len(Text)) > m_nSize Then Grow Len(Text)
       Mid$(m_sText, m_nPos, Len(Text)) = Text
       m_nPos = m_nPos + Len(Text)
    End Sub

    ' add a number of text strings
    Public Sub Append(ParamArray Text())
       Dim nArg As Long
       For nArg = 0 To UBound(Text)
          AppendInternal CStr(Text(nArg))
       Next nArg
    End Sub

    ' return the current string data and trim the buffer
    Public Function ToString() As String
       If m_nPos > 0 Then
          Shrink
          ToString = m_sText
       Else
          ToString = ""
       End If
    End Function

    ' clear the buffer and reinit
    Public Sub Clear()
       InitBuffer
    End Sub

The basic principle used in this class is that a class-level variable (m_sText) acts as a string buffer, and this buffer is set to a certain size by filling it with space characters using the Space$ function. When more text needs to be added to the existing text, the Mid$ statement is used to insert it at the correct position, after checking that the buffer is big enough to hold the new text. The ToString function returns the text currently stored in the buffer, also trimming the buffer to the correct length for this text. The ASP code to use the StringBuilder would look like that shown below.

    Function WriteHTML( Data )

    Dim oSB
    Dim nRep

    Set oSB = Server.CreateObject( "StringBuilderVB.StringBuilder" )
    ' initialize the buffer with size and growth factor
    oSB.Init 15000, 7500

    For nRep = 0 to 99
       oSB.Append "<TR><TD>", (nRep + 1), "</TD><TD>", _
                  Data( 0, nRep ), "</TD><TD>", _
                  Data( 1, nRep ), "</TD><TD>", _
                  Data( 2, nRep ), "</TD><TD>", _
                  Data( 3, nRep ), "</TD><TD>", _
                  Data( 4, nRep ), "</TD><TD>", _
                  Data( 5, nRep ), "</TD></TR>"
    Next

    WriteHTML = oSB.ToString()
    Set oSB = Nothing

    End Function

There is a definite overhead in using the StringBuilder, because an instance of the class must be created each time it is used (and the DLL containing the class must be loaded when the first instance is created). There is also the overhead of making the extra method calls to the StringBuilder instance.
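The Init arguments used above are a judgment call. As a minimal sketch of one way to arrive at an initial size, the hypothetical helper below multiplies the expected number of rows by an estimated row length and adds some headroom. The name EstimateBufferSize, its parameters, and the figure of 150 characters per row (roughly what the page sizes in Table 2 imply) are assumptions made for illustration and are not part of the original sample.

```vbscript
Option Explicit

' Hypothetical helper (not part of the original StringBuilder sample):
' estimates an initial buffer size in characters.
'   nRows            - expected number of table rows
'   nCharsPerRow     - estimated characters of HTML per row (an assumption)
'   nHeadroomPercent - extra space, in percent, to make buffer growth unlikely
Function EstimateBufferSize(nRows, nCharsPerRow, nHeadroomPercent)
    ' CLng keeps the intermediate product in Long range
    EstimateBufferSize = CLng(nRows) * nCharsPerRow * (100 + nHeadroomPercent) \ 100
End Function

Dim nInitialSize
' 100 rows of about 150 characters each, with 10% headroom
nInitialSize = EstimateBufferSize(100, 150, 10)
```

For these inputs the estimate is 16,500 characters; with no headroom it is 15,000, which matches the initial size passed to Init in the example above. The result could be passed as the first argument of oSB.Init.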
How the StringBuilder performs against the parenthesized '&' method depends on a number of factors, including the number of concatenations, the size of the string being built, and how well the initialization parameters for the StringBuilder string buffer are chosen. Note that in most cases it is far better to overestimate the amount of space needed in the buffer than to have it grow often.

The Built-in Method
ASP includes a very fast way of building up your HTML code, and it involves simply using multiple calls to Response.Write. The Write function uses an optimized string buffer under the covers that provides very good performance characteristics. The revised WriteHTML code would look like the code shown below.

    Function WriteHTML( Data )

    Dim nRep

    For nRep = 0 to 99
       Response.Write "<TR><TD>"
       Response.Write (nRep + 1)
       Response.Write "</TD><TD>"
       Response.Write Data( 0, nRep )
       Response.Write "</TD><TD>"
       Response.Write Data( 1, nRep )
       Response.Write "</TD><TD>"
       Response.Write Data( 2, nRep )
       Response.Write "</TD><TD>"
       Response.Write Data( 3, nRep )
       Response.Write "</TD><TD>"
       Response.Write Data( 4, nRep )
       Response.Write "</TD><TD>"
       Response.Write Data( 5, nRep )
       Response.Write "</TD></TR>"
    Next

    End Function

Although this will most likely provide the best performance and scalability, we have broken the encapsulation somewhat, because we now have code inside our function writing directly to the Response stream, and the calling code has therefore lost a degree of control. It also becomes more difficult to move this code (into a COM component, for example), because the function depends on the Response stream being available.

Testing
The four methods presented above were tested against each other using a simple ASP page with a single table fed from a dummy array of strings.
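The article does not show the dummy array itself. A minimal sketch of how such test data might be generated is given below; the function name MakeDummyData and the cell contents are hypothetical, and the only thing taken from the source is the shape that WriteHTML expects, Data(column, row) with six columns.

```vbscript
Option Explicit

' Hypothetical helper: builds a dummy array shaped like the Data parameter
' of WriteHTML, indexed as Data(column, row) with columns 0 to 5.
Function MakeDummyData(nRows)
    Dim aData(), nCol, nRow
    ReDim aData(5, nRows - 1)
    For nRow = 0 To nRows - 1
        For nCol = 0 To 5
            ' placeholder cell text; real data might come from a database
            aData(nCol, nRow) = "Row " & (nRow + 1) & " Col " & (nCol + 1)
        Next
    Next
    MakeDummyData = aData
End Function

Dim aData
' 100 rows, matching the "For nRep = 0 to 99" loop in WriteHTML
aData = MakeDummyData(100)
```

Changing the argument, together with the loop bound in WriteHTML, would be one way to vary the number of iterations between test runs.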
The tests were performed using Application Center Test® (ACT) from a single client (Windows® XP Professional, PIII-850MHz, 512MB RAM) against a single server (Windows 2000 Advanced Server, dual PIII-1000MHz, 256MB RAM) over a 100 Mb/sec network. ACT was configured to use 5 threads so as to simulate a load of 5 users connecting to the Web site. Each test consisted of a 20-second warm-up period followed by a 100-second load period in which as many requests as possible were made. The test runs were repeated for various numbers of concatenation operations by varying the number of iterations in the main table loop, as shown in the code fragments for the WriteHTML function. Each test run was performed with all of the concatenation methods described so far.

Results
Below is a series of charts showing the effect of each method on the throughput of the application and on the response time for the ASP page. This gives some idea of how many requests the application could support and how long users would wait for pages to be downloaded to their browser.

Table 1 Key to concatenation method abbreviations used

Method abbreviation    Description
RESP                   The built-in Response.Write method
CAT                    The standard concatenation ('&') method
PCAT                   The parenthesized concatenation ('&') method
BLDR                   The StringBuilder method

Whilst this test is far from realistic in terms of simulating the workload of a typical ASP application, it is evident from Table 2 that even at 420 repetitions the page is not particularly large; many complex ASP pages in existence today fall in the higher ranges of these figures and may even exceed the limits of this testing range.

Table 2 Page sizes and number of concatenations for test samples

No. of iterations    No. of concatenations    Page size (bytes)
15                   240                      2,667
30                   480                      4,917
45                   720                      7,167
60                   960                      9,417
75                   1,200                    11,667
120                  1,920                    18,539
180                  2,880                    27,899
240                  3,840                    37,259
300                  4,800                    46,619
360                  5,760                    55,979
420                  6,720                    62,219

Figure 2 Chart showing throughput results

We can see from the chart in Figure 2 that, as expected, the multiple Response.Write method (RESP) gives the best throughput across the entire range of iterations tested. What is surprising, though, is how quickly the standard string concatenation method (CAT) degrades and how much better the parenthesized version (PCAT) performs, all the way up to more than 300 iterations. At somewhere around 220 iterations the overhead inherent in the StringBuilder method (BLDR) begins to be outweighed by the performance gains from string buffering, and at anything above this point it would most likely be worth investing the extra effort to use a StringBuilder in this ASP page.

Figure 3 Chart showing response time results

Figure 4 Chart showing response time results with CAT omitted

The charts in Figures 3 and 4 show response time as measured by Time-To-First-Byte in milliseconds. The response times for the standard string concatenation method (CAT) increase so quickly that the chart is repeated without this method (Figure 4) so that the differences between the other methods can be examined. It is interesting to note that the multiple Response.Write method (RESP) and the StringBuilder method (BLDR) give what looks like a fairly linear progression as the iterations increase, whereas the standard concatenation method (CAT) and the parenthesized concatenation method (PCAT) both increase very rapidly once a certain threshold has been passed.

Conclusion
During this discussion I have focused on how different string building techniques can be applied within the ASP environment; but don't forget that this applies to any scenario where you are creating large strings in Visual Basic code, such as manually creating XML documents.
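As a sketch of the XML case just mentioned, the parenthesized '&' method might be applied as shown below. The function names BuildItemsXml and EscapeXml and the element names are hypothetical, chosen only for illustration, and the escaping covers just the three characters handled in the code.

```vbscript
Option Explicit

' Hypothetical example: builds a small XML document from an array of names
' using the parenthesized '&' method. Element names are illustrative only.
Function BuildItemsXml(aNames)
    Dim sXml, nItem
    sXml = "<items>"
    For nItem = 0 To UBound(aNames)
        ' The parentheses cause the short fragment to be assembled first, so
        ' the growing sXml string takes part in one concatenation per item.
        sXml = sXml & ( vbCrLf & "  <item id=""" & (nItem + 1) & """>" _
             & EscapeXml(aNames(nItem)) & "</item>" )
    Next
    BuildItemsXml = sXml & vbCrLf & "</items>"
End Function

' Minimal escaping of &, < and > in element text.
Function EscapeXml(sText)
    EscapeXml = Replace(Replace(Replace(sText, "&", "&amp;"), "<", "&lt;"), ">", "&gt;")
End Function

Dim sDoc
sDoc = BuildItemsXml(Array("first", "a<b"))
```

For documents with many items, the same loop could call a StringBuilder's Append method instead of concatenating onto sXml.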
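The following guidelines should help you decide which method might work best for your situation.

· Try the parenthesized '&' method first, especially when dealing with existing code. This has very little impact on the structure of the code, and you might well find that it improves the performance of the application enough to exceed your targets.
· If it is possible without compromising the encapsulation level you require, use Response.Write. In these tests it gave the best performance at every page size, because it avoids unnecessary in-memory string manipulation.
· Use a StringBuilder for building really large, or concatenation-intensive, strings.

Although you may not see exactly the same kind of performance increase shown in this article, I have used these techniques in real-world ASP Web applications to deliver very good improvements in both performance and scalability for very little extra effort.

2. Chinese translation:

Improving String Handling Performance in ASP Applications

Summary: Most Active Server Pages (ASP) applications rely on string concatenation to build HTML-formatted data, which is then presented to users. This article compares several ways of creating this HTML data stream and shows which performs best in a given situation. A reasonable working knowledge of ASP and Visual Basic is assumed. (11 pages)

Introduction
When writing ASP pages, the developer is really just creating a stream of formatted text that is sent to the Web client through the object provided by ASP. You can build this text stream in several ways, and the method you choose has a large impact on the performance and scalability of the Web application. On the many occasions on which I have helped customers tune the performance of their Web applications, I have found that the biggest gains came from changing the way the HTML stream is written. In this article I introduce some common techniques and test what effect they have on a simple ASP page.

Technical Design
Many ASP developers follow good software engineering principles and modularize their code wherever possible. This design usually takes the form of a number of include files containing functions that model particular discrete sections of a page. The strings output by these functions, usually HTML table code, are then joined in various combinations to build a complete page. Some developers have taken this a stage further and moved these HTML functions into Visual Basic COM components, hoping to benefit from the extra performance that compiled code can offer.

Although this is certainly good design practice, the method used to build the strings that form these discrete HTML code components has a large bearing on how well the Web site performs and scales (regardless of whether the actual operation is performed from within an ASP include file or a Visual Basic COM component).

String Concatenation
Consider the following code fragment, taken from a function called WriteHTML. The parameter named Data is simply an array of strings containing data that needs to be formatted into a table structure (data returned from a database, for example).

    Function WriteHTML( Data )

    Dim nRep

    For nRep = 0 to 99
       sHTML = sHTML & vbcrlf _
             & "<TR><TD>" & (nRep + 1) & "</TD><TD>" _
             & Data( 0, nRep ) & "</TD><TD>" _
             & Data( 1, nRep ) & "</TD><TD>" _
             & Data( 2, nRep ) & "</TD><TD>" _
             & Data( 3, nRep ) & "</TD><TD>" _
             & Data( 4, nRep ) & "</TD><TD>" _
             & Data( 5, nRep ) & "</TD></TR>"
    Next

    WriteHTML = sHTML

    End Function

This is how ASP and Visual Basic developers create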