Posts Tagged ‘performance’

Performance vs. Scalability

September 11, 2008

When people talk about performance and scalability they very often use these two words synonymously. However, they mean different things. As there is a lot of misunderstanding around this topic, I thought it makes sense to write a blog post on it.

One of the best explanations can be found here.  It is a nice explanation by Werner Vogels, the CTO of Amazon.  I think everybody agrees that he knows what he is talking about.

Performance refers to the capability of a system to provide a certain response time, serve a defined number of users, or process a certain amount of data.  So performance is a software quality metric.  Unlike what many people think, it is not vague but can be defined in numbers.

If we realize that our performance requirements change (e.g. we have to serve more users or provide lower response times), or we cannot meet our performance goals, scalability comes into play.

Scalability refers to the characteristic of a system to increase performance by adding additional resources. Very often people think that their systems are scalable out of the box. “If we need to serve more users, we just add additional servers” is a typical answer to performance problems.

However, this assumes that the system is scalable, meaning that adding additional resources really helps to improve performance.  Whether your system is scalable or not depends on your architecture.  Software systems that do not have scalability as a design goal often do not provide good scalability.  This InfoQ interview with Cameron Purdy – VP of Development in Oracle’s Fusion Middleware group and former Tangosol CEO – provides a good example of the limited scalability of a system.  There are also two nice articles by Sun’s Wang Yu on Vertical and Horizontal Scalability.

So how does this relate to dynaTrace?  With Lifecycle APM we have defined an approach for ensuring performance and scalability across the application lifecycle – from development to production.  We work with our customers to make performance management part of their software processes, going beyond performance testing and firefighting when there are problems in production.

As scalability problems are in nearly all cases architectural problems, these characteristics have to be tested as early as the development phase. dynaTrace provides means to integrate and automate performance management in your Continuous Integration environment.

When I talk to people I sometimes get the feedback “… isn’t that premature optimization?” (have a look at the cool image on premature optimization in K. Scott Allen’s Blog). This is a strong misconception. Premature optimization would mean that we always try to do performance optimization whenever and wherever we can.  Lifecycle APM, and Continuous Performance Management as the development part of it, aims to provide all the information needed to always know the scalability and performance characteristics of your application. This serves as a basis for deciding when and where to optimize, actually avoiding premature optimization in the wrong direction.

Concluding, we can say that if we want our systems to be scalable, we have to take this into consideration right from the beginning of development and also monitor it throughout the lifecycle.  If we have to ensure it, we have to monitor it. This means that performance management must be treated as equally relevant as the management of functional requirements.

ASP.NET Page LifeCycle X-Ray’d

August 1, 2008

There are many good articles on the web covering the ASP.NET Page LifeCycle – published by Microsoft on MSDN or by professionals in .NET-related blogs. dynaTrace allowed me to dive deeper into the Page LifeCycle and see the impact of my implemented OnInit, OnPreRender, … methods of my pages, web parts and controls when my application actually runs under production load.

In order to do Application Performance Management you have to understand all the libraries, frameworks and 3rd-party software that you use in your application. ASP.NET is the main framework you work with. So you had better understand what is going on when a page request is executed in order to know where not to place costly code.
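To make this concrete, here is a minimal sketch of a hypothetical custom control (not taken from the sample application) showing where the OnInit and OnPreRender overrides sit in the page lifecycle and why costly work in them is paid on every request; LoadProductCatalog is an assumed, illustrative call.

```csharp
using System;
using System.Web.UI;

// Hypothetical custom control, only to illustrate the lifecycle hooks discussed above.
public class ProductListControl : Control
{
    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);
        // Runs very early in the lifecycle on every request, including postbacks.
        // Expensive calls (database, web services) placed here hurt every page view.
    }

    protected override void OnPreRender(EventArgs e)
    {
        base.OnPreRender(e);
        // Runs once per request just before rendering; fetch only the data that
        // will actually be rendered, e.g.:
        // var products = LoadProductCatalog(pageIndex, pageSize); // illustrative call
    }
}
```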

ASP.NET Extended Knowledge Sensor

I’ve created a new Knowledge Sensor Pack that extends the out-of-the-box ASP.NET Sensors. It includes support for ASP.NET Pages, WebParts, Custom Controls and ViewState handling. Using this KSP to analyse DotNetPay – dynaTrace’s .NET Sample Application – gives us the following diagnostics information:

API Breakdown

Performing an outside-in diagnosis, I start with the API Breakdown. The API Breakdown shows us the main performance contributors of an ASP.NET Page Request:

ASP.NET Page Request API Breakdown

I can see the performance broken down into the different layers:

  • ASP.NET WebPage
  • ASP.NET UserControls
  • ASP.NET ViewState
  • and general ASP.NET

It turns out that my custom ASP.NET Controls have a major impact on the overall performance.

Identifying poor performing Custom ASP.NET Controls

From the API Breakdown View I switch to the Method View. This view allows us to get an overview of those methods that actually contribute to the bad performance.

List of custom ASP.NET methods

From an individual method we can then drill down to the PurePath to analyse the reason for the bad performance.

Analysing root cause of slow custom ASP.NET Controls

The PurePath shows us where in the ASP.NET Page LifeCycle my custom control was actually executed and why the overall execution time of this control was slow.

PurePath of a slow performing Custom ASP.NET Control

Conclusion 

Getting insight into the frameworks you use is crucial in order to write performant and scalable applications. dynaTrace gives you the option to look behind the curtains.


ASP.NET GridView Performance

July 10, 2008

ASP.NET offers a powerful GridView control that can be used to display data from different data sources, e.g. SQL Server, LINQ, XML, …  The control additionally supports features like paging, sorting and editing.

Visual Studio makes it very easy to use this control on your web page and to bind it to a data source like a SQL Server table. It only takes several drag&drop operations on your page and a few more clicks to configure which data you want to display and which features of the GridView you want to enable.

This easy-to-use approach is nice and works well for many use-case scenarios. From our experience, however, we know that easy-to-use and flexible components do not always result in high performance.

The Sample Implementation: 3 different ways to use a GridView control

I’ve created a sample ASP.NET Web Site with 3 different pages. Each page hosts a GridView control to display the content of the Northwind Products table. I added 10000 additional product records in order to show a more real-life scenario than querying a table with only a few records.

The difference between the 3 pages is the underlying data source:

  • The first page used a standard SQL Data Source
  • The second page used the same SQL Data Source but enabled Data Caching for 5 seconds
  • The third page used the new Entity Data Source.

All of the Grid Views support Paging, Editing, Selection and Deletion.
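As a rough sketch of how the variants differ in configuration (the sample site configures them declaratively in the .aspx markup; the connection string name and the control ID below are assumptions), the SqlDataSource variants boil down to this:

```csharp
using System.Configuration;
using System.Web.UI.WebControls;

// Sketch only: the designer-generated configuration expressed as code so the
// variants are easy to compare. "Northwind" and "ProductsSource" are assumed names.
public static class ProductDataSources
{
    // Variant 1: plain SqlDataSource. The GridView (AllowPaging, AllowSorting,
    // PageSize = 10) references it via DataSourceID = "ProductsSource".
    public static SqlDataSource CreatePlainSource()
    {
        return new SqlDataSource(
            ConfigurationManager.ConnectionStrings["Northwind"].ConnectionString,
            "SELECT * FROM Products") { ID = "ProductsSource" };
    }

    // Variant 2: the same source, but the result set is cached for 5 seconds.
    public static SqlDataSource CreateCachingSource()
    {
        SqlDataSource source = CreatePlainSource();
        source.EnableCaching = true;
        source.CacheDuration = 5; // seconds
        return source;
    }

    // Variant 3 replaces the SqlDataSource with an EntityDataSource bound to the
    // Products entity set of the generated Northwind entity model.
}
```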

The Performance Test

I ran a simple load test on each of the individual pages. The workload was a 20 user load test over 5 minutes where each user executed the following actions:

  • queried the initial page
  • sorted the grid by product column
  • clicked through 3 different grid pages
  • sorted the product column again

The Performance Results

I used dynaTrace to analyze the individual load test requests in terms of their database access and rendering activities.

Grid with standard ADO.NET Binding
It turned out that a GridView with a standard SQL Data Source is ALWAYS selecting ALL rows (SELECT * FROM Products) of the underlying table – even if the PageSize is smaller than the actual result set.

Additionally, sorting is not done on the SQL layer. The GridView again retrieves ALL rows from the database (SELECT * FROM Products) and then performs an in-memory sort on the result set. In my scenario, each page request executed a SQL statement that returned more than 10000 rows although only 10 elements (the PageSize) were displayed.

Grid with standard ADO.NET Data Binding using Client Side Caching
Enabling Data Caching on the SQL Data Source of course limited the round trips to the database on my second page and therefore improved the overall performance. The Caching implementation is smart enough to already cache the sorted/paged data.

GridView with Entity Framework Data Source
Using the Entity Data Source on SQL Server turned out to be the best performing solution. Each page request actually resulted in two SQL statements: one that queried the row identifiers to display based on the current page index, and another that queried ONLY THOSE rows that were actually displayed, based on page index and sorted column. This scenario therefore limited the amount of data that had to be retrieved from the database. Although more SQL statements were executed on the database, the overall performance improved by a large factor.
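The EntityDataSource generates this paging and sorting SQL itself; the following LINQ to Entities snippet is only a sketch of the same server-side paging idea, assuming a generated NorthwindEntities context with a Products entity set:

```csharp
using System.Linq;

// Sketch: server-side paging and sorting, so only the displayed rows leave the database.
// "NorthwindEntities" and "Product" are assumed to come from the generated EF model.
public static class ProductPaging
{
    public static Product[] GetPage(int pageIndex, int pageSize)
    {
        using (var context = new NorthwindEntities())
        {
            return context.Products
                .OrderBy(p => p.ProductName) // sorting is translated to SQL, not done in memory
                .Skip(pageIndex * pageSize)  // paging becomes part of the generated query
                .Take(pageSize)              // only the displayed rows are returned
                .ToArray();
        }
    }
}
```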

The following image shows the Web Requests that requested the second data page, once on the GridView with standard data binding and once using the ADO.NET Entity Framework.

Web Requests to 2 different Grid Pages

For each individual Web Request we can see the resulting SQL Statements. Simple ADO.NET Data Mapping executes a full table select on each page request:

Database View showing SQL Statements of ADO.NET Data Binding

ADO.NET Entity Framework executes statements that just retrieve the data that must be displayed. This implies more work on the database but less work in ASP.NET to render and filter the data:

Database View showing SQL Statements for ADO.NET Entity Framework Binding

The Analysis

The following table compares individual measures from the 3 scenarios. We can see that twice as many requests could be handled by the page that used the Entity Data Source, with only 1/10 of the CPU usage and with an average response time that was 24 times faster compared to the SQL Data Source without caching.

Web Request View comparing all 3 Scenarios

The most critical performance impact in this scenario was that the SQL Data Source simply requested TOO MUCH data that had to be transferred from the database to the web application. All this data then had to be processed, although only a small part of it was actually displayed to the user.

Conclusion

ONLY REQUEST THE DATA THAT YOU NEED

Performance Antipattern : Logical Structure vs. Physical Deployment

July 9, 2008

A very common performance anti-pattern is the wrong deployment of components. This often happens when applications are deployed exactly as they are designed at a conceptual level. Most web-based applications today are built based on the Model-View-Controller concept. This means that components are separated into:

  • A view part responsible for presenting a user interface to the end user.
  • A controller part containing the interaction logic as well as the business logic.
  • A model part consisting of the actual data and some logical relations in the data.

When it comes to deployment, a very frequently chosen scenario is splitting the system into a frontend system and a backend system. Especially for Java EE applications this is a typical deployment scenario.

In some situations this deployment approach may cause a serious performance problem. This is especially true for data-intensive applications. In a modern application we might run into a situation as shown below: we have a data-intensive task which retrieves data from the database, processes it in business components and then forwards the information to the user.

Multi Tier Deployment


By moving all the logic we need for this task onto one server, we can reduce the amount of data sent over the network as well as the CPU load required for serialization and deserialization. Additionally, the number of objects that have to be created can be massively reduced, as serialization also requires interim objects to be created, e.g. when converting an object model into an XML structure or the other way around.
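As an illustration of this design choice (the types and interfaces below are purely illustrative, not taken from a specific application), compare a chatty service contract that ships the whole data set to the frontend with a coarse-grained one that runs the data-intensive work next to the data:

```csharp
// Illustrative sketch only.
public class OrderRecord  { public decimal Amount; /* many more fields */ }
public class OrderSummary { public int OrderCount; public decimal TotalAmount; }

public interface IOrderServiceChatty
{
    // The frontend tier pulls thousands of records across the wire
    // and aggregates them locally: high serialization and network cost.
    OrderRecord[] GetAllOrdersForCustomer(int customerId);
}

public interface IOrderServiceCoarseGrained
{
    // The aggregation runs on the backend, close to the database;
    // only a handful of values are serialized over the network.
    OrderSummary GetOrderSummaryForCustomer(int customerId);
}
```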

The dynaTrace Remoting and Web Services Views provide an excellent way of analyzing remoting overhead on a transactional basis. This information is vital for identifying these kinds of deployment problems. Below is a sample of a web service call including the amount of data transferred as well as the networking and remoting times.

dynaTrace Remoting and Web Service Views


SharePoint ListItem Performance

July 7, 2008

SharePoint provides a powerful object model to retrieve and manipulate data stored in SharePoint lists. It is possible to query the data by retrieving all content from a list or view, or by executing a so-called CAML (Collaborative Application Markup Language) query.

Real-Life GetItemById Problem
I’ve recently been working with a customer who faced performance problems with their custom-developed web parts. The Web Parts allowed a user to view and manipulate account information which was stored in different SharePoint lists. It seemed the page got slower the more accounts were stored in the lists. When a user clicked to view the account details, the information was queried and displayed. When we analyzed the request we immediately saw the problem.

SPListItemCollection.GetItemById was used with the assumption that only the data for the requested item was queried from the database. However, using GetItemById on the Items property of an SPList object in fact queries ALL items from the database. The result set is then filtered in memory to return only the requested item.

The Solution
Changing this request to a call to SPList.GetItems(camlQuery) improved the call performance tremendously.
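The following is only a sketch of the two access patterns, assuming a list named "Accounts" and the ID field as the lookup key; the list name and field names are illustrative, not the customer's actual code:

```csharp
using Microsoft.SharePoint;

public static class AccountRepository
{
    // Slow pattern: SPList.Items loads ALL items before GetItemById filters in memory.
    public static SPListItem GetAccountSlow(SPWeb web, int id)
    {
        SPList list = web.Lists["Accounts"];
        return list.Items.GetItemById(id);
    }

    // Faster pattern: a CAML query lets SharePoint fetch only the matching item.
    public static SPListItem GetAccountFast(SPWeb web, int id)
    {
        SPList list = web.Lists["Accounts"];
        SPQuery query = new SPQuery
        {
            Query = "<Where><Eq><FieldRef Name='ID'/>" +
                    "<Value Type='Counter'>" + id + "</Value></Eq></Where>",
            RowLimit = 1
        };
        SPListItemCollection items = list.GetItems(query);
        return items.Count > 0 ? items[0] : null;
    }
}
```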

Just with this change we improved the performance of the WebPart by more than 2 seconds. There are many blogs out there which discuss the correct usage of the SharePoint data interfaces. GetItem vs. GetItemById is a heavily discussed topic among others. The solutions presented on those blogs are basically the same as the one we identified with this customer.

Conclusion
The most important thing for SharePoint developers is to really understand what is going on within the framework when it is being used. You can really kill performance with inappropriate usage of the different query options that SharePoint provides.

I will post more SharePoint-related performance topics.