Load testing is a broad and well-established area of IT knowledge and software development practice. Many professionals specialize in it, and there are testing gurus ready to offer useful advice and even teach you the theory of the subject. Surprisingly, these gurus often do not agree with each other on the most basic terms used in the field.
If you search for information on load testing, you will most probably also find articles mentioning such terms as “performance testing” and “stress testing”. Are they all just synonyms? Everybody agrees that they are not, yet different sources provide different definitions for these terms.
The most confusing point is the difference between performance and load testing. Some people reasonably say that since the performance of an application can be measured without creating any load on it, load testing is a subset of performance testing. Other variants of performance testing may include measuring various parameters that do not depend on the load at all, such as the time required to render a web page in the browser or to perform any other action on the client side.
When it comes to stress testing, everyone agrees that this is a type of testing in which the server is stressed with a load above normal, and sometimes even beyond the peak estimate for the tested application. However, for what purpose is this done? Some say that this is just a way to check how the server responds to a rapidly growing load.
In my opinion, this confusion in terminology is produced by the marketing efforts of companies selling testing tools. They want to satisfy the expectations of every potential customer coming to their web site. That is why they provide similar descriptions for all three types of testing mentioned above. In other words, they do not want to lose customers who understand these terms differently. That would be a truly dramatic loss, given that the very same tools are used for all types of load testing.
Since I am not focused on selling anything to any particular customer right now, I have the freedom to develop a theory that serves a better understanding of the subject. So, whether or not any guru likes my classification, here it is.
Load testing. I prefer to think of load testing as a blanket term for all the other types of testing that are done under emulated load. Basically, each of them can be described and distinguished from the others by specifying the following test options.
- The main goal of test execution.
- The type and volume of applied load (it may be changing throughout the test).
- What parameters are measured and monitored when the test is performed.
- Additional actions performed with the tested system during the test.
In this type of test we gradually increase the load by adding more and more virtual users to the test and check the performance parameters of the system at each test phase.
The main things we monitor are:
- Web site response time;
- Number of processed requests per second;
- Error rate.
As a result, we have a graph showing the performance parameters at each load level, so we can tell, for example, what response time to expect under the estimated load. Since we also see how that parameter changes throughout the test, we can judge whether it is stable and predict whether it could be improved by upgrading hardware.
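To make the three monitored parameters concrete, here is a minimal Python sketch of how raw test results could be grouped by load level. The `Sample` record, the `summarize` function and the fixed phase duration are my own illustrative assumptions, not the output format of any particular testing tool.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One completed request (hypothetical record format)."""
    users: int        # virtual users active when the request was sent
    elapsed_s: float  # response time in seconds
    ok: bool          # False if the request ended with an error


def summarize(samples, phase_duration_s):
    """Return response time, throughput and error rate per load level.

    Assumes every load level ran for the same phase_duration_s seconds.
    """
    by_level = {}
    for s in samples:
        by_level.setdefault(s.users, []).append(s)

    report = {}
    for users in sorted(by_level):
        group = by_level[users]
        report[users] = {
            "avg_response_s": mean(s.elapsed_s for s in group),
            "requests_per_s": len(group) / phase_duration_s,
            "error_rate": sum(not s.ok for s in group) / len(group),
        }
    return report
```

Plotting each of the three values against the number of virtual users gives the kind of graph described above.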
Again, we add virtual users gradually, but in this case we know the performance criteria in advance and just need to check that they are met. When the performance starts to degrade significantly or simply falls below our quality standard, we conclude that the capacity limit has been reached.
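The decision rule can be sketched in a few lines of Python. The function name, the tuple layout and the two thresholds are assumptions made for illustration; your own quality standard may involve other parameters.

```python
def capacity_limit(levels, max_response_s, max_error_rate):
    """Return the highest load level that still meets the criteria.

    levels: list of (users, avg_response_s, error_rate) tuples,
            sorted by users in ascending order.
    Returns None if even the lowest level violates the criteria.
    """
    limit = None
    for users, response_s, error_rate in levels:
        if response_s > max_response_s or error_rate > max_error_rate:
            break  # first violation: the previous level was the limit
        limit = users
    return limit
```

For example, with levels of 50, 100, 150 and 200 users where only the last one exceeds a 2 second response time, the function reports 150 users as the capacity limit.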
The purposes of stress testing are:
- Find that limit (in this respect it is similar to the capacity test);
- Check that when it is reached, the web site handles the stress correctly: it produces graceful overload notifications and does not crash;
- Check that when the load is reduced back to the regular level, the web site returns to normal operation and retains its performance parameters.
In my opinion, it is very important to mention the last two goals, because they are what makes stress testing specific.
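The last two goals can be expressed as two simple checks. This is only a sketch under my own assumptions: a request that got no response at all is recorded as `None` and treated as a sign of a crash, while any HTTP status (including a 503 rejection) counts as a graceful answer; the 10% tolerance is arbitrary.

```python
def handled_gracefully(overload_statuses):
    """True if every request sent under overload received some response.

    overload_statuses: HTTP status codes, with None standing for
    "no response at all" (assumed here to indicate a crash or hang).
    """
    return all(status is not None for status in overload_statuses)


def recovered(before_s, after_s, tolerance=0.10):
    """True if the response time after the stress phase is within
    the given tolerance of the response time measured before it."""
    return after_s <= before_s * (1 + tolerance)
```

A stress test would pass only if both checks hold: the site answered every request during the overload, and its response time came back to roughly the pre-stress value afterwards.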
This is a bit strange, because I would recommend establishing such standards based on your business requirements. Nevertheless, I can imagine one case where such testing is really applicable. If you already have a live web site and you know that it is working more or less acceptably (you can get a good sense of this by checking the cash in your pockets), you may perform baseline testing of that system to convert that perception into more exact parameters, such as response time. After that, you will be able to compare the performance of any new version of your web site with the initial data.
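As an illustration, the baseline could be stored as a couple of numbers and compared against each new version. The choice of median and 95th percentile, the function names and the 10% tolerance are my assumptions; any parameters you care about can be recorded the same way.

```python
from statistics import median, quantiles


def make_baseline(response_times_s):
    """Condense measured response times into a few baseline figures."""
    return {
        "median_s": median(response_times_s),
        # quantiles(..., n=20) returns 19 cut points; the last one
        # is the 95th percentile.
        "p95_s": quantiles(response_times_s, n=20)[-1],
    }


def regressions(baseline, candidate, tolerance=0.10):
    """List the parameters where the candidate is worse than the
    baseline by more than the given tolerance."""
    return [
        name
        for name, base_value in baseline.items()
        if candidate[name] > base_value * (1 + tolerance)
    ]
```

An empty list from `regressions` means the new version performs no worse than the live site did when the baseline was taken.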
For endurance testing, it is recommended to use a periodically changing load to provoke resource reallocation. When the test is over, we should compare resource usage and performance parameters in the early stages with those at the end of the test.
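The early-versus-late comparison can be sketched as follows. The `drift` function and the choice of comparing the first and last 20% of the measurements are assumptions made for illustration; the series could hold memory usage, response time or any other monitored value.

```python
from statistics import mean


def drift(series, fraction=0.2):
    """Relative change between the early and late parts of a test.

    series:   measurements in time order (e.g. memory usage in MB).
    fraction: share of the series taken from each end.
    Returns (late_mean - early_mean) / early_mean.
    """
    n = max(1, int(len(series) * fraction))
    early = mean(series[:n])
    late = mean(series[-n:])
    return (late - early) / early
```

A value close to zero suggests that the system is stable over time, while a steadily growing value for memory usage may point to a leak.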
Note that in terms of the number of virtual users, the load may remain at the regular estimated level throughout the test. We should already know the expected performance parameters for such a load, so our goal is to check that they are not significantly affected by the above-mentioned changes in the test data.
For example, imagine a scalable system that can allocate additional resources when the load increases. While it may work perfectly at the high target volume, it may experience performance problems during resource allocation or simply fail to allocate resources correctly under such an extreme load change.
In this test we monitor two things.
- How the performance parameters are affected by the introduced failure.
- What happens when the system comes back to normal conditions.
While it may be acceptable for the overall system performance to degrade temporarily for the time necessary to fix the failure, it is imperative that it recovers fully once the problem is eliminated. This is similar to stress testing, but in this case the stress is produced not by excessive load but by a temporary problem inside the tested system.
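One way to quantify "fully recovered" is to measure how long the system needs to return to its normal response time after the failure is fixed. The sketch below is an assumption on my part: the sample format, the function name and the 10% tolerance are illustrative, and recovery is only counted if the response time stays within the limit until the end of the test.

```python
def recovery_time(samples, fixed_at_s, normal_s, tolerance=0.10):
    """Seconds between fixing the failure and stable normal performance.

    samples:    list of (timestamp_s, response_s) in time order.
    fixed_at_s: moment the introduced failure was eliminated.
    normal_s:   response time expected under normal conditions.
    Returns None if the system never returns to normal for good.
    """
    limit = normal_s * (1 + tolerance)
    recovered_at = None
    for timestamp_s, response_s in samples:
        if timestamp_s < fixed_at_s:
            continue
        if response_s <= limit:
            if recovered_at is None:
                recovered_at = timestamp_s
        else:
            recovered_at = None  # relapse: start looking again
    if recovered_at is None:
        return None
    return recovered_at - fixed_at_s
```

A result of `None` is the outcome to worry about: the performance never came back, even though the cause of the failure was removed.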
Well, this is the end of my list. One more thing worth mentioning is that load testing in general is not completely separate and different from other testing practices. Some think that it is only reasonable to test how an application behaves under load at the very end of the development process, just before it goes into production. This is not so. Of course, load tests should be applied only after functional testing; otherwise the results will be neither correct nor useful. However, you can integrate various types of load tests into your regular development process and use them as part of the regression testing performed on each new build or version of your web application.