Posts tagged: Methodology

Zen and the Art of Facebook Application Load Testing (Part 4)

We’ve finally reached the promised land: this is the final post in our series on Robert Pirsig’s book Zen and the Art of Motorcycle Maintenance. Thanks to those of you who’ve stuck it out and followed along so far.

In the last post, we ended with an application that we can call nominally scalable. To get to this point we proved that the application scales well when going from one to two servers. In the process, we’ve figured out some important information such as how many virtual users a single instance of the application server can handle. Time now for the big test.

Run Test #3

This is pretty easy. Set up a deployment-quality infrastructure. Try and run SimulGoal users through it (remember that SimulGoal is the number of simultaneous users that the application must be able to support). See what happens.
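Before standing up that infrastructure, it helps to do the arithmetic for how many application servers the test will need. Here is a minimal sketch, assuming the application scales roughly linearly (which is only what Test #2 suggested, not proved). The function name and the example numbers are ours, purely for illustration.

```python
import math


def servers_needed(simul_goal: int, server_real: int) -> int:
    """Estimate the application-server count for Test #3.

    Assumes near-linear scaling, i.e. each added server contributes
    about server_real simultaneous users. That is a hypothesis, and
    Test #3 is the thing that checks it.
    """
    if simul_goal <= 0 or server_real <= 0:
        raise ValueError("simul_goal and server_real must be positive")
    # Round up: a fractional server still has to be a whole machine.
    return math.ceil(simul_goal / server_real)


if __name__ == "__main__":
    # Hypothetical numbers: 2000 simultaneous users, 230 users per server.
    print(servers_needed(2000, 230))
```

If Test #3 breaks well below SimulGoal with this many servers, the linear-scaling assumption is the first hypothesis to revisit.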

Now some astute readers will recognize this as the same test that we warned about not doing back in Part 2. Aha, but things have changed now since we’ve followed a methodical plan where every step can either pass or fail, but will always teach us something about the system so that we can continue and be productive. We know a lot about the system at this point, so even if this test fails, it shouldn’t be too hard to hypothesize why, and then run another test to prove/disprove the hypothesis. In a short time you should be able to locate any problems still remaining, fix them and enjoy success.

And that’s the real point that Pirsig and Zen try to teach us: never finish a test scratching your head, not knowing what happened. Always be making some progress. To do that, plan your tests out carefully, know what hypotheses you’re trying to prove, and know which direction to go if the test passes or fails.

Conclusion

It’s taken us four posts to fully explain this methodology that we’ve developed while testing real Facebook applications. We’d love to hear your feedback about what you think of this, both good and bad. How would you modify this to make it better?


Zen and the Art of Facebook Application Load Testing (Part 3)

In the last post, we continued the discussion of how Robert Pirsig’s book Zen and the Art of Motorcycle Maintenance can help guide some of the decisions we need to make when load testing a Facebook application. We defined some goals that the test should have, and we developed a way to determine ServerGoal, the maximum number of simultaneous users that can be serviced by a Minimal Viable Server Infrastructure (MVSI) for the application under test.

Once we’ve done that, it’s time to start Test #2 and determine whether the application appears to be scalable.

Run Test #2

In Test #2 we’ll increase the size of the MVSI and see how many virtual users we can push through the system. This can be as simple as using two application servers instead of one and seeing what happens. The natural instinct here is to say “This is a waste of time. Doubling my computational power will double my capacity.” Resist the urge to think this way. Claiming that you’ll double your capacity is a hypothesis, and a Big Brass Balls hypothesis at that. It needs to be tested, which is exactly the goal of Test #2.

So run a test with a double-sized MVSI and try to run 2*ServerGoal+(a few more) users through. Observe the breaking point of this system to determine what to do next:

The breaking point is more or less 2*ServerGoal

Well, it isn’t going to be more, so we don’t need to worry about that. If it’s close, then you’re in great shape, since it appears that your application is scalable. Don’t fall into the trap of saying that your application is scalable. At this point it only appears scalable. “Is scalable” is a hypothesis that you’ll need to test when you continue to Test #3.

The breaking point is much less than expected

You’ve got a scalability problem. There can be several reasons why this would occur:

  • There’s a resource that’s creating a bottleneck and won’t let the application scale properly. A good place to start looking is anything related to a database.
  • The application requires some stateful information that’s being stored in a problematic or non-scalable way. For instance, if a server needs to create a session with a client, a load balancer may need to take this into account and create a “sticky” session. If not done properly, problems will occur.
  • There are problems with the configuration or network topology of the underlying server infrastructure.
  • A whole lot of other possible problems …

The point here is that your hypothesis that the application is nominally scalable is just plain not true, and it’s pointless to go any further until this problem gets resolved.
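To make “more or less” versus “much less” concrete, you can express the result of Test #2 as a scaling efficiency: the observed breaking point divided by the ideal of 2*ServerGoal. The sketch below is only an illustration. The 0.9 cutoff for “close” is our assumption, not a number from our testing, and you should pick one that fits your application.

```python
def scaling_efficiency(breaking_point: int, server_goal: int, servers: int = 2) -> float:
    """Observed breaking point as a fraction of the ideal servers * ServerGoal."""
    if server_goal <= 0 or servers <= 0:
        raise ValueError("server_goal and servers must be positive")
    return breaking_point / (servers * server_goal)


def appears_scalable(
    breaking_point: int,
    server_goal: int,
    servers: int = 2,
    min_efficiency: float = 0.9,  # assumed cutoff for "close enough"
) -> bool:
    """True when the breaking point is close to the ideal.

    This only says the application *appears* scalable; Test #3 is
    still needed to test the hypothesis that it *is* scalable.
    """
    return scaling_efficiency(breaking_point, server_goal, servers) >= min_efficiency


if __name__ == "__main__":
    # Hypothetical results with ServerGoal = 250 on a double-sized MVSI.
    print(appears_scalable(450, 250))  # near the ideal of 500
    print(appears_scalable(300, 250))  # far below the ideal of 500
```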

Finding the cause of this problem won’t be easy, but if you do it in a methodical way you’ll get there. Look over the code and instrument it to write data into a log during critical parts of the processing. Consider running this instrumented code against Test #1 (with a true MVSI) and comparing its logs against those from Test #2’s double-sized MVSI to see how the processing differs. This may lead you directly to the problem. Continue iteratively running Test #2 until you get the application to scale properly.
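As one way to do that instrumentation, here is a minimal sketch using Python’s standard logging module. The logger name, the timed_step helper and the log format are all our own inventions for illustration. Your application may be written in another language entirely, but the idea carries over: wrap each critical step, log how long it took, and compare the logs from the two test configurations.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; use whatever fits your application.
logger = logging.getLogger("loadtest.instrumentation")


@contextmanager
def timed_step(name: str):
    """Log the wall-clock duration of the wrapped block, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("step=%s elapsed_ms=%.1f", name, elapsed_ms)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
    with timed_step("load_user_profile"):
        time.sleep(0.05)  # stand-in for a database call
```

Because every line carries the step name and duration in the same key=value shape, the logs from the single-server and double-server runs can be compared step by step to see where the extra time goes.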

Strap yourselves in, guys. Next time we go for the gold.


Zen and the Art of Facebook Application Load Testing (Part 2)

In the last post, we discussed Robert Pirsig’s seminal book Zen and the Art of Motorcycle Maintenance and its relation to application load testing. In short, Pirsig argues that any experiment (or test run, in our case) is a failure only when nothing is learned from the outcome. In this post, we’ll talk about some concrete steps to avoid falling into this trap. We’ve successfully used this approach many times in the past when load testing Facebook apps, and hope that you’ll find it useful.

Before Testing

Before running any tests, you’ll need to set some goals. The first and most important goal is the number of simultaneous users the application must be able to service at any time. We’ll call this SimulGoal. If your application is already deployed, you can get a sense of this number from the current statistics. Otherwise, you need to do a little magic mixed with some marketing projections. Every application is going to be different, and there are going to be loads of variables to determine the right number. This is an interesting topic in itself that we hope to cover sometime in the future, but one very rough rule of thumb that you may use is that for an application with a million Monthly Active Users (MAU) you’ll need to be able to service a couple thousand simultaneously (YMMV!!).
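That rule of thumb is easy to turn into a quick calculation. The sketch below takes “a couple thousand” per million MAU to mean 2 simultaneous users per 1,000 MAU. That ratio is our assumption for illustration only, and the function name is made up. Replace the ratio with whatever your own statistics or projections suggest.

```python
def estimate_simul_goal(monthly_active_users: int, concurrent_per_thousand_mau: int = 2) -> int:
    """Very rough SimulGoal estimate from Monthly Active Users.

    The default of 2 concurrent users per 1,000 MAU mirrors the
    "couple thousand per million" rule of thumb and is an assumption,
    not a measured value.
    """
    if monthly_active_users < 0 or concurrent_per_thousand_mau < 0:
        raise ValueError("inputs must be non-negative")
    # Integer arithmetic keeps the estimate free of float rounding.
    return monthly_active_users * concurrent_per_thousand_mau // 1000


if __name__ == "__main__":
    print(estimate_simul_goal(1_000_000))  # one million MAU
```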

The next goal you’ll need to define is the number of simultaneous users that can be serviced by a single application server. We’ll call this ServerGoal. Coming up with ServerGoal can be done, and has been done, in lots of different ways:

  • Bottom up: Figure out the computational power of a single application server. Then, test and determine how much computational power is necessary to process a single transaction. Do some arithmetic, and you’ve got the number.
  • Top down: Find out your budget for server equipment, and determine the number of servers you’ll be able to run for this application. Divide SimulGoal by this number to determine ServerGoal. This may not have any basis in reality, but it’s a start.
  • Typical: Pull a number out of one of your body’s orifices. Note: this is the method that most people use, so don’t be ashamed to admit it.

Note that ServerGoal is a “soft” goal that can and should change as more is learned about how the application responds under load.
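The first two methods are just arithmetic, so here they are as a rough sketch. The function names and the abstract “capacity units” are ours, and the bottom-up version assumes each simultaneous user has about one transaction in flight at a time, which may not hold for your application. (We haven’t attempted to code the third method.)

```python
import math


def server_goal_bottom_up(server_capacity_units: float, units_per_transaction: float) -> int:
    """Bottom up: server capacity divided by the cost of one transaction.

    Both arguments use the same arbitrary unit of computational power.
    Assumes roughly one in-flight transaction per simultaneous user.
    """
    if server_capacity_units < 0 or units_per_transaction <= 0:
        raise ValueError("capacity must be >= 0 and cost per transaction > 0")
    # Round down: a server cannot handle a fraction of a user.
    return math.floor(server_capacity_units / units_per_transaction)


def server_goal_top_down(simul_goal: int, servers_in_budget: int) -> int:
    """Top down: spread SimulGoal across the servers you can afford."""
    if simul_goal < 0 or servers_in_budget <= 0:
        raise ValueError("simul_goal must be >= 0 and servers_in_budget > 0")
    # Round up so the servers together still cover SimulGoal.
    return math.ceil(simul_goal / servers_in_budget)


if __name__ == "__main__":
    # Hypothetical inputs for illustration.
    print(server_goal_bottom_up(1000.0, 4.0))
    print(server_goal_top_down(2000, 8))
```

Whichever method you use, treat the result as a starting hypothesis that Test #1 will correct.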

Okay, now that you’ve set some goals, let’s start testing:

Test #1

The first thing you’ll need to decide for your first test is how many virtual users it should contain. Of course your instinct is going to tell you to try and run SimulGoal users through a full deployment infrastructure. Resist this instinct as it is a very bad idea. Why? Well, let’s look at the situation through a Zen lens. You’re not really testing any hypothesis. When you run the test, it is sure to fail (because the first one always fails). When it does fail, what have you learned? Nothing – the test has ended as a true failure and you’ve wasted your time.

A much better approach is this: Define a “Minimal Viable Server Infrastructure” (MVSI) which is the smallest infrastructure you can run and still service requests. Often this simply means a single application server with an optional database server if necessary. If you typically use a load balancer, throw one into the mix.

Now, run a test against this MVSI with ServerGoal+(a few extra) virtual users. Our hypothesis will be that when the test reaches ServerGoal users, the application infrastructure will begin to fail. When the test completes, you’ll definitely have learned something, namely ServerReal which is the actual number of simultaneous users a single server can support. What to do next will be determined by the state you’re in:
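How you read ServerReal off the results depends on your load testing tool. As one hedged example, suppose the tool gives you the error rate observed at each step of the ramp. ServerReal is then the largest user count reached before the error rate first crosses a threshold you consider “failing”. The sample format and the 1% threshold below are assumptions for illustration.

```python
from typing import Iterable, Optional, Tuple


def find_server_real(
    samples: Iterable[Tuple[int, float]],
    max_error_rate: float = 0.01,  # assumed definition of "beginning to fail"
) -> Optional[int]:
    """Return the highest user count reached before errors exceed the threshold.

    samples holds (virtual_users, error_rate) pairs, one per ramp step.
    Returns None if even the smallest step is already failing.
    """
    server_real = None
    for users, error_rate in sorted(samples):
        if error_rate > max_error_rate:
            break  # first failing step: everything beyond it is past the limit
        server_real = users
    return server_real


if __name__ == "__main__":
    # Hypothetical ramp results: (virtual users, fraction of failed requests).
    ramp = [(50, 0.0), (100, 0.0), (150, 0.005), (200, 0.08), (250, 0.4)]
    print(find_server_real(ramp))
```

If the function returns the largest user count in the ramp, the test never found the breaking point, and the next run should push further past ServerGoal.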

ServerReal < ServerGoal

This is probably where you’re going to be. You’ll need to rethink how you came up with ServerGoal (e.g. did you use the orifice method?) and perhaps modify it. Maybe ServerReal really is the limit that your application can handle. If you’re okay with that, set ServerGoal = ServerReal and continue onto Test #2.

However, if you’re thinking “Something’s wrong – this server must be able to handle more users than that”, then you’re in for some work. Look through the server logs and try to find some evidence of something the application is doing that isn’t efficient. Look through the code and see if you can find it there. Look at how the servers that run the application are configured, and see if they can be tweaked to get better performance. In short, come up with some idea for how to squeeze more performance out of the system.

Once you do all of that, you have a hypothesis for how to improve performance. Test this hypothesis by re-running Test #1 and see if anything gets better. Continue this iteration cycle until you get to a value of ServerReal that just can’t get any better.

ServerReal > ServerGoal

Wow, you’re in really good shape. First, go talk to your developers and thank them for writing such a great application. Next, reset ServerGoal = ServerReal, and if you’re satisfied with that, continue to Test #2, otherwise iterate and try to get even better.

In the next post, we’ll talk about what happens in Test #2 where we’ll determine just how scalable the application is.


Zen and the Art of Facebook Application Load Testing (Part 1)

Many years ago when I was an undergrad in college, I had lots of pre-med friends who took Biochemistry. One of their assigned texts was Robert Pirsig’s Zen and the Art of Motorcycle Maintenance, something that didn’t make any sense to me at the time. It was only several years later, after I read the book for myself, that I understood why this book should be assigned to budding scientists.

Zen tells the story of a motorcycle trip, but at its core it is really about the search for the meaning of quality. It also contains some great discussions about the Scientific Method. You remember the Scientific Method, that process you learned in the seventh grade and didn’t see any use for. Well, it turns out that this process has loads of meaning for software testing, and it’s worth looking into.

For me, the most insightful passage of the whole book is this:

The TV scientist who mutters sadly, “The experiment is a failure; we have failed to achieve what we had hoped for,” is suffering mainly from a bad scriptwriter. An experiment is never a failure solely because it fails to achieve predicted results. An experiment is a failure only when it also fails adequately to test the hypothesis in question, when the data it produces don’t prove anything one way or another.

What Pirsig is really getting at here can be applied to software testing, where each test that you run is (or really should be) an experiment. When viewed this way, Pirsig’s words have real meaning. Every test that you run should have a purpose, which is to test a hypothesis. You should be able to predict the results of the test before you run it, and most importantly, you should always learn something from the outcome of a test regardless of the results.

Software testing, especially load testing, is pretty resource intensive. Each test usually requires someone to run the test, someone to run the computing infrastructure, perhaps a DBA to watch the database (often the first thing to break), a developer or two, and perhaps a few others depending on the application. You need a deployment-quality infrastructure to run the application under test, and a much larger infrastructure to run the load testing tool. Beforehand, you need plenty of planning, and afterwards you need some time to analyze the data. All of this needs coordination and, most importantly, time. In short, running a single load test is a fairly complex thing to do, and is pretty expensive. Expensive not in terms of money, but in time, schedule and focus.

And that’s where Zen comes in. Before you invest that amount of time and effort into running your test, it’s critically important to ensure that you don’t end up like that TV scientist. A test where you don’t learn anything afterwards is a waste. Since load testing is often the last thing done before deployment, it is really important not to be wasteful, and to always complete a test cycle knowing more than when you started.

In the next post, we’ll discuss some concrete strategies you can take to plan your own load tests.

