Zen and the Art of Facebook Application Load Testing (Part 3)

In the last post, we continued the discussion of how Robert Pirsig’s book Zen and the Art of Motorcycle Maintenance can help guide some of the decisions we need to make when load testing a Facebook application. We defined some goals for the test, and we developed a way to determine ServerGoal: the maximum number of simultaneous users that can be serviced by a Minimal Viable Server Infrastructure (MVSI) configuration of the test application.

Once we’ve done that, it’s time to start Test #2 and determine whether the application appears to be scalable.

Run Test #2

In Test #2 we’ll increase the size of the MVSI and see how many virtual users we can push through the system. This can be as simple as using two application servers instead of one and seeing what happens. The natural instinct here is to say “This is a waste of time – doubling my computational power will double my capacity.” Resist the urge to think this way. Claiming that you’ll double your capacity is a hypothesis, and a Big Brass Balls hypothesis at that. It needs to be tested, which is exactly the goal of Test #2.

So run a test with a double-sized MVSI and try to run 2*ServerGoal+(a few more) users through. Observe the breaking point of this system to determine what to do next:

The breaking point is more or less 2*ServerGoal

Well, it isn’t going to be more, so we don’t need to worry about that case. If it’s close, you’re in great shape, since it appears that your application is scalable. Don’t fall into the trap of saying that your application is scalable, though. At this point it only appears scalable. “Is scalable” is a hypothesis that you’ll need to test when you continue to Test #3.

The breaking point is much less than expected

You’ve got a scalability problem. There can be several reasons why this would occur:

  • There’s a resource that’s creating a bottleneck and won’t let the application scale properly. A good place to start looking is anything related to a database.
  • The application requires some stateful information that’s being stored in a problematic or non-scalable way. For instance, if a server needs to create a session with a client, a load balancer may need to take this into account and create a “sticky” session. If this isn’t done properly, a client’s requests can land on a server that has no record of its session.
  • There are problems with the configuration or network topology of the underlying server infrastructure.
  • A whole lot of other possible problems …
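To make the “sticky” session idea a bit more concrete, here is a minimal sketch of one way a router could pin a session to a server: hash the session ID and use the result to pick a server. The function name `pick_server` and the server names are made up for illustration. Real load balancers have their own affinity mechanisms, so treat this as a picture of the idea rather than how any particular product does it.

```python
import hashlib
from typing import Sequence


def pick_server(session_id: str, servers: Sequence[str]) -> str:
    """Map a session ID to one server, the same one every time.

    Hypothetical helper: hashes the session ID and takes the result
    modulo the number of servers.
    """
    if not servers:
        raise ValueError("at least one server is required")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical double-sized MVSI with two application servers.
app_servers = ["app-1", "app-2"]
first = pick_server("session-abc123", app_servers)
second = pick_server("session-abc123", app_servers)
print(first == second)  # the same session always lands on the same server
```

Note that with this simple modulo scheme, changing the number of servers remaps many existing sessions, which is exactly the kind of stateful behavior worth checking when Test #2 comes up short.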

The point here is that your hypothesis that the application is nominally scalable is just plain not true, and it’s pointless to go any further until this problem gets resolved.
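The comparison behind both outcomes above is simple arithmetic, and it can help to write it down before looking at the results. The sketch below divides the observed breaking point by the ideal capacity (scale factor times ServerGoal). The 10% tolerance for “more or less” and the user counts are assumptions for illustration; pick a threshold that makes sense for your own application.

```python
def scaling_efficiency(server_goal: int, breaking_point: int,
                       scale_factor: int = 2) -> float:
    """Fraction of the ideal capacity that the scaled-up system reached."""
    if server_goal <= 0 or scale_factor <= 0:
        raise ValueError("server_goal and scale_factor must be positive")
    return breaking_point / (scale_factor * server_goal)


def appears_scalable(server_goal: int, breaking_point: int,
                     scale_factor: int = 2, tolerance: float = 0.10) -> bool:
    """True if the breaking point is within `tolerance` of the ideal.

    The default tolerance is an assumed value, not a standard one.
    """
    efficiency = scaling_efficiency(server_goal, breaking_point, scale_factor)
    return efficiency >= 1.0 - tolerance


# Hypothetical numbers: ServerGoal of 500 users from Test #1.
print(appears_scalable(500, 950))  # broke near 2*ServerGoal
print(appears_scalable(500, 600))  # broke far below 2*ServerGoal
```

Even when the check passes, the result only says the application appears scalable at this size. It says nothing yet about larger configurations.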

Finding the cause of this problem won’t be easy, but if you do it in a methodical way you’ll get there. Look over the code and instrument it to write data into a log during critical parts of the processing. Consider running this instrumented code against Test #1 (with a true MVSI) and comparing the logs against those from Test #2’s double MVSI system to see how the processing differs. This may lead you directly to the problem. Continue iteratively running Test #2 until you get the application to scale properly.
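As a rough sketch of what that instrumentation could look like, the snippet below wraps a critical step in a timer and writes one log line per step using Python’s standard `logging` module. The names `timed_step` and `loadtest.instrumentation`, the step name, and the `server` label are all hypothetical, and your application may well be written in another language. The point is only that each log line carries the step name, how long it took, and enough context to compare the Test #1 and Test #2 runs.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; use whatever fits your application.
logger = logging.getLogger("loadtest.instrumentation")


@contextmanager
def timed_step(name, **context):
    """Log how long the wrapped block took, plus any key=value context."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        details = " ".join(f"{k}={v}" for k, v in sorted(context.items()))
        logger.info("step=%s elapsed_ms=%.1f %s", name, elapsed_ms, details)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Stand-in for a critical part of request processing.
    with timed_step("load_user_profile", server="app-1"):
        time.sleep(0.01)
```

With the same step names logged in both runs, a step whose elapsed time balloons only under the double MVSI is a good first suspect.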

Strap yourselves in, guys. Next time we go for the gold.

